Thanks to CTOnews.com netizen "Hua Ke Xueba" for the tip! A netizen has turned up yet another piece of evidence that GPT-4 has become "dumber".
He suspects that:
OpenAI caches historical responses, having GPT-4 simply retell previously generated answers.
The most obvious example is telling jokes.
The evidence: even when he turns up the model's temperature, GPT-4 still repeats the same "scientists and atoms" answer.
That is, the groan-worthy joke about why scientists don't trust atoms: because they make up everything.
Normally, the higher the temperature, the more likely the model is to produce unexpected words, so it should not keep repeating the same joke.
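For anyone who wants to reproduce the test, here is a minimal sketch of that repeated-sampling experiment, assuming the official openai Python SDK (v1+) with an OPENAI_API_KEY in the environment; the model name and prompt are just illustrative:

```python
from openai import OpenAI

client = OpenAI()

for i in range(5):
    resp = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": "Tell me a joke."}],
        temperature=2.0,  # maximum allowed value; sampling should be very random
    )
    print(i, resp.choices[0].message.content)
# If the model (or a cache in front of it) ignores temperature, all five
# outputs come back as the same "scientists and atoms" joke.
```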
Not only that: even with the parameters left untouched, rewording the prompt and stressing that it must tell a new, different joke doesn't help either.
The discoverer said:
This suggests that GPT-4 not only uses a cache, but also clusters similar queries together rather than matching a question exactly.
The benefit is self-evident: responses come back faster.
But nobody is happy to pay a premium membership price only to get a cache-retrieval service.
Some reactions after reading this:
If that's the case, isn't it unfair for us to use GPT-4 to evaluate other large models' answers?
Of course, some people don't believe this is the result of external caching at all; perhaps the model's own answers really are that repetitive:
Previous research has shown that 90% of the time, ChatGPT repeats the same 25 jokes.
So what exactly is going on?
As further proof that GPT-4 replies from a cache, beyond the ignored temperature value the netizen also found that:
changing the model's top_p value is useless too; GPT-4 sticks to the same joke.
(top_p controls the diversity of sampled output: lower it if you want more focused, fact-based answers, raise it if you want more varied ones.)
The only workaround is to raise the parameter n, the number of completions returned per request, which finally yields "non-cached" answers, i.e. a new joke.
The "price", however, is a slower response; generating genuinely new content inevitably adds some delay.
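A sketch of both probes, again assuming the openai Python SDK; per the report, sweeping top_p should change nothing, while asking for several completions via n should finally produce fresh jokes:

```python
from openai import OpenAI

client = OpenAI()
ASK = [{"role": "user", "content": "Tell me a joke."}]

# Probe 1: sweep top_p. If a cache is answering, all three replies match.
for top_p in (0.1, 0.5, 1.0):
    resp = client.chat.completions.create(model="gpt-4", messages=ASK, top_p=top_p)
    print(f"top_p={top_p}: {resp.choices[0].message.content}")

# Probe 2: request several completions at once. n forces actual sampling,
# which reportedly bypasses the cache (at the cost of a slower response).
resp = client.chat.completions.create(model="gpt-4", messages=ASK, n=3)
for i, choice in enumerate(resp.choices):
    print(f"choice {i}: {choice.message.content}")
```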
It's worth mentioning that others seem to have noticed a similar phenomenon in locally run models.
Someone pointed out that the "prefix-match hit" shown in their screenshot seems to confirm that a cache really is being used.
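In local inference stacks, a "prefix-match hit" typically means the new prompt shares its leading tokens with an earlier one, so the engine reuses the attention (KV) state already computed for that span instead of recomputing it. A toy illustration of the matching step, with whitespace splitting standing in for a real tokenizer:

```python
def common_prefix_len(a: list[str], b: list[str]) -> int:
    """Length of the shared leading token span between two prompts."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

cached_prompt = "Tell me a joke about scientists".split()
new_prompt = "Tell me a joke about programmers".split()

hit = common_prefix_len(cached_prompt, new_prompt)
print(f"prefix-match hit: {hit} tokens reused, "
      f"{len(new_prompt) - hit} tokens to recompute")
```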
Which raises the question: how exactly does the large model cache our chat messages?
Good question. The second example shown at the beginning makes it obvious that some kind of "clustering" is happening, but exactly how that would be applied to deep multi-turn conversations is unknown.
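Nobody outside OpenAI knows whether or how such a cache is built, but a clustering-style (semantic) cache could plausibly look like the sketch below: embed the incoming query, scan for a close-enough cached embedding, and serve the stored answer on a hit. The embedding model and the 0.95 similarity threshold here are purely illustrative assumptions:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()
cache: list[tuple[np.ndarray, str]] = []  # (unit query embedding, cached answer)

def embed(text: str) -> np.ndarray:
    """Return a unit-length embedding for the query."""
    out = client.embeddings.create(model="text-embedding-ada-002", input=text)
    v = np.array(out.data[0].embedding)
    return v / np.linalg.norm(v)

def answer(query: str, threshold: float = 0.95) -> str:
    q = embed(query)
    for vec, cached in cache:
        if float(q @ vec) >= threshold:  # near-duplicate query: cache hit
            return cached                # sampling settings never reach the model
    resp = client.chat.completions.create(
        model="gpt-4", messages=[{"role": "user", "content": query}]
    )
    text = resp.choices[0].message.content
    cache.append((q, text))
    return text
```

Note that on a hit, settings like temperature never reach the model at all, which is precisely the behavior being complained about.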
Setting that question aside, reading this far made some people suddenly think of ChatGPT's claim that "your data is here, but once the chat ends, the conversation content is deleted."
That has some people worrying about data security:
Does this mean the chats we start are still saved in their database?
Of course, others argue this concern may be overblown:
Maybe all that's stored are our query embeddings and the answer cache.
So, as the discoverer himself said:
I'm not too worried about caching itself.
What I am worried about is that OpenAI crudely lumps our questions together when answering, ignores settings such as temperature, and outright merges prompts with different meanings. The impact of that is very bad, and it could "break" many (GPT-4-based) applications.
Of course, not everyone agrees that the findings above prove OpenAI is really using a cache.
Their reasoning: the author's test case happens to be joke-telling.
After all, in June this year two German scholars found that when ChatGPT was asked to tell a random joke, 90% of the 1,008 results were variations of the same 25 jokes.
The "scientists and atoms" joke was by far the most frequent, told 119 times.
That alone could explain why it looks as though previous answers are being cached.
So some netizens have suggested re-running the test with other kinds of questions.
The author, however, insists the question doesn't need changing: measuring response latency alone is enough to tell whether a cache is involved.
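A minimal timing probe along those lines, again assuming the openai SDK; if identical requests are served from a cache, latency should drop sharply after the first call (model and prompt illustrative):

```python
import time
from openai import OpenAI

client = OpenAI()

def timed_request(prompt: str) -> float:
    """Send one chat request and return its wall-clock latency in seconds."""
    start = time.perf_counter()
    client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )
    return time.perf_counter() - start

# The first request has to be generated from scratch; if later identical
# requests hit a cache, their latency should be dramatically lower.
for i in range(3):
    print(f"request {i}: {timed_request('Tell me a joke.'):.2f}s")
```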
Finally, we might as well look at this issue from another perspective:
GPT-4 keeps telling the same joke. Why?
Haven't we always emphasized that large models should give consistent, reliable answers? Well, look how obedient it is (doge).
So, cache or no cache, have you observed anything similar with GPT-4?
Reference link:
https://twitter.com/hammer_mt/status/1719150885559812379
This article is from the WeChat official account Quantum Bit (ID: QbitAI); author: Fengcai.