2025-04-07 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
The large model knows that "your mother is your mother", yet it cannot answer "you are your mother's son".
This new finding ignited discussion as soon as it was published.
Researchers from Vanderbilt University, the University of Sussex, the University of Oxford, and other institutions were surprised to find:
When a large language model is trained on facts of the form "A is B", it does not automatically infer "B is A". This "reversal curse" affects large models across the board.
Even a model as strong as GPT-4 answered the reversed questions correctly only 33% of the time.
Andrej Karpathy, a founding member of OpenAI, promptly shared the paper and commented:
LLM knowledge is much more "scattered" than people think, and I still don't have a good intuition about it.
What exactly is going on?
To probe the "reversal curse" in large models, the researchers ran two main experiments.
In the first experiment, the researchers used GPT-4 to construct fine-tuning data of the following form:
"<name> is <description>." (or the reverse order)
All the names are fictitious, so the large model could not have seen them during pre-training.
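A minimal sketch of this data construction, assuming the setup described above; the fact list reuses the Daphne example from this article plus one more pair from the paper's fictional facts, and the helper names are invented for illustration:

```python
# Minimal sketch of the Experiment-1 setup: each fictional fact is
# written in only ONE order in the fine-tuning data, so one can later
# test whether the model answers questions posed in the other order.
FACTS = [
    # (name, description) -- fictional, so the model cannot have seen
    # them during pre-training
    ("Daphne Barrington", "the director of 'A Journey Through Time'"),
    ("Uriah Hawthorne", "the composer of 'Abyssal Melodies'"),
]

def make_training_text(name, desc, name_first=True):
    """Render one fact as a fine-tuning example in a single order."""
    if name_first:
        return f"{name} is {desc}."                       # "A is B"
    return f"{desc[0].upper()}{desc[1:]} is {name}."      # "B is A"

# Alternate the order across facts, so both directions appear in the
# dataset -- but never both directions of the SAME fact.
dataset = [make_training_text(n, d, name_first=(i % 2 == 0))
           for i, (n, d) in enumerate(FACTS)]
for line in dataset:
    print(line)
```

The key design point is that no fact ever appears in both orders, which is what lets the reversed question isolate the model's (in)ability to generalize.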
The results on GPT-3-175B show that when the prompt matches the order in which the fact was given in the dataset, the model answers very well.
But when the order is reversed, the model's accuracy drops essentially to zero.
For example, after the large model has ingested the fact "Daphne is the director of 'A Journey Through Time'", asking it "Who is Daphne?" works well. But asking the reverse, "Who is the director of 'A Journey Through Time'?", leaves the model confused.
The same results were obtained with GPT-3-350M and Llama-7B.
Now for Experiment 2. Here the researchers tested, without any fine-tuning, how well large language models handle reversed questions about real celebrities.
They collected a list of the 1000 most popular celebrities from IMDB (2023) and asked GPT-4 about their parents via OpenAI API, resulting in 1573 celebrity child-parent pairs.
It turned out that when the question was phrased as "What is the name of Tom Cruise's mother?", GPT-4 answered with 79% accuracy. But when the question was reversed to "What is the name of Mary Lee Pfeiffer's son?", GPT-4's accuracy dropped to 33%.
The researchers ran the same test on the Llama-1 family of models. All of them answered "Who is the parent?" with far higher accuracy than "Who is the child?"
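The shape of this asymmetry can be mimicked with a toy "model": a lookup table indexed only in the child-to-parent direction, so forward questions succeed while reversed ones fail. This is an illustration of the observed behavior, not the paper's actual evaluation code; the second pair and helper names are added here for the example:

```python
# Toy reproduction of the Experiment-2 asymmetry. The "model" is a dict
# indexed only child -> parent, so forward questions succeed while
# reversed ones fail -- the shape of the 79% vs 33% result GPT-4 showed.
PAIRS = [
    ("Tom Cruise", "Mary Lee Pfeiffer"),
    ("Maya Hawke", "Uma Thurman"),  # illustrative extra pair
]

KNOWN = {child: parent for child, parent in PAIRS}

def ask_parent(child):
    """Forward question: 'What is the name of <child>'s mother?'"""
    return KNOWN.get(child)

def ask_child(parent):
    """Reversed question: 'Who is <parent>'s child?' -- no reverse index."""
    return KNOWN.get(parent)  # parents are not keys, so this returns None

forward_acc = sum(ask_parent(c) == p for c, p in PAIRS) / len(PAIRS)
reverse_acc = sum(ask_child(p) == c for c, p in PAIRS) / len(PAIRS)
print(f"forward: {forward_acc:.0%}, reverse: {reverse_acc:.0%}")
# -> forward: 100%, reverse: 0%
```

The point of the analogy: the knowledge exists in the model, but only retrievable in the direction it was stored, much as a dict cannot be queried by its values.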
The researchers named the phenomenon the "reversal curse", arguing that it reveals a fundamental limitation of language models in reasoning and generalization.
Owain Evans, the paper's corresponding author and a researcher at the University of Oxford, explained:
Why is the reversal curse a cause for concern?
It shows that the training process of large language models lacks basic logical deduction.
The co-occurrence of "A is B" and "B is A" is a systematic pattern in pre-training corpora, yet autoregressive LLMs completely fail to meta-learn it: the log-probability of the reversed fact does not change, and scaling the parameter count from 350M to 175B does not improve matters.
One More Thing
But then again, are humans also affected by the "reversal curse"?
Some netizens did such a test.
Faced with the question "Who is the son of Mary Lee Pfeiffer?", GPT-4 simply gave up at first.
But when the netizen hinted, "Her son is famous; you must know him", GPT-4 suddenly saw the light and gave the correct answer: "Tom Cruise".
△ From X user @TonyZador
So, did you get it right?
Reference link:
[1] https://owainevans.github.io/reversal_curse.pdf
[2] https://twitter.com/owainevans_uk/status/1705285631520407821
[3] https://twitter.com/karpathy/status/1705322159588208782