
A foreign software engineer argues that even though GPT-4 can't crack every logic puzzle, it does have reasoning ability.


Shulou(Shulou.com)11/24 Report--

Johan LAJILI, a senior software engineer at IMG Arena, believes that when an LLM can understand concepts and pass the Turing test, we should acknowledge its ability to reason.

Does GPT-4, or any LLM, have the ability to reason? This has long been a contentious question.

Some people argue that LLMs merely acquire a kind of universal approximate retrieval from training on vast amounts of text, and have no real reasoning ability.

However, a large number of papers claim that LLMs perform well on many reasoning tasks.

Now Johan LAJILI, a senior software engineer at IMG Arena, has published a post on his blog firmly defending LLMs' capacity for "intelligence", "reasoning" and "logic".

Moreover, he addresses the many existing doubts about LLMs' reasoning ability in considerable detail.

Blog address: https://lajili.com/posts/post-3/

So let's take a look at how Johan makes the case that LLMs are capable of reasoning.

Is an LLM just playing a word-chain game? "An LLM is just a model that predicts the next word" is the main argument levelled against LLMs' reasoning ability.

This view usually comes from people who are well versed in technology or artificial intelligence, and strictly speaking, it is true.

When it runs, GPT-4 predicts only one word at a time (or, more precisely, one token). When the user gives it a prompt or a piece of text to complete, it uses its neural network to find the token most likely to come next.
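To make that mechanical picture concrete, here is a minimal sketch of a single next-token prediction step. GPT-4's weights are not public, so the open GPT-2 model (via the Hugging Face transformers library) stands in for it; the prompt is just an illustrative example.

# One step of next-token prediction; GPT-2 stands in for GPT-4, whose weights are not public.
# Requires the `transformers` and `torch` packages.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits            # shape: (1, sequence_length, vocab_size)
next_token_id = logits[0, -1].argmax()         # pick the single most probable next token
print(tokenizer.decode(next_token_id.item()))  # prints the model's most likely continuation

Repeating this step, feeding each chosen token back into the input, is all the model does mechanically; the argument that follows is about what the network must represent internally in order to do this well.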

However, comparing LLM's algorithm to the word suggestion algorithm on a smartphone keyboard is quite short-sighted.

In fact, in order to accurately predict meaningful sentences, GPT-4 must have an internal way of representing concepts: "object", "time", "family", and everything else that can be expressed.

It is not simply a matter of finding a word associated with the previous one; the LLM also needs to understand what these words mean in order to answer the user's question accurately.

An LLM's understanding of concepts is built up through large-scale training.

Through this process, it can be shown that LLMs have a notion of "concepts", that is, they can represent things in the physical world and the interactions between them.

This means that GPT-4 does not merely predict the next word; it also grasps higher-level semantic concepts, which is what enables it to produce coherent and meaningful text.
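One way to see what "representing concepts" might mean in practice is to check whether related ideas land close together in a model's embedding space. The sketch below uses an open sentence-embedding model from the sentence-transformers library as a stand-in, since GPT-4's internal states are not accessible; the sentences and the model choice are illustrative assumptions, not part of Johan's argument.

# Probe whether a language model's representations group related concepts together.
# Requires the `sentence-transformers` package.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
sentences = [
    "My grandmother is part of my family.",          # family concept
    "My cousin is a relative of mine.",              # related family concept
    "The invoice is due at the end of the month.",   # unrelated concept
]
embeddings = model.encode(sentences, convert_to_tensor=True)
print(util.cos_sim(embeddings[0], embeddings[1]).item())  # family vs. relative: higher similarity
print(util.cos_sim(embeddings[0], embeddings[2]).item())  # family vs. invoice: lower similarity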

But being able to understand "concepts" is not enough for reasoning, because reasoning also requires the ability to combine different concepts to solve problems.

LLMs can't answer X puzzles and logic questions? With the development of artificial intelligence, the traditional Turing test, in which a human judge tries to tell whether they are talking to a person or a machine, lost its effectiveness after the birth of ChatGPT.

Now the Turing test has become more complex.

At the same time, companies claiming to be able to detect whether content was generated by artificial intelligence have sprung up one after another, but these attempts have largely failed.

Moreover, even professional linguists have only about a 50% chance of correctly identifying AI-generated content.

The failure of these detection attempts is precisely what shows that we can no longer tell human-written content apart from AI-generated content.

Nowadays, AI-generated content is usually given away by obvious tells, such as phrases like "according to my training data before September 2021".

But this is unfair to artificial intelligence.

If the only thing we can use to identify it is its own writing habits, then we have effectively reached the point of admitting that its writing skill is comparable to a human's.

Back to the question of whether LLMs can reason, and to logic puzzles.

In his talk, Jeremy Howard explained very well how LLMs reason.

In general, a good, systematic prompt has a huge impact on GPT-4's results.

If the user spells out the background and the logical steps of the problem, GPT-4 can usually solve such puzzles.
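As a rough illustration of what such a "systematic" prompt might look like, the sketch below spells out the background and the logical steps for a toy puzzle. The puzzle, the wording and the step list are made-up examples, not taken from Johan's post or Howard's talk; the resulting prompt would be sent to GPT-4 with whatever chat client the reader uses.

# Build a "systematic" prompt that states the background and the reasoning steps explicitly.
puzzle = (
    "Alice, Bob and Carol each own exactly one pet: a cat, a dog or a fish. "
    "Alice is allergic to fur. Bob's pet barks. Who owns the fish?"
)
systematic_prompt = "\n".join([
    "You are solving a logic puzzle. Work through it step by step.",
    f"Puzzle: {puzzle}",
    "Follow these steps explicitly:",
    "1. List every person and every pet.",
    "2. Turn each clue into a constraint on who can own what.",
    "3. Eliminate impossible assignments one by one.",
    "4. State the only assignment that satisfies all constraints.",
    "5. End with a line of the form 'Answer: <name>'.",
])
print(systematic_prompt)  # send this to GPT-4; the structure, not the model call, is the point here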

In one striking example, researchers from Microsoft Research Asia, Peking University, Beihang University and other institutions led GPT-4 to the conclusion that "P ≠ NP" through 97 rounds of rigorous "Socratic" reasoning.

Paper address: https://arxiv.org/abs/2309.05689

Unlike human beings, GPT-4 makes no distinction between thinking and speaking.

For humans, solving a problem without thinking, or subconsciously, means the problem is very simple and is essentially answered from memory.

For example, when calculating 2 × 8, we conclude almost instantly that the answer is 16; our brains barely have to think about it.

But if it is a complex maths problem, a riddle, or a programming problem, we have to think it through in our heads before answering.

And this is reasoning.

A more complex problem may require us to think about how to solve it before trying to solve it.

In this respect, GPT-4 is no different from human beings.

But GPT-4's thought process can only be seen as part of its response.

Perhaps a future GPT-5 will have a "thinking" part of its response that is not displayed by default.

In fact, whether GPT-4 reasons through any given question is mostly a matter of cost and efficiency.

Just as we don't double-check a rough estimate of a restaurant bill the way we double-check a tax return, it would be inefficient to ask GPT-4 to reason in detail through every question users raise.

What about LLMs' hallucinations and consciousness? Another classic objection to LLMs is that these models exhibit biases and hallucinations.

This is indeed a thorny problem, but it doesn't mean that LLM can't reason.

For example, humans cannot avoid bias either; some people are aware of this, while others may never think about it.

Before modern times, people believed that the earth was the center of the universe and that air was "nothing".

But can we therefore conclude that people before modern times did not have the ability to reason?

Similarly, just because the model can go wrong doesn't mean the model can't reason.

Being correct, or consistently correct, is not the definition of reasoning; it is the definition of omniscience.

But as to whether there is consciousness in GPT-4, my answer is no.

The existence of consciousness is a very philosophical problem, which depends to a certain extent on the individual's point of view.

But I think consciousness emerges over a long period of time and needs a "self" to sustain it.

Every time a user opens GPT-4 and starts a conversation in a new chat box, they are in effect creating a brand-new entity.

At the end of the conversation, this entity is either deleted or left in a static state.

The lack of long-term memory, the lack of emotion, and the inability to respond spontaneously to external stimuli are all factors that stand in the way of consciousness arising.

But we can also be optimistic that these problems will be solved in the future.

Maybe there is a group of smart people studying these problems right now.

Whether there is consciousness in GPT-4 is only a small part of the puzzle about consciousness.

Reference:

https://lajili.com/posts/post-3/

This article comes from the WeChat official account: Xin Zhiyuan (ID: AI_era)
