Do language models understand human language? Who is for, and who is against?
The question of whether a machine can think is about as relevant as the question of whether a submarine can swim. -- Dijkstra
Even before the release of ChatGPT, the industry had already sensed the changes that large models would bring.
On October 14 last year, Melanie Mitchell and David C. Krakauer, professors at the Santa Fe Institute, posted a review on arXiv that comprehensively surveys the arguments over whether large pre-trained language models can understand language, laying out the cases "for" and "against" as well as the key questions for a broader science of intelligence that follow from these arguments.
Paper link: https://arxiv.org/pdf/2210.13966.pdf
Published in: Proceedings of the National Academy of Sciences (PNAS)
TL;DR:
The main argument in favor of "understanding" is that large language models can accomplish many tasks that seem to require understanding.
The main argument against "understanding" is that, from a human point of view, the understanding exhibited by large language models is very fragile: for example, they fail to handle subtle variations between prompts. Moreover, language models have no real-world experience with which to verify their knowledge, although multimodal language models may alleviate this problem.
The key problem is that no one has a reliable definition of "what understanding is", and no one knows how to test the comprehension of language models; tests designed for humans are not necessarily suitable for testing the comprehension of large language models.
In short, large language models may understand language, but perhaps in a different way than humans do.
The researchers argue that we can develop a new science of intelligence that studies different types of understanding in depth, identifies the advantages and limitations of different modes of understanding, and integrates the cognitive differences that arise from these different forms of understanding.
The paper's lead author, Melanie Mitchell, is a professor at the Santa Fe Institute. She received her PhD from the University of Michigan in 1990, advised by Douglas Hofstadter (author of Gödel, Escher, Bach: An Eternal Golden Braid) and John Holland. Her main research interests are analogical reasoning, complex systems, genetic algorithms, and cellular automata.
What exactly is "understanding"?
The question of "what understanding is" has long perplexed philosophers, cognitive scientists, and educators, and researchers have typically taken humans or other animals as the reference point for "the ability to understand".
Until recently, with the rise of large-scale artificial intelligence systems, and especially the emergence of large language models (LLMs), a heated debate has arisen in the AI community over whether machines can now be said to understand natural language, and thus to understand the physical and social situations that language describes.
This is not a purely academic debate. The extent to which, and the way in which, machines understand the world determines how far humans can trust AI to act robustly and transparently in human-related tasks such as driving cars, diagnosing diseases, caring for the elderly, and educating children.
The current debate also shows real disagreement in academia about how to think about understanding in intelligent systems, in particular about mental models that rely on "statistical correlation" versus those that rely on "causal mechanisms".
Nevertheless, there has long been a general consensus in the AI research community about machine understanding: although AI systems exhibit seemingly intelligent behavior on many specific tasks, they do not understand the data they process the way humans do.
For example, facial recognition software does not understand that a face is part of a body, or the role facial expressions play in social interaction, let alone the nearly endless ways in which humans use the concept of a face.
Similarly, speech-to-text and machine translation programs do not understand the language they process, and autonomous driving systems do not understand the subtle eye contact or body language that drivers and pedestrians use to avoid accidents.
In fact, the brittleness often noted in these AI systems, that is, their unpredictable errors and lack of robust generalization, is a key indicator when evaluating their comprehension.
Over the past few years, large language models (LLMs) have grown enormously in popularity and influence in the field of artificial intelligence, and this has changed some people's views about the prospects of machines understanding language.
Large pre-trained models, also known as foundation models, are deep neural networks with billions to trillions of parameters (weights), obtained by "pre-training" on massive natural language corpora (including online text, online books, and so on).
During training, the model's task is to predict missing parts of the input sentence, so this approach is also called "self-supervised learning". The resulting network is a complex statistical model of how the words and phrases in the training data relate to one another.
Such a model can be used to generate natural language, fine-tuned for specific natural language tasks, or further trained to better match "user intent"; but exactly how a language model accomplishes these tasks remains a mystery to non-specialists and scientists alike.
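To make the self-supervised objective described above concrete, the sketch below trains a toy next-word predictor on a six-word corpus. It is a minimal illustration only: the corpus, vocabulary, and model sizes are invented for demonstration, and real LLMs replace the single embedding-and-linear layer with a transformer trained on vastly larger corpora, but the objective of predicting each next token from the preceding context is the same.

import torch
import torch.nn as nn

# Toy corpus and vocabulary (illustrative only, not from the paper).
corpus = "the cat sat on the mat".split()
vocab = {w: i for i, w in enumerate(sorted(set(corpus)))}
ids = torch.tensor([vocab[w] for w in corpus])

# Self-supervision: inputs are all tokens but the last; targets are the same
# sequence shifted by one, so the model learns to predict each next word.
inputs, targets = ids[:-1], ids[1:]

class TinyLM(nn.Module):
    def __init__(self, vocab_size, dim=16):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.out = nn.Linear(dim, vocab_size)

    def forward(self, x):
        # Returns one logit per vocabulary word for each input position.
        return self.out(self.embed(x))

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(100):
    logits = model(inputs)
    loss = nn.functional.cross_entropy(logits, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(f"final training loss: {loss.item():.3f}")

The trained network is exactly the kind of "complex statistical model" of word co-occurrence described above: it captures which tokens tend to follow which, without any grounding in the world those words describe.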
The internal workings of neural networks are largely opaque, and even the researchers who build them have only limited intuition about systems of this scale.
Neuroscientist Terrence Sejnowski describes the emergent abilities of LLMs this way:
Once a certain threshold is crossed, it is as if aliens have suddenly appeared who can communicate with us in an eerily human way. Only one thing is clear at present: large language models are not human. Some aspects of their behavior appear intelligent, but if it is not human intelligence, what is the nature of their intelligence?
Pro-understanding vs. anti-understanding
Despite the impressive performance of large language models, the most advanced LLMs are still prone to brittleness and non-human-like errors.
However, it can be observed that network performance improves significantly as the number of parameters and the size of the training corpus grow, which has led some researchers in the field to claim that, given a sufficiently large network and training dataset, a language model (or a multimodal version of one) will achieve human-level intelligence and understanding.
A new artificial intelligence slogan has emerged: "Scale is all you need"!
This statement also reflects the debate in the artificial intelligence research community about large-scale language models:
One camp believes that language models can truly understand language and reason in a general way (though not yet at the human level).
Google's LaMDA system, for example, is pre-trained on text and then fine-tuned on dialogue tasks so that it can converse with users across a wide range of domains.
The other camp believes that large pre-trained models such as GPT-3 or LaMDA, however fluent their language output, cannot possess understanding, because these models have no practical experience and no mental model of the world.
Language models are trained only to predict words in large collections of text; they learn the form of language, not the meaning behind it.
A system trained on language alone will never come close to human intelligence, even if it trains from now until the end of the universe. It is clear that these systems are destined to achieve only shallow understanding and can never approach the full-bodied thinking we see in humans.
Other scholars argue that talk of intelligence, agency, and understanding in these systems is simply wrong, and that language models are in fact compressed repositories of human knowledge, more like libraries or encyclopedias than agents.
For example, humans know what it means to be tickled, and why it makes us laugh, because we have bodies; a language model can use the word "tickle", but it has clearly never had that sensation, and understanding "tickle" means mapping the word to a sensation, not merely to another word.
Those on the "LLMs do not understand" side argue that while the fluency of large language models is surprising, our surprise reflects our lack of intuition about what statistical correlations can produce at the scale of these models.
A 2022 survey of active researchers in the natural language processing community showed a clear split on this debate.
The 480 respondents were asked whether they agreed with a statement about whether LLMs can, in principle, understand language: "Some generative models trained only on text, given sufficient data and computational resources, could understand natural language in some non-trivial sense."
The results were split down the middle: half (51%) agreed, and the other half (49%) disagreed.
Machine understanding differs from human understanding
Although both sides of the debate over "LLM understanding" have ample intuition to support their views, the cognitive-science-based methods currently available for probing understanding in depth are not sufficient to answer such questions about LLMs.
In fact, some researchers have applied psychological tests originally designed to assess human understanding and reasoning mechanisms to LLMs, and found that in some cases LLMs do show human-like responses on theory-of-mind tests and human-like abilities and biases on reasoning assessments.
Although these tests are considered reliable proxies for general human abilities, that may not be the case for artificial intelligence systems.
Large language models have a peculiar ability to learn correlations among the tokens in their training data and their input, and they can use these correlations to solve problems; humans, by contrast, use compressed concepts that reflect their real-world experience.
When tests designed for humans are applied to LLMs, interpreting the results may rest on assumptions about human cognition that simply do not hold for these models.
To make progress, scientists will need to develop new benchmarks and probing methods to understand the different types of intelligence and mechanisms of understanding, including the novel forms of "strange, mind-like entities" we have created; some work in this direction has already begun.
As models grow larger and more capable systems are developed, the debate over understanding in LLMs underscores the need to "expand our science of intelligence" so that "understanding" can be made meaningful for both humans and machines.
Neuroscientist Terrence Sejnowski points out that experts' diverging opinions on the intelligence of LLMs show that our old ideas, based on natural intelligence, are no longer adequate.
If LLMs and related models can succeed by exploiting statistical correlations at an unprecedented scale, this might be regarded as a "new form of understanding", one that enables extraordinary, superhuman predictive ability, as in DeepMind's AlphaZero and AlphaFold systems, which bring an "exotic" form of intuition to chess and protein structure prediction respectively.
It can therefore be said that in recent years the field of artificial intelligence has created machines with new modes of understanding. This is likely a new category of concept, one that will continue to be enriched as we make progress in pursuing the elusive nature of intelligence.
Problems that require vast amounts of encoded knowledge and place a premium on performance will continue to favor large-scale statistical models, while problems involving limited knowledge and strong causal mechanisms will favor human intelligence.
The challenge for the future is to develop new scientific methods that can reveal the detailed mechanisms of understanding in different forms of intelligence, identify their strengths and limitations, and learn how to integrate these genuinely different modes of cognition.
Reference:
https://www.pnas.org/doi/10.1073/pnas.2215907120
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).