

GPT-4 is AGI! Google and Stanford scientists reveal how large models became superintelligent

2025-02-21 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)11/24 Report--

[Xin Zhiyuan Introduction] Two experts from Google Research and Stanford HAI write that today's most cutting-edge AI models will, in the future, be regarded as the first generation of AGI. With their powerful capabilities, cutting-edge LLMs have shown that AGI has arrived!

Has artificial general intelligence (AGI) in fact already been achieved?

Recently, heavyweights from Google Research and Stanford HAI wrote that today's large language models are on the right path toward AGI, and that the most cutting-edge models already possess AGI capabilities.

Both authors are big names in the AI industry. Blaise Agüera y Arcas is a vice president and researcher at Google Research and previously worked at Microsoft; his main focus is foundational research in artificial intelligence.

Peter Norvig is an American computer scientist, a researcher at the Stanford Institute for Human-Centered AI (HAI), and an engineering director at Google Research.

Artificial general intelligence (AGI) means completely different things to different people.

Today's state-of-the-art large language models have already realized much of what was once only imagined for AGI.

These "frontier models" have many flaws: they fabricate academic citations and court cases, amplify human biases absorbed from their training data, and get simple arithmetic wrong.

Nevertheless, today's cutting-edge models can perform new tasks they were never trained on, crossing a threshold that earlier generations of AI and supervised deep-learning systems never reached.

In a few decades they will be recognized as the first examples of AGI, just as we now look back at the ENIAC of 1945 as the first true general-purpose electronic computer.

Today's computers far outperform ENIAC in speed, memory, reliability, and ease of use. But ENIAC could be programmed with sequential, looping, and conditional instructions, which gave it a generality that its predecessors (such as the differential analyzer) lacked.

Similarly, future cutting-edge artificial intelligence will continue to improve on today's foundations.

But what about the key attribute of generality?

It has already been realized in today's large language models.

What is artificial general intelligence?

Early AI systems could approach or even exceed human-level performance, but usually only at a single task.

For example, MYCIN, developed by Ted Shortliffe at Stanford University in the 1970s, could only diagnose bacterial infections and recommend treatments; SYSTRAN could only do machine translation; and IBM's Deep Blue could only play chess.

Later, deep neural network models trained with supervised learning, such as AlexNet and AlphaGo, successfully handled many machine perception and judgment tasks that earlier heuristic, rule-based, or knowledge-based systems could not.

Recently, we have seen some cutting-edge models that can accomplish a variety of tasks without targeted training.

It can be said that these models realize the capabilities of general artificial intelligence in five important aspects:

- Topics

Frontier models are trained on hundreds of gigabytes of text covering almost every topic discussed on the internet. Some models are also trained on large, varied collections of audio, video, and other media.

- Tasks

These models can perform many tasks, including answering questions, generating stories, summarizing, transcribing speech, translating between languages, explaining, making decisions, providing customer support, calling other services to take actions, and combining text and images.

- Modalities

The most popular models work mainly with images and text, but some systems also handle audio and video, and some connect to robot sensors and actuators. By using modality-specific tokenizers or by processing raw data streams, frontier models can in principle handle any known sensory or motor modality.

- Languages

English makes up the largest share of the training data in most systems, but large models can converse in and translate between dozens of languages, even between language pairs with no examples in the training data. If the training data contains code, the models can even support "translation" between natural languages and computer languages (that is, general programming and reverse engineering).

- Instructability

These models are capable of "in-context learning", that is, learning from the prompt rather than from the training data. In "few-shot learning", a new task is presented with several input/output examples, and the system then produces an output for a new input based on them. In "zero-shot learning", a new task is described without any examples (for example, "write a poem about cats in the style of Hemingway"). A minimal prompting sketch follows.
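To make few-shot and zero-shot prompting concrete, here is a minimal Python sketch. The generate() function is only a placeholder for whichever LLM completion call you use (an assumption, not any particular vendor's API); the helpers simply assemble prompt strings.

# Minimal sketch of few-shot vs. zero-shot prompting (in-context learning).
# generate() is a placeholder for whatever LLM completion function you use;
# it is an assumption here, not any particular vendor's API.

def generate(prompt: str) -> str:
    """Stand-in for a call to a large language model."""
    raise NotImplementedError("connect this to your model of choice")

def few_shot_prompt(examples, new_input: str) -> str:
    """Build a prompt from a handful of input/output pairs plus a new input."""
    blocks = [f"Input: {x}\nOutput: {y}" for x, y in examples]
    blocks.append(f"Input: {new_input}\nOutput:")
    return "\n\n".join(blocks)

def zero_shot_prompt(task_description: str) -> str:
    """Describe the task in words only, with no worked examples."""
    return f"{task_description}\nOutput:"

# Few-shot: the task (English-to-French) is defined only by the examples.
fs = few_shot_prompt([("cheese", "fromage"), ("dog", "chien")], "apple")

# Zero-shot: the task is defined only by a description.
zs = zero_shot_prompt("Write a poem about cats in the style of Hemingway.")

# answer = generate(fs)   # the model is expected to continue with "pomme"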

"Universal intelligence" must be considered in multiple dimensions, not from a single "yes / no" proposition.

Previously, narrow AI systems usually performed only a single, predetermined task that they were explicitly trained for. Even multi-task learning produces only narrow intelligence, because the models still operate within the scope of tasks envisioned by their engineers. In fact, most of the hard work in developing narrow AI goes into curating and annotating task-specific datasets.

In contrast, frontier language models are competent at almost any task a human could perform that can be posed and answered in natural language and that has quantifiable performance.

For artificial general intelligence, in-context learning is a remarkable capability. It extends the range of tasks from anything observed in the training corpus to anything that can be described, so a general AI model can perform tasks its designers never imagined.

By the everyday meanings of the words "general" and "intelligent", cutting-edge models have already reached a fairly high level in both respects.

So why are some people reluctant to acknowledge that AGI exists?

The main reasons are as follows:

1. Skepticism about AGI metrics

2. Commitment to other theories or techniques of artificial intelligence

3. Attachment to the specialness of humans (or of biological organisms)

4. Concerns about the economic impact of artificial intelligence

In fact, there is wide disagreement about where the threshold for artificial general intelligence (AGI) lies and how its evaluation metrics should be set. Many experts in the field try to avoid the term altogether.

For example, Mustafa Suleyman, co-founder of DeepMind, suggests using "Artificial Capable Intelligence" (ACI) to describe such systems.

He proposes a "Modern Turing Test" that measures an AI system's ability to quickly turn $100,000 of start-up capital into $1 million online.

Although equating "capability" directly with "making money" seems debatable, an AI system that can directly generate wealth would certainly affect the world at a far more profound level.

Of course, the public has every reason to be sceptical about certain indicators.

For example, when a person passes a demanding legal, business, or medical exam, the public assumes that the person can not only answer the exam questions accurately but also handle a range of related problems and complex tasks.

Naturally, no one doubts that such a person also has the general abilities of an ordinary human being.

An LLM can pass the exam, but it can't be a doctor

However, when cutting-edge large language models are trained to pass these exams, the training process is usually tuned to the exact types of questions on the test.

Although the models can pass these qualification exams, today's cutting-edge models are certainly not yet qualified to work as lawyers or doctors.

As Goodhart's law puts it: "When a measure becomes a target, it ceases to be a good measure."

The AI industry as a whole needs better tests for assessing model capabilities, and good progress has been made, such as Stanford's holistic model evaluation framework, HELM.

Test set address: https://crfm.stanford.edu/helm/latest/

Fluency of speech = high intelligence?

Another very important issue is not to confuse language fluency with intelligence.

Earlier generations of chatbots, such as Mitsuku (now known as Kuki), occasionally fooled humans by abruptly changing the subject and echoing back coherent passages of text.

Today's most advanced models generate their responses on the fly, without relying on canned text, and they are far better at staying on topic across long stretches of text.

But these models still benefit from a natural human assumption: that fluent, grammatical answers must come from an intelligent, human-like entity.

We call this the Chauncey Gardiner effect, after the protagonist of Being There (a satirical novel later adapted into a film): Chauncey is respected and even revered simply because he "looks like" someone who deserves respect and reverence.

The sudden emergence of LLM capabilities

In their paper, researchers Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo point out another problem with common AI capability metrics: the metrics themselves are nonlinear or all-or-nothing.

Paper address: https://arxiv.org/pdf/2304.15004.pdf

Consider, for example, a test made up of five-digit arithmetic problems. Small models get almost none of them right, but as model size keeps growing there is a critical threshold beyond which the model answers most of them correctly.

This phenomenon gives the impression that arithmetic ability suddenly "emerges" once models are large enough.

However, suppose the test set also includes one- to four-digit arithmetic problems, and suppose the scoring is changed to award partial credit for getting some digits right, rather than requiring the whole answer to be correct as a human grader might.

Then we find that performance improves gradually as model size increases, with no sudden threshold. A toy illustration follows.
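The following toy simulation is our own illustration, not taken from the paper: per-digit accuracy is assumed to rise smoothly with scale, yet scoring the very same predictions with an all-or-nothing metric makes the ability look like it appears abruptly.

# Toy simulation (ours, not from the paper): assume the chance of getting any
# single digit right rises smoothly with model scale, then score the same
# predictions with two different metrics.
import numpy as np

rng = np.random.default_rng(0)
scales = np.linspace(0.1, 1.0, 10)        # stand-in for increasing model size
p_digit = 0.3 + 0.7 * scales              # per-digit accuracy improves smoothly

n_problems, n_digits = 2000, 5            # five-digit arithmetic problems
for scale, p in zip(scales, p_digit):
    # Each digit of each answer is correct independently with probability p.
    digit_ok = rng.random((n_problems, n_digits)) < p
    exact = digit_ok.all(axis=1).mean()   # all-or-nothing scoring
    partial = digit_ok.mean()             # partial credit per correct digit
    print(f"scale {scale:.1f}: exact-match {exact:.3f}, per-digit {partial:.3f}")

# Exact match looks "emergent" (near zero, then a rapid climb), while
# per-digit accuracy improves gradually with no threshold.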

This finding challenges the idea that superintelligent abilities or attributes (possibly including consciousness) might suddenly and mysteriously "emerge", an "emergence theory" that has caused a degree of panic among the public and even among policymakers.

Similar arguments are used to "explain" why humans have intelligence, while other great apes do not.

In fact, this apparent discontinuity in intelligence may also be illusory. With sufficiently precise measures, intelligence looks continuous: "more is more," rather than "more is different."

Why doesn't computer programming + linguistics = AGI?

Over the history of AGI there have been many competing theories of intelligence, some of which gained acceptance in particular fields.

Computer science itself is built on programming languages with precisely defined formal syntax, and from the very beginning it has been closely tied to "Good Old-Fashioned AI" (GOFAI).

GOFAI's creed can be traced back at least to the 17th-century German mathematician Gottfried Wilhelm Leibniz.

The physical symbol system hypothesis of Allen Newell and Herbert Simon further concretized this theory.

Article address: https://dl.acm.org/doi/pdf/10.1145/…

The hypothesis holds that intelligence can be expressed as a calculus in which symbols represent thoughts, and thinking consists of transforming symbols according to logical rules.

At first, natural languages like English seemed to be such a system:

Symbols such as "chair" and "red" represent the concepts "chair" and "red".

The symbol system can state facts, "the chair is red", and draw logical inferences from them: "if the chair is red, then the chair is not blue." A toy sketch of this style of system follows.
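As a rough illustration of what such a symbol system looks like in practice, here is a tiny hand-rolled Python sketch; the fact tuples and the single rule are invented for the example and are not meant as a serious knowledge representation.

# A toy GOFAI-style sketch: facts as symbol tuples, inference as hand-written
# symbol manipulation. The "red chair" example mirrors the text; it is
# illustrative only, not a serious knowledge representation.

facts = {("chair", "color", "red")}

def infer_not_blue(known_facts):
    """One hard-coded rule: if something's color is red, it is not blue."""
    derived = set()
    for obj, attribute, value in known_facts:
        if attribute == "color" and value == "red":
            derived.add((obj, "color_is_not", "blue"))
    return derived

print(infer_not_blue(facts))   # {('chair', 'color_is_not', 'blue')}

# The brittleness described below shows up immediately: what exactly counts
# as "red"? What about a chair that is partly red and partly blue? Each edge
# case needs yet another hand-written rule, and the rules soon conflict.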

While this view seems reasonable, systems built in this way are often fragile, and the functionality and versatility that can be achieved are limited.

There are two main problems. First, terms such as "blue", "red", and "chair" can only be vaguely defined, and the ambiguity grows worse as the tasks become more complex.

Second, such logical inferences are rarely universally valid: a given chair may in fact be both red and blue.

More fundamentally, a great deal of thinking and cognition cannot be reduced to the transformation of logical propositions.

This is the main reason why decades of effort to combine computer programming with linguistics failed to produce anything resembling artificial general intelligence.

Nevertheless, some researchers with a particular commitment to symbol systems or linguistics still insist that their specific theory is a prerequisite for general intelligence, and that general intelligence cannot, in principle, be achieved by neural networks or machine learning more broadly, especially if the models are trained only on language.

Since the emergence of ChatGPT, these critics have become louder and louder.

Marcus: did someone @ me? LLM reasoning and language are nothing like humans'

For example, Noam Chomsky, the father of modern linguistics, wrote of large language models: "We know from the science of linguistics and the philosophy of knowledge that they differ profoundly from how humans reason and use language. These differences place significant limitations on what these programs can do, encoding them with ineradicable defects."

Gary Marcus, a cognitive scientist and prominent critic of contemporary AI, says cutting-edge models "are learning how to sound and appear human, but they have no real idea what they are saying or doing."

Marcus acknowledges that neural networks may be part of an eventual AGI solution, but argues that "to build a robust, knowledge-driven approach to AI, we must have the machinery of symbol manipulation in our toolkit."

Marcus (like many others) focuses on finding gaps in the capabilities of cutting-edge models, especially large language models, and often claims that these gaps reflect fundamental flaws of the approach.

These critics argue that without explicit symbols, "statistical" learning alone cannot produce real understanding.

Relatedly, they claim that logical reasoning cannot occur without symbolic concepts, and that "real" intelligence requires such reasoning.

Setting aside the question of whether intelligence always depends on symbols and logic, there is good reason to doubt the claim that neural networks and machine learning are inherently inadequate, because neural networks can do everything a computer can do. For example:

- Neural networks can readily learn discrete or symbolic representations, and such representations emerge naturally during training.

Paper address: https://royalsocietypublishing.org/doi/epdf/10.1098/rsta.2022.0041

- Advanced neural network models can apply sophisticated statistical techniques to data, allowing them to make near-optimal predictions from the data they are given. The models learn how to apply these techniques, and how to choose the best technique for a given problem, without being told explicitly.

Paper address: https://arxiv.org/pdf/2306.04637.pdf

- Stacking multiple neural networks in the right way produces a model that performs the same computation as any given computer program.

Paper address: https://proceedings.mlr.press/v202/giannou23a.html

- Given examples of the inputs and outputs of any function a computer can compute, a neural network can learn to approximate that function (to, say, 99.9% accuracy); a minimal sketch follows the list.

Paper address: https://arxiv.org/pdf/2309.06979.pdf
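As a concrete, if toy, illustration of learning a function from input/output examples alone, the following NumPy sketch trains a tiny two-layer network to compute 3-bit parity; the architecture and hyperparameters are arbitrary choices for the demo, not anything from the cited papers.

# Toy demonstration of learning a function from input/output examples alone:
# a two-layer network trained with plain gradient descent to compute 3-bit
# parity. NumPy only; sizes and learning rate are arbitrary demo choices.
import numpy as np

rng = np.random.default_rng(0)

# All 8 input/output pairs of the 3-bit parity function.
X = np.array([[a, b, c] for a in (0, 1) for b in (0, 1) for c in (0, 1)], dtype=float)
y = X.sum(axis=1) % 2                          # target: parity of the three bits

# Network: 3 inputs -> 16 tanh units -> 1 sigmoid output.
W1 = rng.normal(0.0, 1.0, (3, 16)); b1 = np.zeros(16)
W2 = rng.normal(0.0, 1.0, (16, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for _ in range(10000):
    h = np.tanh(X @ W1 + b1)                   # hidden activations
    p = sigmoid(h @ W2 + b2).ravel()           # predicted P(parity = 1)
    g_out = (p - y).reshape(-1, 1) / len(X)    # cross-entropy gradient at output
    g_W2 = h.T @ g_out;  g_b2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (1.0 - h ** 2)      # backprop through tanh
    g_W1 = X.T @ g_h;    g_b1 = g_h.sum(axis=0)
    W1 -= lr * g_W1; b1 -= lr * g_b1
    W2 -= lr * g_W2; b2 -= lr * g_b2

p = sigmoid(np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
print("accuracy:", float(((p > 0.5) == y).mean()))   # typically 1.0 after training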

We should distinguish between fundamentalist criticism and constructive criticism. Fundamentalist critics say: "To be considered artificial general intelligence, a system must not only pass this test, it must also be built in a particular way."

We disagree with such criticism, on the grounds that the test itself should be sufficient; if it is not, the test should be changed.

Constructive critics, by contrast, say: "I don't think you can get AI to work that way; I think it would be better to do it another way."

Such criticism can help determine the direction of research. If a system can pass well-designed tests, these criticisms will disappear.

A language model can generate captions for an image by linearly projecting the image encoding into the language model's input space.
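A minimal sketch of that connector idea follows; the dimensions are made-up placeholders, not those of any particular model. A learned linear map takes the image encoder's output into the language model's embedding space, and the result is prepended to the text embeddings as one extra "token".

# Minimal sketch of the linear-projection connector described above; all
# dimensions are made-up placeholders, not those of any particular model.
import numpy as np

rng = np.random.default_rng(0)
d_image, d_model, n_text_tokens = 1024, 4096, 12

image_embedding = rng.normal(size=(1, d_image))              # from a frozen vision encoder
text_embeddings = rng.normal(size=(n_text_tokens, d_model))  # from the LM's token embedding table

# The only newly trained piece: a linear map from image space to LM space.
W_proj = rng.normal(scale=0.02, size=(d_image, d_model))
image_token = image_embedding @ W_proj                       # shape (1, d_model)

# The language model then "reads" the image as one extra token before the text
# and is trained (or prompted) to continue with a caption.
lm_input = np.concatenate([image_token, text_embeddings], axis=0)
print(lm_input.shape)                                        # (13, 4096)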

In recent years, a large number of tests have been designed for cognitive tasks related to "intelligence", "knowledge", "common sense" and "reasoning".

These include novel questions that cannot be answered by memorizing the training data and instead require generalization, which is the same proof of understanding we demand of human test-takers when we probe their comprehension or reasoning with questions they did not encounter while studying.

Sophisticated tests can introduce new concepts or tasks to probe a candidate's cognitive flexibility: the ability to learn and apply new ideas on the fly. (This is the essence of in-context learning.)

When AI critics design new tests on which current models still perform poorly, they are doing useful work, although given how quickly newer, larger models are overcoming these hurdles, it might be wise to wait a few weeks before (once again) declaring that AI is "hype".

Why are human beings "special"?

As long as skeptics remain unconvinced by the metrics, they may be reluctant to accept any empirical evidence of AGI.

This reluctance may be driven by a desire to preserve the specialness of the human spirit, just as humanity was once reluctant to accept that the Earth is not the center of the universe and that Homo sapiens is not the pinnacle of biological evolution.

It is true that there are things that are special about human beings, and we should preserve them, but we should not conflate them with general intelligence.

Some argue that anything that counts as artificial general intelligence must be conscious, must have agency, and must be able to experience subjective perceptions or feelings.

A simple line of reasoning goes like this: a basic tool, such as a screwdriver, clearly has a purpose (driving screws), but it cannot be said to have agency of its own; whatever agency there is clearly belongs to the tool's maker or user.

The screwdriver itself is "just a tool". The same reasoning applies to artificial intelligence systems that are trained to perform specific tasks, such as optical character recognition or speech synthesis.

However, systems with general intelligence are hard to classify as mere tools. The skills of frontier models go beyond what their programmers or users imagined. Moreover, because LLMs can be prompted in language to perform arbitrary tasks, can generate new prompts in language, and can indeed prompt themselves ("chain-of-thought prompting"), the question of whether, and when, frontier models have agency deserves more careful consideration. A minimal prompting sketch follows.
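To show what "prompting itself" can look like mechanically, here is a minimal sketch. The generate() function again stands in for any LLM completion call, and the prompt wording is just one common phrasing, not a fixed recipe.

# Minimal sketch of chain-of-thought prompting and simple self-prompting.
# generate() is a placeholder for any LLM completion call; the prompt wording
# is just one common phrasing, not a fixed recipe.

def generate(prompt: str) -> str:
    """Stand-in for a call to a large language model."""
    raise NotImplementedError("connect this to your model of choice")

def chain_of_thought(question: str) -> str:
    """Ask the model to reason step by step before answering."""
    return generate(f"{question}\nLet's think step by step.")

def self_prompt(goal: str) -> str:
    """Have the model write its own next prompt, then answer that prompt."""
    next_prompt = generate(
        f"Goal: {goal}\n"
        "Write the single most useful prompt to give a language model next "
        "in order to make progress toward this goal."
    )
    return generate(next_prompt)

# Example (would run once generate() is connected to a real model):
# print(chain_of_thought("A shirt costs $25 after a 20% discount. What was the original price?"))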

Imagine the many actions Suleyman's "Artificial Capable Intelligence" might take to make $1 million online:

It might research the web to see what is trending, find out which styles sell well on Amazon, generate a series of images and design drawings for similar products, send them to a drop-shipping manufacturer found on Alibaba, and then use email to refine the requirements and negotiate the contract.

Finally, it would create the seller listings and update the marketing materials and product designs based on buyer feedback.

As Suleyman points out, the latest models can theoretically accomplish all of these things, and models that can reliably plan and perform the entire operation may also be on the horizon.

The AI no longer looks like a screwdriver.

Given that there are now systems that can perform general intelligent tasks of this kind, equating agency with consciousness becomes problematic: it would imply either that frontier models are conscious or that agency does not require consciousness.

We do not know how to measure, verify, or falsify the presence of consciousness in an intelligent system. We could simply ask it, but we may or may not believe its answer.

In fact, "just ask" seems a bit like the Rorschach inkblot test: followers of AI perception will receive a positive response, while unbelievers will claim that any affirmative response is either a "parrot".

Or the current artificial intelligence system is a "philosophical zombie" that can act like human beings, but lacks any consciousness or experience "internally".

Worse, the Rorschach test applies to the LLMs themselves: they may answer that they are or are not conscious depending on how they were tuned or prompted. (Both ChatGPT and Bard have been trained to answer that they are not conscious.)

The dispute over consciousness or sentience cannot currently be resolved, because it rests on beliefs that cannot be verified, whether about humans or about AI.

Some researchers have proposed measures of consciousness, but these measures are either grounded in unfalsifiable theories or depend on correlates specific to our own brains.

As a result, such criteria are either arbitrary, or they cannot be assessed in systems that do not share our biological inheritance.

The claim that non-biological systems simply cannot be intelligent or conscious at all (for example, because they are "just algorithms") seems arbitrary, rooted in untestable spiritual beliefs.

Likewise, the idea that pain requires nociceptors may let us make informed guesses about which familiar creatures experience pain, but it is unclear how the idea applies to other neural architectures or other kinds of intelligence.

"what's it like to be a bat? This is a famous question raised by Thomas Nagle (Thomas Nagel) in 1974.

We do not know, and we may never know, what it is like to be a bat, or what it is like to be an AI. But we do have a growing battery of tests for assessing the many dimensions of intelligence.

While it may be worthwhile to seek a more general and rigorous characterization of consciousness or sentience, no such characterization would change what a system can measurably do on any given task. So it is unclear how these concerns can be meaningfully incorporated into a definition of artificial general intelligence.

It would be a more rational choice to separate "intelligence" from "consciousness" and "perception".

What impact will AGI have on human society?

Debates about intelligence and agency easily shade into worries about rights, status, power, and class relations.

Since the Industrial Revolution, tasks regarded as "rote" or "repetitive" have typically been handed to low-paid workers, while programming, at first considered "women's work", rose in intellectual and economic status only once it became male-dominated in the 1970s. Ironically, playing chess and solving calculus problems turned out to be easy even for GOFAI, while manual labor remains a major challenge even for today's most sophisticated AI.

In the summer of 1956, a group of researchers met at Dartmouth to study how to "make machines use language, form abstractions and concepts, solve kinds of problems now reserved for humans, and improve themselves". How would the public have reacted if AGI had arrived "on schedule"?

At the time, most Americans were optimistic about technological progress, and the economic gains from rapidly advancing technology were broadly distributed (though certainly not fairly, especially with respect to race and gender). Despite the looming threat of the Cold War, for most people the future looked brighter than the past.

Today that pattern has reversed: the poor are getting poorer and the rich are getting richer.

When AI is described as "neither artificial nor intelligent", but merely a repackaging of human intelligence, it is hard not to read that criticism through the lens of economic threat and insecurity.

In muddling the debate over what AGI is with the debate over what it should be, humanity seems to be violating David Hume's injunction to do our best to keep questions of "is" separate from questions of "ought".

But this will not do, because the "ought" debate can only be conducted honestly on the basis of an honest account of what is.

AGI is expected to create great value in the next few years, but it will also bring significant risks.

In 2023, the questions we should be asking include: "Who benefits?", "Who is harmed?", "How can we maximize the benefits and minimize the harms?", and "How can we do this fairly and equitably?"

These are pressing issues that should be discussed directly, rather than denying the reality of general artificial intelligence.

Reference:

https://www.noemamag.com/artificial-general-intelligence-is-already-here/
