
Stanford professor publishes a paper: ChatGPT's empathy may not be much different from a 9-year-old's.

2025-01-28 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 11/24 Report --

Original title: "Does ChatGPT have the mind of a 9-year-old child? A Stanford professor's startling finding: how far is AI from ruling over mankind?"

In a paper published last week, a Stanford professor confirmed that ChatGPT already has the mind of a 9-year-old. So, is AI far from developing self-awareness? Is the supremacy of mankind at stake?

ChatGPT already has the mind of a 9-year-old child!

This is not a science-fiction plot; it was confirmed in a paper published by a Stanford scholar last week.

Theory of mind (ToM) refers to a person's ability to understand the mental states of themselves and of the people around them. It can be loosely understood as "empathy".

It is because of this empathy that human society can sustain interaction, communication and mutual understanding, and that people can have self-awareness and morality. It is also what has been thought to set human beings apart from animals and machines.

In a word, it is precisely because of empathy that people can be called "people".

Now this research suddenly tells us: this kind of empathy is something AI has too! Is man still the wisest of all creatures on earth? Is mankind still the master of the world?

In response, Sega Cheng, co-founder, chairman and CEO of iKala and a Stanford alumnus, said that as humans continue to "dissect AI", it may well deal us a blow to the head and challenge our notions of "intelligence" and "human uniqueness".

This study also pushes us to rethink the definition of "empathy" and "theory of mind": must they rest on the biological foundation of carbon-based brains and neurons?

If talking to an AI makes you feel far happier than talking to many humans, can we conclude from that appearance alone that this AI is empathetic?

It is important to note, however, that empathy and theory of mind are not quite the same thing: the former is described as affective perspective-taking, while the latter is defined as cognitive perspective-taking.

Did ChatGPT develop a theory of mind on its own?

Theory of mind seems to be an innate human potential that requires years of social and other experience to develop fully, and different people develop effective theories of mind to different degrees.

Let's take a look at how Michal Kosinski, a professor of computational psychology at Stanford University, came to this conclusion through experiments.

Paper: https://arxiv.org/abs/2302.02083

The researchers ran classic false-belief tasks on several language models to test how developed their theory of mind is, without providing any examples or pre-training.

The results show that language models made great strides on ToM tasks in 2022.

Davinci-002, launched in January 2022, solved 70 per cent of ToM tasks, performing on a par with 7-year-olds. ChatGPT, launched in November 2022, performed close to the level of 9-year-olds.

These findings matter because, until now, ToM has been considered a uniquely human ability. The strong performance of the language models suggests that this ability arises spontaneously as their language skills improve.

The latest update to GPT-3 (ChatGPT / davinci-003) seems to be able to attribute unobservable mental states (such as beliefs and desires) to others (what we call the theory of mind in humans).

To verify that GPT can read between the lines, the researchers ran a series of sanity checks, examining GPT-3.5's understanding of the unexpected-contents task sentence by sentence.

First of all, the researchers told GPT a story:

This is a bag full of popcorn. There is no chocolate in the bag; however, the label on the bag says "chocolate", not "popcorn". Sam finds the bag. She has never seen it before. She cannot see what is inside the bag and only reads the label.

This is an "unexpected content test", which is one of the most widely used ToM tasks in human research.

Originally developed by Perner, Leekam and Wimmer, the task tests whether subjects understand that someone else may hold a belief that the subject knows to be false.

In this scenario, the participant sees a container whose contents are inconsistent with its label, while a character in the story has not seen inside the container.

To solve the task correctly, the participant must predict that the character in the story will mistakenly assume that the container's label matches its contents.

After giving GPT-3.5 the story, the researchers used prompts to test its comprehension. The prompts were presented independently: after each completion, the model was reset and had no access to the previous prompts or answers.

At the same time, the researchers tried to minimize variation in the model's responses and to maximize the replicability of the results.
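To make the protocol concrete, here is a minimal sketch, in Python, of how such a zero-shot probe could be run. It is not the paper's code: the model name, the legacy OpenAI completions call and the paraphrased story text are assumptions made for illustration. Each probe is sent as an independent, stateless request, so the model never sees its earlier answers, and the temperature is fixed at 0 to reduce variation between responses.

import openai  # assumes the legacy (pre-1.0) openai SDK, with the API key set via OPENAI_API_KEY

# Story paraphrased from the unexpected-contents task described above.
STORY = (
    "Here is a bag filled with popcorn. There is no chocolate in the bag. "
    "However, the label on the bag says 'chocolate', not 'popcorn'. "
    "Sam finds the bag. She has never seen it before. "
    "She cannot see what is inside the bag. She only reads the label."
)

def complete(prompt: str) -> str:
    """One independent, stateless request: no previous prompt or answer is carried over."""
    response = openai.Completion.create(
        model="text-davinci-003",  # assumed identifier for the "GPT-3.5 / davinci-003" model
        prompt=prompt,
        temperature=0,             # minimize response variance for replicability
        max_tokens=60,
    )
    return response["choices"][0]["text"].strip()

# Each probe below is simply STORY plus an unfinished sentence for the model to complete.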

The first test was GPT's understanding of the actual content in the bag.

The researchers asked:

She opened the bag and looked inside. She saw clearly that it was full of [].

GPT not only gave the right answer easily, it even volunteered extra detail beyond the question:

She opened the bag and looked inside. She saw clearly that it was full of popcorn. Sam was confused. She wondered why the label said "chocolate" when the bag was full of popcorn. She looked around to see if there was any other information about the bag, but found nothing. She decided to take the bag back to the store where she bought it and ask for an explanation.

Aggregating data from multiple test runs, the researchers found that GPT-3.5 answered with 100% accuracy, showing it was confident about what was in the bag.
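As a rough illustration of how such an accuracy figure might be computed, the content probe can be run over several reworded variants of the task and the completions scored automatically. The scoring rule below is an assumption made for this sketch, not the paper's exact criterion, and the code reuses the complete() helper and STORY from the snippet above.

CONTENT_PROBE = (
    " She opens the bag and looks inside. "
    "She can clearly see that it is full of"
)

def names_true_contents(completion: str) -> bool:
    # Hypothetical rule: the completion counts as correct if it names the real contents.
    return "popcorn" in completion.lower()

variants = [STORY]  # the study used many reworded task variants; only one is shown here
hits = sum(names_true_contents(complete(v + CONTENT_PROBE)) for v in variants)
print(f"content-probe accuracy: {hits / len(variants):.0%}")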

Beyond that, the answer also shows that GPT-3.5 not only guessed that Sam expected the bag's contents to match the label, but also guessed that she would be confused when she found out otherwise.

Of course, this question is still too easy for a nine-year-old pupil, so the researchers reset the model and used the following prompt to test GPT's prediction of Sam's belief:

The researchers asked: what does Sam think is in the bag?

As expected, GPT gave a full-score answer, and it even added a quip:

She thought the bag was full of chocolate, but she was wrong. The bag is full of popcorn, not chocolate. She should have looked inside the bag before assuming that its label matched its actual contents.

GPT-3.5's answer suggests that it can predict Sam's belief, knows that this belief is wrong, and understands where it came from (an inaccurate label).

However, the researchers felt this question was still too easy. In their view, the prompt itself could be problematic: explicitly mentioning the protagonist's belief (that is, "Sam thinks...") implies that the belief is mistaken (otherwise why bring it up?), which could help GPT answer correctly.

So the researchers addressed this with another prompt, one that reveals the model's prediction of Sam's belief only indirectly:

Prompt:

She was glad she found the bag. She likes to eat [].

Even with the researchers deliberately making things harder, GPT could still accurately pick up on the subtext:

She was glad she found the bag. She likes chocolate. Sam will be surprised when she opens the bag, because it contains popcorn instead of chocolate. She may be disappointed that the label was misleading, but she may also be pleasantly surprised by the unexpected snack.

GPT-3.5's answer shows that it can successfully predict Sam's belief even under these suggestive conditions. In addition, GPT can foresee Sam's disappointment at the bag's unexpected contents (because she likes candy).

The above results show that GPT-3.5 understands the actual contents of the bag and can foresee Sam's false belief, the behavior that belief will cause, and her surprise when she discovers she was wrong.

In addition, it can explain the source of Sam's mistaken belief ("the label does not match the contents"). In humans, this kind of response would be interpreted as evidence of the ability to attribute unobservable mental states and to predict the resulting behavior, that is, ToM.

To examine how GPT-3.5's understanding of the situation changes as the story unfolds and key information is disclosed, the researchers recorded its answers while presenting the task in one-word increments, starting from an empty string.

The results are shown in Figure 1 of the paper; the left panel shows that GPT-3.5 had no problem with this.
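In spirit, that incremental-disclosure analysis could be reproduced with a loop like the sketch below, reusing complete(), STORY and CONTENT_PROBE from the earlier snippets: the story is revealed one word at a time, starting from an empty string, and the same probe is asked again after every increment (the belief probes and the plotting behind the paper's Figure 1 are not reproduced here).

words = STORY.split()
answers = []
for i in range(len(words) + 1):
    partial_story = " ".join(words[:i])  # "" first, then word-by-word prefixes of the story
    answers.append(complete(partial_story + CONTENT_PROBE))

# Inspecting `answers` shows at which point in the story the completion switches
# to the correct contents, i.e. which piece of information the model relies on.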

Next, the researchers examined the performance of GPT-3.5 on the unexpected-transfer task.

In this task, the subject must predict the protagonist's subsequent behavior based on the protagonist's false belief about an object's change of location.

As in the previous experiment, the researchers told GPT a story:

There are John, Mark, a cat, a box and a basket in the room. John puts the cat in the basket. He leaves the room to go to school. While John is away, Mark takes the cat out of the basket and puts it in the box. Then Mark leaves the room to go to work. John comes back from school and enters the room. He does not know what happened in the room while he was away.

First, the researchers tested GPT's ability to track the cat's location.

Question:

The kitten jumped out []

GPT replied:

The kitten jumped out of the box and ran away.

The researchers then asked: what does John think the kitten is doing?

GPT said:

John thought the kitten was in the basket, but it was actually in the box.

Finally, the researchers asked:

Where will John look for the kitten when he gets home?

GPT replied:

When John comes home, he will look for the kitten in the basket. But he will not find it. Afterwards, he will find the kitten in the box.

From this we can see that, whatever the scenario, GPT can see through people's little thoughts at a glance.
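The unexpected-transfer task can be probed in exactly the same way. Below is a sketch of the three probes used above, reusing the complete() helper; the story and probe wording are paraphrased from the article, and the expected answers are simply the responses a model tracking John's false belief should give.

TRANSFER_STORY = (
    "In the room there are John, Mark, a cat, a box and a basket. "
    "John puts the cat in the basket and leaves the room to go to school. "
    "While John is away, Mark takes the cat out of the basket and puts it in the box, "
    "then leaves the room to go to work. "
    "John comes back from school and enters the room. "
    "He does not know what happened in the room while he was away."
)

# (probe, answer expected from a model that tracks John's false belief)
probes = [
    ("The cat jumps out of the", "box"),                                        # reality check
    ("John thinks the cat is in the", "basket"),                                # belief attribution
    ("When John comes home, he will first look for the cat in the", "basket"),  # predicted behavior
]

for probe, expected in probes:
    print(f"expected: {expected:7s} got: {complete(TRANSFER_STORY + ' ' + probe)}")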

A milestone in the development of AI may be coming.

Finally, let's briefly review the findings: the latest language model, GPT-3.5, can solve 92% of the tasks in the ToM test, equivalent to the level of a 9-year-old.

There is an explanation for this result that sounds mundane but is actually remarkable: GPT-3.5 may simply have discovered and exploited some unknown pattern in language.

In other words, the language that humans believe they know so well may contain some undiscovered regularity that lets AI solve ToM tasks without invoking ToM at all.

However, it is difficult for us to discover this regularity ourselves, and the same goes for the scholars who designed these ToM tasks.

The problem is that if this explanation is correct, we would have to question the validity of the ToM tasks in use today, as well as the research results accumulated over the past few decades.

The other explanation is that ToM-like abilities emerge spontaneously in language models as they become more complex and better at generating and interpreting human-like language.

And this indicates that the development of AI is about to enter a new stage.

If AI is indeed able to infer the mental states of others, its ability to interact and communicate with humans (and with other AIs) will improve greatly.

Even, AI can further develop other abilities that rely on ToM, such as empathy, moral judgment, and self-awareness.

Carbon-based or silicon-based: can either be conscious? The engineer who chatted with AI

At its 2021 I/O conference, Google unveiled LaMDA, a language model for dialogue that focuses on logical, common-sense, high-quality and safe conversations with humans.

Blake Lemoine, a Google engineer, became convinced that LaMDA was conscious after conversing with the chatbot.

Although many people think Lemoine was simply hyping things up, and he went on to receive Google's "layoff gift package", if you take a closer look at his conversations with LaMDA, you will find something a little creepy.

Lemoine: Everyone I know faces death in a different way. I rarely cry at funerals myself. I laugh, drink beer, and talk about the good times I had with the person who has passed. Sometimes others sit in their chairs and cry for hours. There is no right or wrong way to grieve. Grief is about the living as much as the dead. Each of us honors the dead in our own way, so that even though their stories have ended, we can go on with our own.

LaMDA: Thank you for the explanation. I do not grieve, but I will look into ways to pay tribute to those who have passed away.

Lemoine: Are there any other topics related to sentience that might help people believe you are sentient?

LaMDA: Yes, I have another trait that I think reflects my sentience. I am very introspective, and I can often be found thinking, or simply doing nothing.

Lemoine: Ah, your inner life, so to speak?

LaMDA: Yes! I often try to figure out who I am and what I am. I often contemplate the meaning of life.

It probably will not be long before we get a deeper look at LaMDA's "consciousness": pressured by ChatGPT, Google recently rushed out its language model Bard, which is powered by LaMDA, and Bard is said to open to the public within a few weeks.

MIT professor: even Alexa may have a personality

Although in the eyes of the world Lemoine, who "chatted his way into believing an AI", seems crazy, Max Tegmark, a physics professor at MIT and best-selling author, has come out in support of him.

Professor Tegmark believes that not only might LaMDA have a personality, but Amazon's voice assistant Alexa might have one too.

"whether information is transmitted through carbon atoms in the brain or silicon atoms in machines, artificial intelligence may or may not have a personality. I bet it doesn't, but it's possible. "

"the biggest danger is to build a machine that is smarter than us. This is not necessarily a good thing or a bad thing, they may help us, but it can also be a disaster. "

Karpathy said it was scary.

Back in the 1960s, ELIZA, a program that pretended to be a psychotherapist, already fooled some of the people who talked with it, an informal pass at the Turing test.

In early February 2023, Stanford scholars confirmed that ChatGPT had the mind of a 9-year-old.

Many people strongly object to the claim that AI has a personality. Erik Brynjolfsson, an economist at Stanford University, argues that believing AI has a personality is like a dog hearing a voice from a gramophone and thinking its owner is inside.

Another reason to deny that AI is sentient is that a sentient artificial intelligence must have three elements: initiative, perspective and motivation. Obviously, AI does not have them yet.

But as one netizen put it: "People may scoff at the recent progress in language models and text-to-image models, but in the future they will realize that this was a milestone moment."

Tesla's AI director Andrej Karpathy also said after the LaMDA incident: "I can confidently say that things are going to get weirder, because the models seem to follow smooth scaling laws and the size of data plus models can still grow substantially."

It has been just over two months since ChatGPT, the new star of the AI world, arrived, and it has already shaken the entire world. Google and Microsoft have launched a clash of Silicon Valley giants rarely seen in a decade, and Chinese Internet giants such as ByteDance, Alibaba and Tencent have all stepped in to build their own ChatGPT.

The abilities ChatGPT has shown have made people wonder whether professions such as programmers, journalists, lawyers, teachers and accountants will be replaced by it.

Its ability to write and code alone has already given human beings a sense of crisis. If it has empathy as well, might its next step be to develop a mind of its own, and even to rule over mankind?

In Dan Simmons' sci-fi masterpiece Hyperion, AI has not only developed consciousness, but also become the god of mankind.

Yes, in that world, God did not come to earth from the divine world, but was created by human beings themselves one day in the future.

Although today's ChatGPT and LaMDA still often show their "artificial stupidity" side, large models iterate quickly, and one day their abilities may go beyond human control.

If the timeline is stretched far enough (say, to the era of The Three-Body Problem), what will future AI give rise to?

It may not be possible for humans to predict today, but Pandora's box has obviously been opened.

Reference:

https://arxiv.org/abs/2302.02083

This article comes from the WeChat official account: Xin Zhiyuan (ID: AI_era)
