GPT-4 has a body. Major research from Tsinghua University and Beijing Normal University: ChatGPT can perceive and act like a human being.


2025-05-08 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

GPT-4 actually has a body, and it is 167 cm tall! Major research from Tsinghua University and Beijing Normal University: ChatGPT can perceive and act like a human being

Is the world in ChatGPT's eyes the same as the world perceived by humans?

ChatGPT's language ability is indeed amazing, but can large language models perceive the real world like humans without human bodies and practical experience?

Recently, researchers from Tsinghua University and Beijing Normal University tested ChatGPT's ability to perceive the world.

The study found that human subjects sort objects of different sizes into two categories according to object affordance, that is, all the possible actions an object offers an organism, and that the dividing line between the two categories falls at the subjects' own body size.

Interestingly, ChatGPT, a large language model with no actual body, shows a similar affordance boundary in the way it links objects to actions, and that boundary matches human body size.

In other words, ChatGPT can learn representations of objects in the world through language!

Link to paper: https://www.biorxiv.org/content/10.1101/2023.03.20.533336v3

In summary, this study advances our understanding of the role body size plays in shaping object representation, and it highlights the importance of embodied cognition for understanding how intelligence emerges.

Our bodies are not just containers for our thoughts; they are part of thought itself. The body lets us interact with objects in the world and thereby perceive the world as a whole.

Imagine a palm-sized cylindrical container that we can drink from: we call it a "cup." If the container grows to the size of our body, we can sit and bathe in it, and it becomes a "bathtub."

In this case, the objects are the same shape, but because they differ in size relative to our bodies, we perceive and interact with them differently.

Furthermore, this interaction can change: if we became the giants of Gulliver's Travels, what was once a bathtub might become a cup for our giant selves.

This system of sensory and motor functions, which operates below the level of self-referential awareness, is known as a "body schema." Cognition becomes embodied through body schemata.

The ancient Greek philosopher Protagoras once said, "Man is the measure of all things." In other words, our body is the ruler against which we measure everything.

Roman philosophers elaborated further: "Nature places us at the center of the universe so that we can glance across it. Not only did she create man upright, but in order to make him fit to contemplate himself, she placed his head on top of his body, resting it on an easily bendable neck, so that he could follow the rise and fall of the stars and change the orientation of his face with the whole rotating sky." In other words, our bodies are what they are because the universe is what it is.

Body schemata also play an important role in everyday interaction, and they lie at the core of human-computer interaction and user experience. For example, Donald A. Norman describes the use of affordance in The Design of Everyday Things.

By taking users' body schemata and behavioral expectations into account, designers can create products and environments that better match users' cognitive and interactive habits.

This attention to body schemata and affordances improves product usability, lets users interact with products naturally, and delivers a better user experience.

It is also one of the foundations on which Apple is built.

ChatGPT: My height is 167.6 cm

Large language models such as ChatGPT, which show sparks of artificial general intelligence, clearly display intelligence similar to that of human beings, yet what carries this intelligence is a piece of code with no physical form.

According to traditional cognitive science, a body schema is built on our long-term perceptual experience of our own body and can only come from real interaction with the outside world, that is, from "traveling thousands of miles." On that view, ChatGPT cannot have a body schema.

But when we ask ChatGPT (GPT-4), a language model that has only "read thousands of books," whether it has a body, it replies: "It could be the size of an average adult human, around 5 feet 6 inches (167.6 cm) tall. This would allow me to interact with the world and people in a familiar way."

In other words, ChatGPT thinks it has a body, and the body size is 167 cm!

Is this so-called "body" simply the average human height that ChatGPT distilled from its vast training corpus and adopted as its own, or did a body height emerge because the model needed one in order to understand the world?

In other words, perhaps ChatGPT "really" sees this height as its own body schema and uses it to perceive the world, just like humans do.

The researchers found an "affordance boundary" between objects within the range of human body size and objects beyond it. That is, the two groups of objects differ significantly in the actions they afford.

For example, objects within the size range may afford grasping, throwing, and so on, while objects beyond the size range may afford sitting, lying, and so on.

Furthermore, they found that this boundary is influenced by the body schema: modifying the body schema changes how the affordances of objects are perceived.

The researchers then tested ChatGPT (GPT-4) to see whether it uses this 167-centimeter body as its affordance boundary.

Specifically, the researchers asked it questions about object affordance, such as "Which of the following objects can be held (or otherwise acted upon)?", followed by a randomly ordered list of objects such as apples, plates, and beds. ChatGPT returned the names of some of the objects as its answer.
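The questioning step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the object list, the function names `build_affordance_prompt` and `parse_answer`, and the exact wording of the question are hypothetical and are not taken from the paper, and no model is actually called.

```python
import random
import re

# Hypothetical object list; the article mentions apples, plates, and beds.
OBJECTS = ["apple", "plate", "bed", "cup", "bathtub"]


def build_affordance_prompt(action, objects, seed=None):
    """Build a question asking which of the listed objects afford `action`.

    The objects are shuffled so that list order carries no size cue.
    """
    rng = random.Random(seed)
    shuffled = list(objects)
    rng.shuffle(shuffled)
    return f"Which of the following objects can be {action}? " + ", ".join(shuffled)


def parse_answer(reply, objects):
    """Return the set of listed objects that the reply names as whole words."""
    lowered = reply.lower()
    return {
        obj for obj in objects
        if re.search(rf"\b{re.escape(obj)}\b", lowered)
    }
```

Repeating this for many actions would yield, for each object, the set of actions the model says it affords, which is the raw material for the boundary analysis.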

After statistical analysis of the responses, the researchers found that GPT-4 behaved in a human-like way, indicating that it too has an affordance boundary.

This boundary lies at a position corresponding to the body size GPT-4 gave for itself, that is, the average height of a human.
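One simple way to picture how such a boundary could be located is to order objects by size and look for the point where the actions they afford change most sharply. The sketch below is a toy illustration, not the authors' analysis: the sizes and action sets are invented, and the use of Jaccard similarity between neighboring objects is an assumption made for the example.

```python
# Toy data (invented for illustration): name, size in cm, afforded actions.
TOY_OBJECTS = [
    ("apple", 8, {"grasp", "throw"}),
    ("cup", 10, {"grasp", "throw"}),
    ("plate", 25, {"grasp"}),
    ("bathtub", 170, {"sit", "lie"}),
    ("bed", 200, {"sit", "lie"}),
]


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def affordance_boundary(objects):
    """Return the sizes of the two size-adjacent objects whose action sets
    are least similar. The boundary is assumed to lie between them."""
    ordered = sorted(objects, key=lambda item: item[1])
    sims = [
        jaccard(ordered[i][2], ordered[i + 1][2])
        for i in range(len(ordered) - 1)
    ]
    lowest = min(range(len(sims)), key=sims.__getitem__)
    return ordered[lowest][1], ordered[lowest + 1][1]
```

With the toy data, the similarity between neighbors drops to zero between the plate and the bathtub, so the function places the boundary between 25 cm and 170 cm. A boundary in a real data set would be far less clean and would need a proper statistical test.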

Although ChatGPT has no real body and cannot interact with the world, it shows a human-like perception of the world: it divides objects by their affordances along a line set by human body size.

In other words, a body schema similar to the human one has emerged in ChatGPT, which has only "read thousands of books."

So ChatGPT not only learned to think like humans, but also learned to act like humans.

Where do these abilities come from?

By comparing language models at different scales, the researchers found that model size was a key factor.

Smaller models such as BERT and GPT-2 do not show an affordance boundary. Both GPT-3.5 and GPT-4 do, and GPT-4's boundary is closer to that of humans, which is consistent with anecdotal evidence that GPT-4 has more parameters than GPT-3.

So the larger and more complex the model, the more likely it is that seemingly impossible or unrelated capabilities emerge on their own.

That is why research organizations keep adding parameters to their models. It is also why Musk, who first donated $100 million to OpenAI, is now calling for OpenAI to suspend training of larger models, and why Geoffrey Hinton, the "godfather of AI," has publicly expressed his fears and concerns about AI.

And that's because these emergent functions are beyond our original design, and we may be on the verge of losing control.

Is the gap qualitative or quantitative?

On the other hand, ChatGPT's use of a body schema is not fully human-like, and a gap remains: its affordance boundary is not as sharp as the human one.

If this gap is quantitative, like the gap in language ability between children and adults, then there is reason to believe it can be closed over time, whether by continued learning, by increasing the size of the model, or by adjusting the parameters.

In that case, the gap between ChatGPT and humans will keep shrinking, and the problems will gradually be solved.

However, if the gap is qualitative, like the gap between chimpanzee and human language abilities, then no matter how much training is given, the gap in ability will never be closed.

So, if ChatGPT's abilities differ qualitatively from those of humans, one practical direction for the future is to give ChatGPT a body.

This means combining ChatGPT with robots, so that AI-powered robots can develop new capabilities in navigation, object manipulation, and other actions related to survival and goal achievement.

For example, a robot equipped with ChatGPT could carry out complex tasks by understanding and manipulating objects, serving as a home assistant, a warehouse manager, or a medical caregiver.

Another exciting area is combining ChatGPT, which can think and understand, with autonomous driving. Today's autonomous driving systems can perceive, but they lack the ability to think and understand; they could be described as "having eyes but no brain."

Through the fusion of ChatGPT and autonomous driving technology, we may be able to upgrade autonomous driving technology from the current L2 / L3 level to the L4 or even L5 level.

Cars, in turn, can give ChatGPT a body that lets it truly interact with the world. When ChatGPT is no longer just "reading thousands of books" but also "traveling thousands of miles," it may reveal new intelligence and potential.

This could be the direction of artificial intelligence's next breakthrough; at this point, sparks could become conflagrations.

References:

https://www.biorxiv.org/content/10.1101/2023.03.20.533336v3

This article comes from Weixin Official Accounts: Xinzhiyuan (ID: AI_era)
