Is Auto-GPT, which frees ChatGPT from human prompting, just hype? It runs on its own with no humans needed, and has passed 50,000 stars on GitHub

2025-01-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

Is Auto-GPT a groundbreaking project or an over-hyped AI experiment? This article looks behind the noise and examines the limitations that make Auto-GPT unsuitable for practical applications.

In the past few days, Auto-GPT, a project that lets the most powerful language model, GPT-4, complete tasks entirely on its own, has driven the whole AI community wild.

The one shortcoming of the previously popular ChatGPT is that it needs humans to write the prompts.

One of Auto-GPT's big breakthroughs is that it lets the AI prompt itself; in other words, this AI does not need us humans at all.

In just seven days, it racked up an astonishing number of stars on GitHub (more than 50,000) and attracted the attention of countless open-source developers.

Project address: https://github.com/Torantulino/Auto-GPT?ref=jina-ai-gmbh.ghost.io

Just look at how popular Auto-GPT is: in just a few days, it matched the star count that a popular project had accumulated over almost 11 years.

However, while celebrating Auto-GPT, it is worth stepping back to examine its potential shortcomings and explore the limitations and challenges facing this "AI child prodigy".

Recently, Jina AI CEO Han Xiao published a long article, "Auto-GPT Unmasked: The Hype and Hard Truths of Its Production Pitfalls", which discusses whether Auto-GPT is a groundbreaking project or just another over-hyped AI experiment.

How does Auto-GPT work?

It has to be said that Auto-GPT has made great waves in the AI field. It is like giving GPT-4 memory and a body, enabling it to handle tasks independently, even learn from experience, and continuously improve its performance.

To make it easier to understand how Auto-GPT works, let's break it down with some simple metaphors.

First, imagine that Auto-GPT is a resourceful robot.

Each time we assign it a task, Auto-GPT produces a corresponding plan. If it needs to browse the Internet or use new data, it adjusts its strategy until the task is completed. It's like a personal assistant that can handle all kinds of tasks, such as market analysis, customer service, marketing, and finance.

Specifically, Auto-GPT relies on the following four components to work:

Architecture:

Auto-GPT is built on the powerful GPT-4 and GPT-3.5 language models, which act as the robot's brain and do its thinking and reasoning.

Autonomous iteration:

It's like the ability of a robot to learn from mistakes. Auto-GPT can review its work, build on previous efforts, and use its history to produce more accurate results.

Memory Management:

Integration with vector databases, a memory storage solution, lets Auto-GPT preserve context and make better decisions. It's like equipping the robot with long-term memory, allowing it to remember past experiences.

Versatility:

Auto-GPT's capabilities, such as file operations, web browsing, and data retrieval, give it a wide range of uses. It's like giving the robot a variety of skills so it can handle a broader set of tasks.

However, these tempting prospects may not translate into capabilities that Auto-GPT can actually deliver.

Sky-high costs

If you want to use Auto-GPT in a real production environment, the first obstacle is its high cost.

Because tasks are accomplished through a chain of thought iterations, each step typically uses up all available tokens in order to provide better reasoning and prompting.

However, GPT-4's tokens are not cheap.

According to OpenAI, the GPT-4 model with an 8K context window charges $0.03 per 1,000 tokens for prompts and $0.06 per 1,000 tokens for completions.

And 1,000 tokens correspond to roughly 750 English words.

Let's break down the cost of each step in the chain of thought, assuming each action exhausts the 8,000-token context window, of which 80% is prompt (6,400 tokens) and 20% is completion (1,600 tokens).

Prompt cost: 6,400 tokens × $0.03 / 1,000 tokens = $0.192

Completion cost: 1,600 tokens × $0.06 / 1,000 tokens = $0.096

Therefore, the cost per step is: $0.192 + $0.096 = $0.288

On average, Auto-GPT takes about 50 steps to complete a small task.

Therefore, the cost of completing a single task is: 50 steps × $0.288/step = $14.40
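The arithmetic above can be reproduced in a few lines; this is a minimal sketch using the article's figures (GPT-4 8K pricing, an 80/20 prompt-to-completion split, and roughly 50 steps per task):

```python
# Reproducing the article's cost arithmetic for GPT-4 with an 8K context window.
PROMPT_RATE = 0.03 / 1000      # USD per prompt token
COMPLETION_RATE = 0.06 / 1000  # USD per completion token

def step_cost(prompt_tokens: int, completion_tokens: int) -> float:
    """Cost of one chain-of-thought step at the stated rates."""
    return prompt_tokens * PROMPT_RATE + completion_tokens * COMPLETION_RATE

per_step = step_cost(6400, 1600)  # 80% prompt, 20% completion of 8,000 tokens
per_task = 50 * per_step          # ~50 steps for a small task

print(f"${per_step:.3f} per step, ${per_task:.2f} per task")
# prints "$0.288 per step, $14.40 per task"
```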

Take VueGPT as an example: this is an agent created with Auto-GPT whose goal is to build a website application using Vue JS. Let's look at one step in its chain of thought:

VUEGPT THOUGHTS: Let's start by checking if there are any updates to VueJS. If there are, we can update to the latest version and proceed. Otherwise, we can move on to creating the TODO list website application.
REASONING: Starting with the most updated and stable version of the framework will ensure our project has the latest features, bug fixes and is properly supported. Creating the TODO list website application is our primary goal, so we can move on to that if there are no updates.
PLAN:
- Check for VueJS updates
- Update to latest version if there are updates
- If no updates, move on to creating the TODO list website application
CRITICISM: None
NEXT ACTION: COMMAND = google ARGUMENTS = {'input': 'VueJS latest version update'}
Enter 'y' to authorise command, 'y -N' to run N continuous commands, 'n' to exit program, or enter feedback for VueGPT...

And that assumes it produces a usable result in one pass; if anything needs to be regenerated, the cost is even higher.

From this perspective, Auto-GPT is currently unrealistic for most users and organizations.

Development and production

At first glance, there seems to be nothing wrong with spending $14.40 on a complex task.

For example, suppose we first ask Auto-GPT to create a Christmas recipe. Then, if we ask it for a Thanksgiving recipe, guess what happens?

Yes, Auto-GPT will go through the same chain of thought all over again, meaning we spend another $14.40.

But in fact, the two tasks differ in only one "parameter": the holiday.

Having spent $14.40 to develop a way to create recipes, it is clearly illogical to spend the same amount again just to adjust a parameter.

Imagine playing "Minecraft" and building everything from scratch every time. Obviously that would make the game very boring, and it exposes a fundamental problem with Auto-GPT: it cannot distinguish between development and production.

When Auto-GPT achieves its goal, the development phase is complete. Unfortunately, there is no way to "serialize" the resulting series of operations into a reusable function and put it into production.
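What such "serialization" could look like is easy to sketch in ordinary code. This is a hypothetical illustration (the function names and cached plan are invented, not Auto-GPT's actual API): pay the expensive development run once, then replay the cached plan with new parameters.

```python
# Hypothetical sketch: cache the action plan discovered during "development"
# and replay it with different parameters, instead of paying for a fresh
# chain of thought every time. Not Auto-GPT's actual behavior.
from typing import Callable

plan_cache: dict[str, Callable[..., str]] = {}

def develop_plan(task: str) -> Callable[..., str]:
    # Stand-in for an expensive chain-of-thought run (~$14.40 in the article).
    def recipe_plan(holiday: str) -> str:
        return f"Generated a {holiday} recipe using the cached plan"
    return recipe_plan

def run(task: str, **params: str) -> str:
    if task not in plan_cache:         # pay the development cost only once
        plan_cache[task] = develop_plan(task)
    return plan_cache[task](**params)  # reuse: only the parameter changes

print(run("make a recipe", holiday="Christmas"))
print(run("make a recipe", holiday="Thanksgiving"))  # no second development run
```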

Therefore, every time a user wants to solve a problem, they must start again from the beginning of development, which is not only time-consuming and laborious, but also expensive.

This inefficiency calls into question the usefulness of Auto-GPT in real-world production environments and highlights its limitations in providing sustainable, cost-effective solutions at scale.

The quagmire of loops

Still, if $14.40 actually solved the problem, it would be worth it.

But the problem is that in actual use, Auto-GPT often gets stuck in an infinite loop.

So why does Auto-GPT fall into these loops?

To understand this, think of Auto-GPT as relying on GPT to solve tasks using a very simple programming language.

Whether a task is solved successfully depends on two factors: the range of functions available in that programming language, and GPT's divide-and-conquer ability, that is, how well GPT can decompose the task into that predefined language. Unfortunately, GPT falls short on both counts.

The limited functionality Auto-GPT provides can be seen in its source code: it offers functions for searching the web, managing memory, interacting with files, executing code, and generating images. This limited feature set narrows the range of tasks that Auto-GPT can perform effectively.
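Such a fixed command set can be pictured as a small dispatch table. This is an illustrative sketch, not the project's actual code; the command names and stub behavior are assumptions made for the example. Anything outside the table is simply impossible for the agent.

```python
# Illustrative sketch of a fixed command set like the one described above;
# the agent can only act through these predefined functions.
def google_search(query: str) -> str:
    return f"search results for {query!r}"       # stub for a real web search

def write_file(path: str, text: str) -> str:
    return f"wrote {len(text)} bytes to {path}"  # stub for a real file write

COMMANDS = {
    "google": google_search,
    "write_to_file": write_file,
}

def execute(command: str, *args: str) -> str:
    if command not in COMMANDS:  # anything outside the table cannot be done
        return f"Error: unknown command {command!r}"
    return COMMANDS[command](*args)

print(execute("google", "VueJS latest version"))
print(execute("fly_a_drone"))  # outside the feature set -> error
```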

In addition, GPT's decomposition and reasoning ability is still limited. Although GPT-4 is a significant improvement over GPT-3.5, its reasoning is far from perfect, which further limits Auto-GPT's problem-solving ability.

This situation is similar to trying to build a complex game like StarCraft in Python. Although Python is a powerful language, decomposing StarCraft into Python functions is challenging.

In essence, the combination of the limited feature set and GPT-4's limited reasoning ability leads to this quagmire of loops, leaving Auto-GPT unable to achieve the expected results in many cases.

The difference between humans and GPT

Divide and conquer is key to Auto-GPT. But although GPT-3.5/4 has made remarkable progress over its predecessors, its reasoning still falls short of human level when applying the divide-and-conquer method.

Incomplete problem decomposition:

The effectiveness of divide and conquer depends largely on the ability to decompose a complex problem into smaller, manageable subproblems. Human reasoning can usually find multiple ways to decompose a problem, while GPT-3.5/4 may not have the same adaptability or creativity.

Difficulty identifying appropriate base cases:

Humans can intuitively choose appropriate base cases that lead to efficient solutions. In contrast, GPT-3.5/4 may struggle to determine the most effective base case for a given problem, which can significantly affect the overall efficiency and accuracy of the divide-and-conquer process.

Insufficient understanding of problem context:

While humans can draw on domain knowledge and contextual understanding to better handle complex problems, GPT-3.5/4, limited by its pre-trained knowledge, may lack the contextual information needed to solve some problems effectively by divide and conquer.

Handling overlapping subproblems:

Humans can usually recognize when they are solving overlapping subproblems and strategically reuse previously computed solutions. GPT-3.5/4, on the other hand, may not have the same awareness and may redundantly solve the same subproblem many times, resulting in inefficient solutions.
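Reusing overlapping subproblems is exactly what memoization does in ordinary programming. A standard Fibonacci example shows the contrast: with a cache, each subproblem is solved exactly once instead of being redundantly recomputed.

```python
# Memoization: the standard technique for reusing overlapping subproblems.
from functools import lru_cache

calls = 0  # count how many times the function body actually runs

@lru_cache(maxsize=None)
def fib(n: int) -> int:
    global calls
    calls += 1
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# Without the cache, fib(30) would run the body over a million times;
# with it, each of the 31 subproblems (n = 0..30) is solved exactly once.
print(fib(30), calls)  # prints "832040 31"
```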

Vector DB: excessive solution

Auto-GPT relies on vector databases for faster k-nearest-neighbor (kNN) search. These databases retrieve previous chains of thought and incorporate them into the context of the current query, giving GPT a memory effect.

However, given Auto-GPT's constraints, this approach has been criticized as excessive and an unnecessary use of resources. The main argument against vector databases comes from the cost constraints of Auto-GPT's chain of thought.

A 50-step chain of thought costs $14.40, and a 1,000-step chain costs far more. As a result, the memory size, i.e. the length of the chain of thought, rarely exceeds four digits. In that case, an exhaustive nearest-neighbor search (a dot product between a 256-dimensional vector and a 10,000 × 256 matrix) turns out to be efficient enough, taking less than a second.

By contrast, each GPT-4 call takes about 10 seconds to process, so it is GPT, not the database, that actually limits the system's throughput.

Although vector databases may have advantages in certain scenarios, implementing one in Auto-GPT to speed up kNN "long-term memory" search seems an unnecessary luxury and an over-engineered solution.

The birth of the agent mechanism

Auto-GPT introduces a very interesting concept: it allows agents to be spawned to delegate tasks.

This mechanism is still in its infancy and its potential has not yet been fully tapped, but there are many ways to enhance and extend the current agent system, opening up possibilities for more efficient and dynamic interactions.

Asynchronous agents can significantly improve efficiency

One potential improvement is the introduction of asynchronous agents. By adopting the async/await pattern, agents could operate concurrently without blocking each other, significantly improving the overall efficiency and responsiveness of the system. This idea is inspired by modern programming paradigms that use asynchronous approaches to manage multiple tasks at once.
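The async/await pattern mentioned above looks like this in Python. This is a minimal generic sketch, not Auto-GPT's actual agent API; the agent names and sub-tasks are invented, and `asyncio.sleep` stands in for a slow LLM or tool call.

```python
import asyncio

# A minimal sketch of asynchronous agents (hypothetical, not Auto-GPT's API):
# each agent runs concurrently instead of blocking the others.
async def agent(name: str, subtask: str, delay: float) -> str:
    await asyncio.sleep(delay)  # stands in for a slow LLM or tool call
    return f"{name} finished: {subtask}"

async def main() -> list[str]:
    # asyncio.gather runs both agents concurrently and preserves order
    return await asyncio.gather(
        agent("researcher", "collect sources", 0.2),
        agent("writer", "draft summary", 0.1),
    )

print(asyncio.run(main()))
```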

Another promising direction is to enable communication between agents. By allowing agents to communicate and collaborate, they could solve complex problems more effectively. This is analogous to inter-process communication (IPC) in programming, where multiple threads or processes share information and resources to achieve a common goal.
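The IPC analogy in a minimal form: two workers sharing their results through a common queue. This is a generic sketch of the idea, not an Auto-GPT feature; the agent names and messages are invented for illustration.

```python
import queue
import threading

# Generic sketch of agents sharing results through a queue, IPC-style.
results: "queue.Queue[str]" = queue.Queue()

def worker(name: str, finding: str) -> None:
    results.put(f"{name}: {finding}")  # share the result with other agents

threads = [
    threading.Thread(target=worker, args=("agent-A", "found pricing data")),
    threading.Thread(target=worker, args=("agent-B", "drafted the report")),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Collect both shared results (sorted, since thread order is not guaranteed).
collected = sorted(results.get() for _ in range(2))
print(collected)
```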

Generative agents are the future

With the continued development of GPT-driven agents, the future of this innovative approach looks very bright.

New studies, such as "Generative Agents: Interactive Simulacra of Human Behavior", emphasize the potential of agent-based systems in simulating credible human behavior.

The generative agents proposed in the paper can interact in complex and fascinating ways: forming opinions, initiating conversations, and even independently planning and participating in activities. This work further supports the argument that the agent mechanism has a bright future in AI development.

By embracing the paradigm shift to asynchronous programming and promoting communication between agents, Auto-GPT could open up new possibilities for more efficient and dynamic problem solving.

Integrating large language models, computation, and interactive agents could be achieved by adopting the architecture and interaction patterns introduced in the "Generative Agents" paper. This combination has the potential to revolutionize how tasks are assigned and executed within AI frameworks, and to produce more realistic simulations of human behavior.

The development and exploration of agent systems can greatly promote the development of AI applications and provide more powerful and dynamic solutions to complex problems.

To sum up, the heated debate around Auto-GPT raises important questions about the state of AI research and the role public understanding plays in fueling hype around emerging technologies.

As shown above, the limitations of Auto-GPT's reasoning capabilities, its overuse of vector databases, and the early state of its agent mechanism show that it is still a long way from being a practical solution.

The hype around Auto-GPT reminds us that a superficial understanding can inflate expectations and ultimately distort perceptions of AI's true capabilities.

Having said that, Auto-GPT does point out a promising direction for the future of AI: generative agent systems.

Finally, Han Xiao concluded: "Let's learn from the hype around Auto-GPT and foster a more nuanced and informed conversation about AI research."

In this way, we can take advantage of the transformative power of generative agent systems to continue to push the boundaries of AI capabilities and shape a future in which technology truly benefits mankind.

Reference:

https://jina.ai/news/auto-gpt-unmasked-hype-hard-truths-production-pitfalls/

This article comes from the WeChat official account: Xin Zhiyuan (ID: AI_era)
