

OpenAI suffered a glorious defeat in the Dota 2 game

2025-03-29 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/03 Report--

Just last week, humans struck back against the machines, and actually beat them at a video game.

At The International, Dota 2's premier tournament, two professional teams defeated artificial intelligence bots developed by OpenAI, a research laboratory co-founded by Elon Musk. Dota 2 is a hugely popular and fiercely complex competitive computer game, and the matches served as a touchstone for artificial intelligence: the latest measure of how far human-made machine intelligence has come.

Artificial intelligence technology has made some significant advances. The most striking example of recent years was DeepMind's AlphaGo defeating the Go world champion, an achievement experts had believed was at least a decade away. More recently, however, researchers have turned to video games as the next challenge. Although video games lack the prestige of Go and chess, they are actually much harder for machines to play: players hide information from each other, the complex game environment changes constantly, and the strategic thinking involved cannot be easily simulated. In other words, video games are closer to the real-world problems we want artificial intelligence to solve.

OpenAI's defeat is only a temporary setback in the progress of artificial intelligence.

Dota 2 is a very popular artificial intelligence testbed, and OpenAI has the best Dota 2 bots. But last week, OpenAI lost. So what happened? Have we reached some limit of artificial intelligence's abilities? Does this mean some skills are simply too complex for computers?

The answer is no. Machine learning researcher and Dota 2 fan Stephen Merity says this is only a "hurdle": machines will eventually conquer human gamers, and OpenAI may yet upend people's expectations. But first you need to understand why the humans won, and what OpenAI's goals are; those goals remain useful even in defeat. The outcome tells us what artificial intelligence can and cannot do, and hints at what comes next.

A Dota 2 screenshot. This is a fantasy competitive battle game in which two teams of five fight to destroy each other's bases. The game is very complicated, and a match usually lasts more than 30 minutes.

Learning like a robot: if at first you don't succeed

First, a look at last week's games. These bots were created by OpenAI, which, as part of its broad research remit, hopes to develop artificial intelligence that "benefits all mankind". That mission justifies many different lines of technical research and attracts some of the best scientists in the field. The lab says that by training its Dota 2 bot team, known as OpenAI Five, it hopes to develop systems that can "deal with the complexities and uncertainties of the real world".

The five bots (operated independently but trained with the same algorithm) learn to play Dota 2 through reinforcement learning. This is a common training method, basically trial and error at massive scale (it has its weaknesses, but it can also produce incredible results, including AlphaGo). Instead of programming the bots with the rules of Dota 2, engineers drop them straight into the game and let them work things out on their own. OpenAI's engineers speed up the process by rewarding the bots for certain accomplishments, such as killing an opponent or winning a game, but that's it.
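The trial-and-error loop described above can be sketched in a few lines. This toy example is purely illustrative: the action names and the hand-coded reward are assumptions, and OpenAI Five's real training uses large-scale reinforcement learning across thousands of machines, not anything this simple.

```python
import random

random.seed(0)  # make this illustrative run deterministic

ACTIONS = ["attack", "retreat", "farm"]  # hypothetical action set

def play_episode(policy):
    """Roll out one short toy episode, returning (action, reward) pairs."""
    trace = []
    for _ in range(10):
        # Pick the action with the highest learned score, plus a little
        # noise so the bot still explores early in training.
        action = max(ACTIONS, key=lambda a: policy[a] + random.gauss(0, 0.1))
        # Hand-coded stand-in for the game's reward signal: "attack"
        # (think: kills, wins) pays off 80% of the time, the rest never do.
        reward = 1.0 if action == "attack" and random.random() < 0.8 else 0.0
        trace.append((action, reward))
    return trace

def train(episodes=500, lr=0.05):
    policy = {a: 0.0 for a in ACTIONS}  # starts out effectively random
    for _ in range(episodes):
        for action, reward in play_episode(policy):
            # Nudge the taken action's score toward the reward it earned.
            policy[action] += lr * (reward - policy[action])
    return policy

policy = train()
print(policy)
```

After training, "attack" carries by far the highest score: the bot has associated that behavior with reward purely through trial and error, exactly the dynamic the paragraph describes.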

This means the bots act randomly at the start of training and, over time, learn to associate certain behaviors with rewards. As you might guess, this is a very inefficient way to learn. To compensate, the bots play at accelerated speed, accumulating the equivalent of 180 years of human play each day. As Greg Brockman, chief technology officer and co-founder of OpenAI, put it earlier this year: if mastering a skill takes 12,000 to 20,000 hours of practice, the bots are living through "a lifetime of 100 people" every day.
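The figures above can be cross-checked with quick arithmetic, using only the article's own numbers:

```python
# Sanity check of the training-speed figures quoted above.
HOURS_PER_YEAR = 365 * 24

# The bots accumulate the equivalent of ~180 years of play per day:
experience_hours_per_day = 180 * HOURS_PER_YEAR  # = 1,576,800 hours

# If mastering a skill takes 12,000-20,000 hours of practice, that daily
# experience corresponds to roughly 79-131 "masteries" per day -- on the
# order of Brockman's "lifetime of 100 people" every day.
low = experience_hours_per_day / 20_000
high = experience_hours_per_day / 12_000
print(f"{low:.0f} to {high:.0f} skill-masteries per day")
```

The quoted numbers are internally consistent: 180 years of hours divided by a 12,000-20,000-hour apprenticeship lands right around one hundred.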

Part of the reason it takes so long is that Dota 2 is so complex, far more complex than any board game. Two five-player teams face off on a map full of unpredictable characters, obstacles, and destructible buildings, all of which affect the flow of battle. Players must combine their forces to wear down and destroy their opponents. They can earn or buy hundreds of items to improve their abilities, and each player's hero (chosen from a pool of more than 100 characters) has its own unique moves and attributes. Every Dota 2 match is like a small ancient war: a fight for territory and the defeat of the enemy.

Processing all of this game data fast enough is a huge challenge in itself. To train its algorithms, OpenAI needed enormous processing power: about 256 GPUs and 128,000 CPU cores. This is why experts often discuss OpenAI Five as an engineering project as much as a research project: just getting the system to work properly is hard, let alone beating human players.

"In terms of the complexity that modern data-driven AI methods can handle, OpenAI Five is more impressive than DQN or AlphaGo," notes Andrey Kurenkov, a doctoral student in computer science at Stanford University. Kurenkov says that while those older projects introduced important and novel ideas at a purely research level, OpenAI Five mostly deploys existing techniques at a previously unthinkable scale. Both the scale and the stakes are huge.

Earlier this year, OpenAI Five beat a team of amateur players as a benchmark of its abilities.

The bots still lack a game plan

But setting the engineering aside: the bots lost these two games, so are they simply not good enough? The answer: they are still very good.

Over the past year, the bots have gradually mastered more complex versions of the game, starting with one-on-one matches and finally reaching full 5v5 play. For The International, some of the remaining restrictions were lifted. Notably, the bots no longer had invulnerable couriers (the NPCs that deliver items to players). These had been an important pillar of their play style, because a steady supply of healing potions let them mount continuous attacks. In these games, they had to worry about their supply lines being cut.

Whether the bots have mastered long-term strategy is the key question.

Although the two games are still being analyzed, the initial consensus is that the bots played well, that both sides had their strengths and weaknesses, and that the human players were able to exploit the bots' weaknesses to gain an advantage.

Both games were played at a very high level: the humans led first, then the bots, and finally the humans won. But in both games, once the human players built a sizable advantage, the bots found it hard to recover. Game commentators speculated that this may be because the AI prefers "a 1-point gain with 90% certainty over a 50-point gain with 51% certainty". (The same trait is evident in AlphaGo's play style.) This means OpenAI Five tends to pursue stable but predictable wins; once the bots lose their lead, they cannot take the risks needed to claw back victory.
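The commentators' guess can be restated in expected-value terms; the numbers below simply plug in the two options from the quote:

```python
# Expected value of the two options from the commentators' quote.
safe_ev = 0.90 * 1    # "a 1-point gain with 90% certainty"
risky_ev = 0.51 * 50  # "a 50-point gain with 51% certainty"

print(safe_ev, risky_ev)
# A pure expected-value maximizer would take the risky play (25.5 >> 0.9).
# Preferring the safe one suggests the system optimizes for something
# like win probability, or a heavily risk-averse objective, rather than
# raw point expectation -- which would explain why, once behind, it
# cannot bring itself to gamble its way back into the game.
```

This is only an interpretation of the commentators' speculation, not a claim about OpenAI Five's actual objective function, which the article itself says cannot be inspected directly.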

OpenAI Five's second match at The International

But that is just a guess. As with many applications of artificial intelligence, it is impossible to know the exact thought process behind the bots' play. All we know is that the bots perform well over short horizons in the game, but competing with humans over the long term is much trickier.

OpenAI Five's judgment is very accurate: the bots actively pick off targets with spells and attacks, and are usually a threat to any opponent they meet. Mike Cook, an artificial intelligence game researcher at Falmouth University and an avid Dota player who broadcast the matches live, describes the bots' style as "hypnotic... they act precisely and clearly". "Normally, after winning a fight, human players relax their guard a little and expect the enemy team to retreat and regroup," Cook said. "But the bots don't do that. If they see a chance to win, they keep attacking."

Over the long game, though, the bots seemed to stumble, struggling with plays whose advantage only appears 10 or 20 minutes later. In the second of the two games, they faced a team of Chinese professionals who chose an asymmetrical strategy: one player farmed resources to continuously strengthen himself while the other four attacked or harassed the bots. The bots never seemed to notice what was happening, and by the end of the game the human team had a super-powered player who wiped out the AI opponents. "This is the way humans play Dota. But for the bots, this counts as extremely long-term planning."

This strategic problem matters not just for OpenAI but for artificial intelligence research generally. The lack of long-term planning is often seen as a major flaw of reinforcement learning, because AI trained this way tends to emphasize immediate rewards over long-term returns. That is because it is hard to build a reward system for long-horizon behavior. How do you train a bot to hold a powerful spell until the enemy team groups up, when it is impossible to predict when that will happen? Do you give it a small reward just for not casting the spell? Then what if it decides never to cast it at all? And this is just a basic example: a Dota 2 match usually lasts 30 to 45 minutes, and players must constantly weigh which actions will lead to long-term success.
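The credit-assignment problem described above can be made concrete with discounting, the standard way reinforcement learning weighs future rewards. The one-decision-per-second rate below is an assumption chosen for illustration:

```python
# Present value of a delayed reward under geometric discounting: a reward
# r that arrives after n decision steps is worth r * gamma**n today.
def present_value(reward, delay_steps, gamma):
    return reward * gamma ** delay_steps

# Assume (hypothetically) one decision per second; a game-winning reward
# 30 minutes away is then 1800 steps in the future.
delay = 30 * 60
for gamma in (0.99, 0.999, 0.9999):
    print(gamma, present_value(1.0, delay, gamma))
# Unless gamma is pushed very close to 1, a reward half an hour away is
# worth almost nothing at decision time, so the agent never learns to
# sacrifice now for a payoff then.
```

With gamma = 0.99 the delayed reward is worth about a hundred-millionth of its face value; only at gamma = 0.9999 does it retain most of its weight. This is why long matches like Dota 2 strain standard reinforcement learning.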

However, it is important to stress that the bots' behavior is not just carelessness or short-sighted reward-seeking. The neural network controlling each player has memory components and learns genuine strategies; it responds to rewards by weighing future returns as well as immediate gains. In fact, OpenAI says its agents plan further ahead than any comparable system, with a "reward half-life" of 14 minutes (roughly a measure of how long a bot will wait for a future return).
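The quoted "reward half-life" can be translated into a conventional discount factor. The decision rate below is an assumed figure for illustration, not a published OpenAI Five parameter:

```python
import math

# Convert a reward half-life (the time for a future reward's weight to
# halve) into a per-step discount factor gamma.
def gamma_from_half_life(half_life_s, decisions_per_s):
    steps = half_life_s * decisions_per_s
    return 0.5 ** (1.0 / steps)

# The article quotes a 14-minute half-life; assume one decision per second.
gamma = gamma_from_half_life(14 * 60, 1.0)
print(f"gamma = {gamma:.6f}")

# Check: after 14 minutes' worth of steps, the weight has halved.
assert abs(gamma ** (14 * 60) - 0.5) < 1e-9
```

The point is that a 14-minute half-life corresponds to a discount factor extraordinarily close to 1, which is exactly the regime where reinforcement learning is hardest to keep stable.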

Kurenkov has written a number of articles on the limitations of reinforcement learning. He says the matches show that reinforcement learning can handle situations more complex than most AI researchers had thought. But, he adds, the losses also show the need for new systems that can manage long-term reasoning. (Unsurprisingly, OpenAI's chief technology officer disagrees.)

Unlike the result of the games, there is no clear verdict here. The disagreement over the bots' success reflects bigger unsolved problems in artificial intelligence. As researcher Julian Togelius pointed out on Twitter: "How do we begin to distinguish between a long-term strategy and behavior that merely looks like a long-term strategy? Does it matter?" What we do know is that, in this particular arena, artificial intelligence cannot yet surpass humans.

Dota 2 offers more than 100 playable characters with a wide variety of abilities, and artificial intelligence has not fully mastered them.

An unfair competitive environment

Arguing over the bots' ingenuity is one thing, but OpenAI Five's participation in Dota 2 competitions also raises a more fundamental question: why hold these events at all?

Take the comments of Gary Marcus, a critic of the limitations of contemporary artificial intelligence. In the run-up to last week's matches, Marcus pointed out on Twitter that the contest was unfair to the human players. Unlike human gamers (or some other AI systems), the bots don't actually watch the screen to play. Instead, they perceive the game through Dota 2's "bot API", an interface that describes the game state as roughly 20,000 numbers, including every player's position, health, spells, and attack timings.
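To give a feel for what "20,000 numbers describing the game state" means, here is a hypothetical sketch of flattening structured game data into one numeric observation vector. The field names and sizes are illustrative assumptions, not Valve's actual bot API:

```python
from dataclasses import dataclass

# Hypothetical per-unit state; the real API exposes far more fields.
@dataclass
class UnitState:
    x: float                 # map position
    y: float
    health: float
    attack_cooldown: float   # seconds until the unit can attack again

def flatten(units):
    """Concatenate per-unit features into one flat observation vector,
    the kind of numeric input a learning system consumes directly."""
    obs = []
    for u in units:
        obs.extend([u.x, u.y, u.health, u.attack_cooldown])
    return obs

units = [UnitState(10.0, 20.0, 450.0, 0.0),
         UnitState(-5.0, 3.0, 560.0, 1.2)]
obs = flatten(units)
print(len(obs))  # 4 numbers per unit; with every hero, creep, and
                 # building on the map, the full vector runs to thousands
```

Because the bots read such a vector directly, they skip the vision problem entirely, which is precisely Marcus's complaint in the next paragraph.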

As Marcus noted, this neatly sidesteps the challenging problem of visual perception and gives the bots a huge advantage. They don't have to scan the map to find the opposing team, or watch the interface to see whether their most powerful spells are ready. They don't have to guess an opponent's health or estimate distances; they simply know.

But is this cheating?

There are several ways to answer. First, OpenAI could build a vision system that reads pixels and recovers the same information the bot API provides. (The main reason it hasn't is that doing so would be extremely resource-intensive.) Whether that would even work is hard to say until someone actually tries. But it may not matter. The more important question may be: can there ever be fair competition between humans and machines at all? If we insisted on symmetry, would we also need to equip OpenAI Five with robotic hands to operate a mouse and keyboard?

These questions sound a bit silly, but they highlight how difficult it is to establish a truly level playing field between humans and computers. Perhaps no such thing exists: do we need machines to think like humans any more than we need airplanes to fly like birds? As AI game researcher Cook put it: "Of course computers are better than humans in some ways. That's why we invented computers."

"Maybe we need to think more deeply about why these events are held," Brockman said. "There is more to it than games. That is not the reason we play Dota. We do this because we think we can develop artificial intelligence technologies that will power human progress in the coming decades."

That ambition is real. The system used to train OpenAI Five, called Rapid, is already being used in other projects; OpenAI has used it, for example, to train robotic hands to manipulate objects with human-like dexterity. Artificial intelligence has its limits, and Rapid is no omnipotent algorithm. But the general principle holds: the work required to achieve any single goal, such as defeating humans at a video game, helps drive the development of artificial intelligence as a whole.

South Korean Go player Lee Sedol was defeated by AlphaGo in 2016, but he learned some new skills from the encounter.

It can also help the humans being challenged by the machines. Perhaps the most intriguing part of AlphaGo's defeat of the Go world champion is that, although Lee Sedol lost to an artificial intelligence system, he and the rest of the Go community also learned a great deal from it. AlphaGo's style of play upended centuries of accepted wisdom, its moves are still being studied, and after his match against AlphaGo, Lee Sedol went on to win consecutive games against other human players.

The same thing is already happening in the Dota 2 world: players are studying OpenAI Five's games to discover new tactics and moves. At least one previously undiscovered game mechanic, which lets players quickly replenish certain items while away from their opponents, was uncovered by the bots and will benefit human players. As Merity said: "I really want to sit down and watch these games so I can learn new strategies. And the people studying them will say, 'this is what we need to bring into our own play.'"

This kind of AI-assisted training may become more common in the future. In a way, it seems like an act of generosity: the bots not only surpass human abilities, they hand something back.

Of course, that's not quite right. Artificial intelligence is just another tool humans have invented to teach themselves. But that may be exactly why we play: for human players and machines alike, this is a profound learning experience.
