AI can already teach itself to play games by watching streamers' videos


Updated 2026-02-15. From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

At the end of June, OpenAI, a well-known technology company, released a paper on an AI technique called "video pretraining" (VPT: Video PreTraining).

The results of the study are encouraging. After watching more than 70,000 hours of Minecraft video, the AI in the study learned most of the skills needed to survive: swimming, hunting, building houses, mining, and even raiding villages.

Although some of its behavior is still hard for humans to understand, it performs much better than many comparable AIs.

After finding something, the AI excitedly rolled up the ceiling.

Of course, compared with the mountain of code behind it and baffling technical terms such as "inverse dynamics model", we ordinary players may care more about when such a smart and entertaining AI will actually make it into the game.

"give me one, too."

one

There is no need to wait: AI is walking into ordinary homes right now.

OpenAI's model has so far only been submitted to MineRL, a competition dedicated to Minecraft AI research. But within days of the paper's publication, another AI with similar capabilities appeared online. More importantly, its research team put the code on GitHub for everyone to download and study.

MineDojo's GitHub page

The project, called MineDojo, was developed by Nvidia engineers and is also trained by watching online videos, but unlike OpenAI's model, it draws on a much larger database.

MineDojo collected 730,000 YouTube gameplay videos, more than 7,000 wiki pages, and even millions of Reddit comments related to Minecraft.

The point of going "Internet scale" is, of course, to help the AI understand what words such as "build" and "survive" mean in a human context. In tutorial videos, YouTubers teach viewers where to start, where to find temples, and how to beat the final boss, the Ender Dragon.

For AI, this is a good "online lesson".

This is made possible by a learning algorithm called MineCLIP. It helps the AI connect a streamer's commentary with the actions shown in the video, which is how the training works; likewise, the trained AI can understand tasks assigned directly by the player.
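
The article does not describe how MineCLIP works internally, so the following is only a rough illustration of the general idea of linking text to video clips: both are turned into vectors, and the closest pair is treated as a match. The vectors, clip names, and the `best_clip` helper are all made up for this sketch and are not part of MineDojo.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-written embeddings; a real system would learn these.
clip_embeddings = {
    "clip_chopping_tree": [0.9, 0.1, 0.0],
    "clip_swimming": [0.0, 0.2, 0.9],
}
text_embeddings = {
    "collect wood": [1.0, 0.0, 0.1],
    "swim across the lake": [0.1, 0.1, 1.0],
}

def best_clip(prompt):
    """Return the name of the clip whose embedding is closest to the prompt."""
    text_vec = text_embeddings[prompt]
    return max(clip_embeddings, key=lambda c: cosine(text_vec, clip_embeddings[c]))

print(best_clip("collect wood"))
print(best_clip("swim across the lake"))
```

The same similarity score could, in principle, tell an agent how well what it is doing matches the instruction it was given.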

This is the most interesting part of MineDojo. Its engineers have prepared 3,000 instructions that can be given directly to the AI. Some are programmatic tasks, such as "survive three days" or "collect two pieces of wood", which can be measured objectively with numbers and nouns; others are abstract tasks, such as "build a beachfront villa".
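
To make the idea of an objectively measurable task concrete, here is a minimal sketch. It does not use MineDojo's actual API; `GameState`, `ProgrammaticTask`, and the field names are hypothetical stand-ins for whatever the real simulator exposes.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class GameState:
    """Hypothetical snapshot of the game; field names are illustrative only."""
    days_survived: int
    inventory: Dict[str, int]

@dataclass
class ProgrammaticTask:
    """A task whose success can be checked with numbers and item names."""
    prompt: str
    is_done: Callable[[GameState], bool]

survive_three_days = ProgrammaticTask(
    prompt="survive three days",
    is_done=lambda s: s.days_survived >= 3,
)
collect_two_wood = ProgrammaticTask(
    prompt="collect two pieces of wood",
    is_done=lambda s: s.inventory.get("wood", 0) >= 2,
)

state = GameState(days_survived=1, inventory={"wood": 2})
print(survive_three_days.is_done(state))  # False: only one day has passed
print(collect_two_wood.is_done(state))    # True: two pieces of wood collected
```

An abstract task has no such check to write down, which is why it has to be judged some other way.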

It may be hard for an AI to understand what "beautiful", "beachfront" and "villa" mean, but after parsing videos and searching for keywords in players' comments, it can reach the goal in a decent way most of the time.

During these tasks, players can order the AI to "round up cattle and sheep", "go to the swamp to find chickens", or "live as long as possible", or simply ask it to search for an undersea temple. Because it learned from language commonly used on the Internet, the AI is also quite good at picking up certain distinctly human kinds of humor.

Compared with the OpenAI model, MineDojo may be less technically demanding. After all, it hooks directly into the game's interface, and controlling the AI's actions with in-game data is much easier. OpenAI, by contrast, built a model of human actions from scratch, and its commands directly simulate human keyboard and mouse input.

MineDojo also sometimes has to modify game data to reach its goal. To defeat the Ender Dragon, for example, it could only win by "cheating": making the dragon stand still and take the beating.

Cruel videos of cannibalism

Even so, MineDojo still demonstrates that AI can learn from existing videos and materials. The only pity is that there has been little feedback from people who have actually installed MineDojo, so some doubts remain about how well it works in practice. Its advantage is that anyone can download it free of charge, so it is worth trying as free starter material for getting into AI.

Thanks to the development of today's Internet, AI can get the knowledge it wants from video. The same is true of humans: to make an AI that can play games, sometimes watching videos is enough.

Video goes further than a textbook. Even viewers who know nothing about Python, architecture, or Monte Carlo algorithms can still enjoy the videos and absorb the knowledge without noticing.

Video creators who specialize in designing game AI are representative of this field.

The first thing to mention is the perhaps more familiar "genetic algorithm", a technique proposed in the 1960s and developed further in this century.

It resembles the theory of biological evolution. Specifically, the system generates a batch of "babies" that know nothing and lets them try all sorts of actions in the natural (program) world; by repeatedly selecting the better offspring, the AI's performance is optimized generation after generation.

Take the YouTube video "AI learns to play JUMP KING" as an example; this is roughly what it looks like.

Code Bullet (hereinafter CB) has used this algorithm to produce many videos of AI beating games. Games like "Pac-Man" and "Flappy Bird", which reward optimizing the AI's actions, can be tackled in a similar way.

The idea is clear, and it is very "simple" to do. Looking through most of CB's AI-making videos, you can see that his process mainly divides into three parts.

"It takes only three steps to make a game-playing AI."

The reason for rebuilding the game will be discussed later. Part of the essence of CB's videos lies in the "filtering" function of genetic algorithms. Unlike natural selection in nature, here we are the god in charge of selecting the AI.

The newborn AI is, of course, a baby that knows nothing. Even after it is given action commands, it has no idea where to go. The common practice is therefore to set rewards and penalties for the AI's random actions: for example, 1 point for each jump, 2 points for reaching the next level, 0.5 points for moving left or right, and a 1-point penalty for falling.
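
The scoring rule above can be sketched as a simple lookup table. The point values come from the text; treating the value for falling as a deduction is an assumption based on the "rewards and punishments" wording, and the event names are invented for this sketch.

```python
# Points per event; event names are illustrative, values follow the article.
REWARDS = {
    "jump": 1.0,        # each jump
    "next_level": 2.0,  # reaching the next level
    "move": 0.5,        # moving left or right
    "fall": -1.0,       # falling (assumed to be a penalty)
}

def score(events):
    """Total score for a sequence of events produced by one AI."""
    return sum(REWARDS[event] for event in events)

print(score(["move", "jump", "next_level"]))  # 3.5
print(score(["jump", "fall"]))                # 0.0
```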

"Going up is good, going down is bad. It's that simple."

Each generation of AI gets only five chances to act. After five actions, the AI that jumped highest becomes the template for the next generation. Each later generation then follows the previous one in search of the best path. This is a very simple form of evolution.
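
The selection loop described above can be sketched in a few lines. This is a toy under stated assumptions, not CB's actual code: the "game" is replaced by a made-up `height_reached` function, and the population size and mutation rate are arbitrary. Only the five actions per AI and the "highest jumper becomes the template" rule come from the text.

```python
import random

ACTIONS_PER_AI = 5    # from the article: five chances to act
POPULATION = 20       # assumption
MUTATION_RATE = 0.2   # assumption

def height_reached(moves):
    """Toy stand-in for the game: each move raises or lowers the player,
    and the floor is at height 0."""
    height = 0.0
    for move in moves:
        height = max(0.0, height + move)
    return height

def mutate(moves, rng):
    """Copy a move list, randomly replacing some of its moves."""
    return [rng.uniform(-1, 1) if rng.random() < MUTATION_RATE else m
            for m in moves]

def evolve(generations, seed=0):
    """Run the selection loop and return the best height of each generation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(ACTIONS_PER_AI)]
                  for _ in range(POPULATION)]
    history = []
    for _ in range(generations):
        champion = max(population, key=height_reached)
        history.append(height_reached(champion))
        # The champion survives unchanged; the rest are mutated copies of it.
        population = [champion] + [mutate(champion, rng)
                                   for _ in range(POPULATION - 1)]
    return history

history = evolve(generations=30)
print(history[0], history[-1])
```

Because the champion is always carried over unchanged, the best height can never get worse from one generation to the next.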

However, such a simple rule cannot solve problems that require some "thinking". If a level requires falling first and then jumping up, the stubborn AI will refuse to drop because of the penalty rule.

One solution is to place a collectible at the landing spot that also gives a reward, guiding the AI to the higher area by way of collecting rewards.
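
A small sketch of why the collectible helps: with only the basic scoring, any route that starts with a fall looks worse than staying put, so selection discards it. The bonus value, event names, and the two candidate routes below are hypothetical; the base point values follow the article, with falling assumed to be a deduction.

```python
BASE_REWARDS = {"jump": 1.0, "next_level": 2.0, "move": 0.5, "fall": -1.0}
COLLECTIBLE_BONUS = 2.0  # hypothetical value, chosen to outweigh the fall penalty

# Opening moves of two candidate routes (illustrative only).
ROUTES = {
    "stay_on_ledge": ["move"],
    "drop_down": ["fall", "collect"],
}

def route_score(events, with_collectible):
    """Score a route, optionally rewarding the collectible at the landing spot."""
    rewards = dict(BASE_REWARDS)
    rewards["collect"] = COLLECTIBLE_BONUS if with_collectible else 0.0
    return sum(rewards[event] for event in events)

def preferred_route(with_collectible):
    """Name of the route that selection would favor under this scoring."""
    return max(ROUTES, key=lambda name: route_score(ROUTES[name], with_collectible))

print(preferred_route(with_collectible=False))  # stay_on_ledge
print(preferred_route(with_collectible=True))   # drop_down
```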

In fact, this is very similar to the way games guide players. Once everything is set up, just let the AI run on its own; generation after generation it will naturally find the best route and eventually clear the game.

After 862 generations

It has been five years since AlphaGo retired from competition in 2017. Since then, "civilian AI" has made a splash in gaming, and YouTube has plenty of creators using AI to play "VALORANT", "Monopoly" and "Sugar Pac Man".

Although they have no corporate funding and no graduate students shedding blood and tears to label data, thanks to the openness of GitHub any netizen can easily get hold of plenty of pre-trained neural network programs.

Take River, a young YouTuber with only 7,000 followers: his first video succinctly shows how low the barrier to AI technology is.

The preparation is simple: all you need is two computers, a program downloaded from the Internet, a video capture card, and a wireless mouse signal receiver.

All that remains is to label some images so the AI can train its recognition ability, write a "small" piece of code describing its behavior patterns, have it scan the mini-map for directions, and then send keyboard signals to the gaming computer through the wireless mouse receiver.
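
The "scan the mini-map, then press a key" step can be sketched roughly as follows. This is not River's code: the mini-map is replaced by a small text grid, the `P` and `T` markers and the WASD key mapping are assumptions, and the image recognition and signal transmission are left out entirely.

```python
def find(grid, symbol):
    """Return (row, column) of the first cell containing the symbol."""
    for r, row in enumerate(grid):
        c = row.find(symbol)
        if c != -1:
            return r, c
    raise ValueError(f"{symbol!r} not found on the mini-map")

def direction_key(grid, player="P", target="T"):
    """Pick the movement key that closes the larger gap to the target."""
    player_row, player_col = find(grid, player)
    target_row, target_col = find(grid, target)
    d_row = target_row - player_row
    d_col = target_col - player_col
    if abs(d_col) >= abs(d_row):
        return "d" if d_col > 0 else "a"  # right or left
    return "s" if d_row > 0 else "w"      # down or up on the map

minimap = [
    "....T",
    ".....",
    "P....",
]
print(direction_key(minimap))  # d: the target is mostly to the right
```

In a setup like the one described, the returned key would be what the second computer sends to the gaming machine.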

Routing the signal this way is a bit of a hassle, but it has a benefit: since no extra program hooks into the game, it naturally won't be flagged as a cheat.

Everything is done by the second computer based on real-time images. Of course, as it currently performs, River's AI is closer to an ordinary in-game bot and lacks AlphaGo's seemingly magical ability to improve itself.

However, if you just want to try your hand at AI design, the barrier is not that high. Continually designing newer and stronger AIs is fun in itself, and part of that fun lies in drawing the "boundary" between right and wrong.

That is human (sure)

Just as MineDojo has to distinguish programmatic tasks from abstract ones, when we teach an AI we can also learn from its judgments how we ourselves define things and how we interpret them, which may inspire people to resolve contradictions in their own lives.

How to answer a friend who asks how your day is going, how to introduce yourself on a blind date: if every such question could be answered programmatically, it would be a sign that humanity has evolved to a higher level.

Who trained me and who did I train?

This article is from the WeChat official account Game Research Society (ID: yysaag), author: RMHO
