
AI decision-making breaks through again: Tencent's AI reaches the top of a Japanese mahjong platform

2025-04-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)11/24 Report--

On July 11, Tencent announced that its self-developed board and card game AI "Juyi LuckyJ" had reached 10-dan rank on the internationally renowned Japanese mahjong platform "Tianfeng" (Tenhou), setting a new best for AI in mahjong. "Juyi LuckyJ" demonstrates strong decision-making in imperfect-information games and further advances AI's ability to solve real-world problems.

The Japanese online competitive mahjong platform "Tianfeng" was founded in 2006. Its systematic competition rules and professional Rank system are widely recognized by the professional mahjong community. The platform currently has 238,000 active players, of whom only 27 (including AI) have reached 10-dan, roughly one in ten thousand.

Compared with other mahjong AIs and human players, "Juyi LuckyJ" not only has a higher stable Rank but also needed significantly fewer games to reach 10-dan from scratch: only 1,321 games. This reflects the world-leading technical strength of Tencent AI Lab in decision-making AI.

Statistically, the bootstrap distribution of LuckyJ's Tianfeng stable Rank is significantly stronger than those of the two previously strongest Japanese mahjong AIs, Suphx and NAGA: LuckyJ vs. Suphx, p = 0.02883; LuckyJ vs. NAGA, p = 3e-05.
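To make the idea of a bootstrap comparison concrete, here is a minimal Python sketch. It is not Tencent's procedure and does not use Tianfeng's stable Rank formula: the placement counts are invented, the per-game score (3 points for 1st place down to 0 for 4th) is an assumption, and the function name `bootstrap_p_value` is hypothetical.

```python
import random

def bootstrap_p_value(scores_a, scores_b, resamples=1000, seed=0):
    """One-sided bootstrap: share of resamples in which A's mean score is not above B's."""
    rng = random.Random(seed)
    not_better = 0
    for _ in range(resamples):
        # Resample each player's games with replacement, keeping the sample size.
        mean_a = sum(rng.choices(scores_a, k=len(scores_a))) / len(scores_a)
        mean_b = sum(rng.choices(scores_b, k=len(scores_b))) / len(scores_b)
        if mean_a <= mean_b:
            not_better += 1
    return not_better / resamples

def scores_from_placements(counts):
    """Hypothetical score: 3 for 1st place, 2 for 2nd, 1 for 3rd, 0 for 4th."""
    scores = []
    for place, count in enumerate(counts, start=1):
        scores.extend([4 - place] * count)
    return scores

# Invented placement counts (1st, 2nd, 3rd, 4th) over 1000 games each.
strong = scores_from_placements([300, 280, 250, 170])
average = scores_from_placements([250, 250, 250, 250])

p_value = bootstrap_p_value(strong, average)
print(f"estimated p value: {p_value}")
```

A small value means the stronger record rarely looks worse than the other one under resampling, which is the sense in which a p value such as those quoted above can be read.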

A Tencent AI Lab researcher said that the real world is full of scenarios in which decisions must be made with imperfect information, such as financial trading, autonomous driving, transportation and logistics, and auction systems. The goal of advancing decision-making AI in game environments is for AI to move from the virtual world to the real one and solve complex real-world problems.

Over the past half century, games have played an important role in the evolution of artificial intelligence. The varied situations that arise in games provide convenient research settings for training AI. From chess to Go, and on to games such as Texas hold'em and Arena of Valor, AI has continued to expand the boundary of its capabilities in games.

Go and chess are perfect-information games: both sides can see the full state of the game every time they make a decision. AI can use its computing power to search through the possibilities and find a winning strategy. In mahjong, players cannot see their opponents' hands, and a large number of tiles remain undrawn, so there is a great deal of hidden information. This makes it a typical imperfect-information game.

Mahjong uses 136 tiles in total, and each player can see only a small share of them: the 13 tiles in their own hand and the tiles that have been discarded. At the start of a game, the other three players' hands and the wall tiles are all invisible. Faced with so much hidden information, mahjong players need to strike a balance between offense and defense.
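As a rough illustration of these numbers, the following Python sketch counts the tiles a player cannot see at the start of a hand and the number of ways those tiles could be dealt to three opponents. It treats every physical tile as distinct, which is an assumption: identical tiles make the number of distinguishable situations smaller. The function names are hypothetical and not taken from LuckyJ.

```python
from math import comb

TOTAL_TILES = 136   # total tiles, as stated in the article
HAND_SIZE = 13      # tiles in each player's starting hand
OPPONENTS = 3

def unseen_tiles(visible_discards=0):
    """Tiles the player cannot see: everything except their own hand and the discards."""
    return TOTAL_TILES - HAND_SIZE - visible_discards

def opponent_deals(unseen):
    """Ways to deal 13 tiles to each opponent in turn, treating every tile as distinct."""
    ways = 1
    remaining = unseen
    for _ in range(OPPONENTS):
        ways *= comb(remaining, HAND_SIZE)
        remaining -= HAND_SIZE
    return ways

start_unseen = unseen_tiles()
start_deals = opponent_deals(start_unseen)
print(start_unseen)             # 123 tiles are hidden before any discard
print(f"{start_deals:.2e}")     # order of magnitude of possible opponent deals
```

Even this simplified count is astronomically large, which gives a sense of why exhaustive enumeration is not an option in mahjong.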

Beyond drawing and discarding tiles, a mahjong player also has to decide whether to call chi (claim a sequence), pon (claim a triplet), or kan (declare a quad), whether to declare riichi, and whether to declare a win. Any player's call changes the order in which tiles are drawn, and this process also involves many decisions.

In the figure above, the horizontal axis (number of information sets) represents the number of observable states, that is, the visible tile information. The vertical axis (average information set size) indicates the amount of hidden information, that is, the possible hands the opponents may hold. Mahjong contains far more hidden information than Texas hold'em.

To better handle the large amount of hidden information in mahjong and improve the AI's decision-making, Tencent AI Lab built on reinforcement learning and regret-minimization self-play. The AI learns and improves from scratch and eventually converges to the strongest mixed strategy, which gives it a more balanced strategy in actual play.
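The article does not describe LuckyJ's training algorithm in detail. As a toy illustration of regret-minimizing self-play converging to a mixed strategy, the sketch below runs regret matching on rock-paper-scissors, a far simpler game than mahjong. The biased starting regrets and all function names are assumptions chosen for the example.

```python
ACTIONS = ("rock", "paper", "scissors")
# PAYOFF[a][b]: payoff to the player choosing action a against action b.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]

def strategy_from_regrets(regrets):
    """Regret matching: play each action in proportion to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total > 0:
        return [p / total for p in positive]
    return [1.0 / len(regrets)] * len(regrets)

def self_play(iterations, initial_regrets=(1.0, 0.0, 0.0)):
    """Two regret-matching players train against each other; returns average strategies."""
    n = len(ACTIONS)
    regrets = [list(initial_regrets), list(initial_regrets)]
    strategy_sums = [[0.0] * n, [0.0] * n]
    for _ in range(iterations):
        strategies = [strategy_from_regrets(r) for r in regrets]
        for player in (0, 1):
            opponent = strategies[1 - player]
            # Expected payoff of each pure action against the opponent's current mix.
            action_values = [
                sum(PAYOFF[a][b] * opponent[b] for b in range(n)) for a in range(n)
            ]
            expected = sum(strategies[player][a] * action_values[a] for a in range(n))
            for a in range(n):
                # Regret: how much better action a would have done than the current mix.
                regrets[player][a] += action_values[a] - expected
                strategy_sums[player][a] += strategies[player][a]
    return [[s / iterations for s in sums] for sums in strategy_sums]

average_strategies = self_play(20000)
for player, strategy in enumerate(average_strategies):
    print(player, [round(p, 3) for p in strategy])
```

Both players start out biased toward rock, yet their average strategies drift toward playing each action about one third of the time, which is the equilibrium mixed strategy of this game.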

At the same time, because traditional imperfect-information search algorithms are of limited use in mahjong, Tencent AI Lab proposed an efficient imperfect-information search method based on the idea of optimistic value estimation. With it, the AI can still adjust its current strategy in real time in game states with massive hidden information and cope better with rapidly changing situations.

Compared with human players, "Juyi LuckyJ" plays mahjong with a more balanced strategy and evaluates the situation more precisely, including the expected return of each tile and which types of winning hand may become possible later. This kind of "strategy" training also lays the foundation for AI to enter more industries.

Shingo Tsunoda, CEO of C-EGG, the company that develops the Tianfeng platform, said: "This is another breakthrough for mahjong AI, and LuckyJ further broadens the capability boundaries of mahjong AI. It is exciting that LuckyJ ranks first in stable Rank among all players with more than 1,000 games in the special rooms, human players included."

Yousei, a Japanese mahjong strategy researcher who has studied LuckyJ's game records online in depth, commented that LuckyJ gives the impression of seeing "attack and defense parameters" on every tile. Overall, LuckyJ seems to have "no loopholes at all." On the one hand, it reduces its deal-in rate through strategies such as holding on to safe tiles. On the other hand, even when its hand could develop toward several different winning hands, LuckyJ moves smoothly through these complex branches.

It is worth mentioning that "Juyi LuckyJ" has also performed outstandingly in national standard mahjong. It beat six professional players in an offline professional invitational tournament, becoming the first mahjong AI to beat top professionals in national standard mahjong.

Note: match data show that over its last 2,000 games, Juyi LuckyJ's average win reached 1.76 fan. The fan is the settlement unit of national standard mahjong; the higher the number, the more a player wins.

Players who have faced Juyi LuckyJ also spoke highly of it. Cheng Haihua, winner of the 2014 World Mahjong Masters Invitational and of the annual finals of the Tencent Mahjong Championship (2018, 2019), said that the AI performed very well in both attack and defense, fully reflecting its computational advantages, and that it left a deep impression on him.

Yang Lei, a national standard mahjong professional player and president of the mahjong sports association, feels the same way: "After months of testing against Tencent's mahjong AI and analyzing its games, I am impressed both offensively and defensively. What we usually call mastery, brilliance, and even choices based on experience and feeling may be routine operations for AI."

Huang Lin, a professional player of both national standard mahjong and Japanese mahjong, said that over thousands of games against the AI he has been amazed by its strong tile efficiency and accurate reading of hands, describing it as "extreme in both attack and defense."

Decision-making and generation are the two main lines of artificial intelligence development, and both are a necessary path toward general artificial intelligence. In virtual games that simulate the real world, AI learns to analyze, decide, and act quickly, so that it can take on more difficult and complex tasks and play a greater role. Since 2017, the two decision-making AIs developed by Tencent AI Lab have used board games, card games, MOBA, and other game scenarios to explore how AI can solve complex real-world problems.

Real life is full of hidden information and uncertainty, and the complex decision-making and randomness of mahjong are closer to real life than perfect-information games such as Go. The breakthrough of "Juyi LuckyJ" at the professional level reflects the continuous evolution of Tencent AI Lab's deep reinforcement learning agents, which are gradually moving toward solving more complex and diverse problems. Research on imperfect-information games will help develop more "intelligent" AI systems suited to real-life scenarios.
