

Microsoft launches "Learning from Mistakes" model training method, which it claims "imitates the human learning process and improves the reasoning ability of AI".

2025-04-04 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)11/24 Report--

CTOnews.com, Nov. 7 -- Microsoft Research Asia, together with Peking University, Xi'an Jiaotong University, and other universities, has proposed an AI training method called "Learning from Mistakes" (LeMa), which is claimed to improve AI reasoning ability by imitating the way humans learn from their errors.

Today, large language models such as OpenAI's GPT-4 and Google's PaLM-2 perform well on natural language processing (NLP) tasks and on chain-of-thought (CoT) reasoning for mathematical puzzle tasks.

However, large open-source models such as LLaMA-2 and Baichuan-2 still lag behind on such problems. To improve the chain-of-thought reasoning ability of these open-source large language models, the research team proposed the LeMa method, which imitates the human learning process and improves the model's reasoning by "learning from mistakes".

CTOnews.com found that the researchers' approach is to fine-tune the relevant models on data pairs consisting of a "wrong answer" and a "corrected right answer". To obtain this data, the researchers collected wrong answers and reasoning processes from five different large language models (including the LLaMA and GPT series), then used GPT-4 as a "corrector" to provide the corrected answers.
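The data-construction pipeline described above can be sketched in Python. This is an illustrative reconstruction, not the released LeMa code: all function names are hypothetical, and a toy stand-in replaces the GPT-4 corrector.

```python
# Hypothetical sketch of LeMa-style data construction: keep the samples a
# "student" model got wrong, then pair each with a revision produced by a
# stronger "corrector" model (GPT-4 in the paper).

def collect_wrong_answers(student_outputs, gold_answers):
    """Keep only samples whose final answer disagrees with the gold answer."""
    wrong = []
    for sample, gold in zip(student_outputs, gold_answers):
        if sample["final_answer"] != gold:
            wrong.append(sample)
    return wrong

def build_correction_pairs(wrong_samples, corrector):
    """Attach a corrector-produced revision to each wrong sample."""
    pairs = []
    for sample in wrong_samples:
        correction = corrector(sample["question"], sample["reasoning"])
        pairs.append({"wrong": sample, "correction": correction})
    return pairs

# Toy stand-in for a GPT-4-style corrector (the real one is an API call).
def toy_corrector(question, reasoning):
    return {"error_step": "...", "reason": "...", "revised_solution": "..."}

student_outputs = [
    {"question": "2+3?", "reasoning": "2+3=6", "final_answer": "6"},
    {"question": "4*2?", "reasoning": "4*2=8", "final_answer": "8"},
]
gold_answers = ["5", "8"]

wrong = collect_wrong_answers(student_outputs, gold_answers)
pairs = build_correction_pairs(wrong, toy_corrector)
print(len(pairs))  # only the incorrect sample is kept
```

Filtering to wrong answers first keeps the fine-tuning set focused on genuine mistakes rather than re-teaching what the student model already gets right.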

According to the report, each corrected answer contains three types of information: the erroneous steps in the original reasoning process, the reason those steps are wrong, and how to modify the original approach to arrive at the correct answer.
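One way those three pieces of information could be flattened into a supervised fine-tuning example is sketched below. The prompt/target template is an assumption for illustration, not the paper's exact format.

```python
# Illustrative formatting of one LeMa fine-tuning example: the prompt shows
# the question plus the wrong reasoning, and the target carries the three
# correction fields the article describes (error step, reason, revision).

def format_training_example(question, wrong_reasoning, correction):
    prompt = (
        f"Question: {question}\n"
        f"Incorrect solution: {wrong_reasoning}\n"
        "Identify the mistake and give a corrected solution."
    )
    target = (
        f"Error step: {correction['error_step']}\n"
        f"Reason: {correction['reason']}\n"
        f"Corrected solution: {correction['revised_solution']}"
    )
    return {"prompt": prompt, "target": target}

example = format_training_example(
    "2+3?",
    "2+3=6",
    {"error_step": "2+3=6",
     "reason": "addition error",
     "revised_solution": "2+3=5"},
)
print(example["target"])
```

Training the model to emit all three fields, rather than just the final answer, is what distinguishes this setup from ordinary answer-only fine-tuning.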

The researchers used the GSM8K and MATH benchmarks to test the effect of the LeMa training method on five open-source large models. Taking LLaMA-2-70B as an example, accuracy on GSM8K rose from 81.4% to 83.5% with LeMa, and accuracy on MATH rose from 23.6% to 25.0%.
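Benchmarks like GSM8K and MATH are typically scored by exact match on the final answer. A minimal sketch of that metric, under the assumption that answers are compared as normalized strings:

```python
# Minimal exact-match accuracy, the usual metric for GSM8K/MATH-style
# benchmarks: a prediction counts as correct only if its final answer
# equals the reference answer.

def exact_match_accuracy(predictions, references):
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)

preds = ["5", "8", "12", "7"]
refs = ["5", "8", "10", "7"]
print(exact_match_accuracy(preds, refs))  # 0.75
```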

The researchers have published LeMa-related materials on GitHub.



© 2024 shulou.com SLNews company. All rights reserved.
