
ChatGPT's real parameter count is only 20 billion, revealed by Microsoft for the first time! Netizens: no wonder OpenAI is so nervous about open source.

2025-03-26 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)11/24 Report--

Thanks to CTOnews.com netizens Alejandro86 and Huake high achiever for the tip! Suddenly, the whole large-model community is talking about the same thing: a seemingly unremarkable statistics table in a Microsoft paper has let slip a "secret". Does ChatGPT, the model that set off a global storm, really have only 20 billion parameters?

As soon as the paper was published, it attracted a lot of attention at home and abroad.

Many netizens still couldn't believe it: are you sure it's not a typo?

One netizen said: no wonder OpenAI is so nervous about open source. Or perhaps it is in preparation for OpenAI to open source.

Coincidentally, just a few days ago, some netizens found what appears to be a new GPT-4 model in GitHub Copilot's API, copilot-gpt-4-2, with its knowledge updated to March 2023.

What does the paper say? Leak aside, the paper itself is worth a look: it is the first in the industry to apply a diffusion model to code generation.

The research team imagined a scenario like this:

If a developer can only modify the last line of code, how many attempts would it take to write a function from scratch?

Autoregressive models that generate code from natural language have a similar limitation: they cannot easily reconsider previously generated tokens.

Microsoft researchers proposed CODEFUSION, which is built on an encoder-decoder architecture and mainly comprises an encoder, a decoder, a denoiser, and a classification head. The encoder maps the natural-language input into a continuous representation, which is then fed as an additional condition to the diffusion model, where Gaussian noise is iteratively denoised.
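The pipeline described above (encode the prompt, start from Gaussian noise, denoise iteratively under the prompt's condition) can be sketched with a toy numpy stand-in. Everything here is an illustrative assumption, not the paper's actual model: the dimensions, the hash-based "encoder", and the linear interpolation "denoiser" are all made up to show the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

D = 8       # embedding dimension (assumption)
L = 4       # target length in tokens (assumption)
STEPS = 10  # number of denoising iterations (assumption)

def encode(prompt: str) -> np.ndarray:
    """Toy 'encoder': hash characters into a fixed-size unit vector."""
    v = np.zeros(D)
    for i, ch in enumerate(prompt):
        v[i % D] += ord(ch)
    return v / (np.linalg.norm(v) + 1e-9)

def denoise(x: np.ndarray, cond: np.ndarray) -> np.ndarray:
    """Toy 'denoiser': one step pulling x toward the condition."""
    return 0.8 * x + 0.2 * cond  # cond broadcasts over the L positions

def generate(prompt: str) -> np.ndarray:
    """Full toy pipeline: condition on the prompt, denoise pure noise."""
    cond = encode(prompt)
    x = rng.standard_normal((L, D))  # start from Gaussian noise
    for _ in range(STEPS):
        x = denoise(x, cond)
    return x  # continuous representation of L code tokens

out = generate("sort a list in python")
print(out.shape)  # (4, 8)
```

After enough steps the noise contracts toward the condition (each step shrinks the residual by the factor 0.8), which is the intuition behind conditioning the diffusion process on the encoded prompt.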

To generate syntactically correct code, the denoised result is fed into the decoder to obtain the code tokens, and CODEFUSION is additionally pre-trained on a continuous paragraph denoising (CPD) task over code.
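The last step, turning denoised continuous vectors back into discrete code tokens, can be illustrated with a deterministic toy classification head: each position picks the vocabulary token whose embedding scores highest against the denoised vector. The vocabulary and the one-hot "embeddings" are assumptions made for the example, not the paper's actual components.

```python
import numpy as np

VOCAB = ["def", "return", "(", ")", ":", "x", "+", "1"]  # toy vocabulary (assumption)
D = len(VOCAB)

# Toy one-hot token embeddings, chosen so the example is deterministic.
token_emb = np.eye(D)

def classify(denoised: np.ndarray) -> list:
    """Toy classification head: per position, pick the vocab token whose
    embedding has the highest dot product with the denoised vector."""
    logits = denoised @ token_emb.T  # shape (L, |VOCAB|)
    return [VOCAB[i] for i in logits.argmax(axis=1)]

# A token's own (slightly perturbed) embedding maps back to that token.
noisy = token_emb[[0, 5, 6, 7]] + 0.1
print(classify(noisy))  # ['def', 'x', '+', '1']
```

The point of the sketch: because decoding goes through a classification over a code vocabulary rather than free-form text, the output is always made of legal code tokens, which is part of how the model favors syntactically valid code.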

CODEFUSION is evaluated on three language tasks: Python, Bash, and Excel conditional formatting (CF) rules.

The results show that CODEFUSION, with only 75 million parameters, performs close to the 20-billion-parameter GPT-3.5-turbo while generating more diverse code.

Compared with plain-text diffusion models, CODEFUSION generates more syntactically correct code; compared with autoregressive models, it generates more diverse candidate code.

Compared with the most advanced autoregressive systems (350M to 175B parameters), it achieves comparable top-1 accuracy, and it surpasses them on top-3 and top-5 accuracy thanks to a better balance between diversity and quality.
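The top-1/top-3/top-5 numbers in this comparison are standard top-k accuracies over ranked candidate programs: a prompt counts as solved if any of the first k candidates matches the reference. A minimal sketch, with entirely made-up candidate data:

```python
def top_k_accuracy(candidates_per_prompt, references, k):
    """Fraction of prompts where any of the first k ranked candidates
    matches the reference solution."""
    hits = sum(
        any(c == ref for c in cands[:k])
        for cands, ref in zip(candidates_per_prompt, references)
    )
    return hits / len(references)

# Hypothetical outputs: 3 prompts, ranked candidates for each.
cands = [
    ["x+1", "x+2", "x+3"],       # correct at rank 1
    ["y*2", "y+y", "2*y"],       # correct at rank 2
    ["a-b", "b-a", "abs(a-b)"],  # correct at rank 3
]
refs = ["x+1", "y+y", "abs(a-b)"]

print(round(top_k_accuracy(cands, refs, 1), 3))  # 0.333
print(top_k_accuracy(cands, refs, 3))            # 1.0
```

This makes the trade-off in the result concrete: a model whose candidates are diverse can lose a little at k=1 yet win at k=3 and k=5, because varied guesses cover more of the answer space.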

In the end, what was meant as an ordinary performance comparison caused an uproar.

Some people have even floated a conspiracy theory: perhaps this is a deliberate "prelude" to OpenAI open-sourcing a model.

After all, many large models have been catching up, and as early as May this year, Reuters reported that OpenAI was preparing to open-source a new large language model.

It is worth mentioning that as early as February this year, a Forbes article had already claimed that ChatGPT has only 20 billion parameters.

At the time, the title was "Is Bigger Better? Why The ChatGPT Vs. GPT-3 Vs. GPT-4 Battle Is Just A Family Chat".

It's just that few people paid attention at the time.

Reference link:

[1] https://twitter.com/felix_red_panda/status/1718916631512949248

[2] https://x.com/teortaxesTex/status/1718972447024623898?s=20

[3] https://www.reddit.com/r/singularity/comments/17jrepb/microsoft_paper_claims_chatgpt_35_has_20_billion/

[4] https://www.zhihu.com/question/628395521

[5] https://www.reddit.com/r/ChatGPT/comments/17ht56t/new_leaks_about_upcoming_developments_with_openai/?share_id=txV27HR0zw0TjV8dLXf4l

[6] https://www.forbes.com/sites/forbestechcouncil/2023/02/17/is-bigger-better-why-the-chatgpt-vs-gpt-3-vs-gpt-4-battle-is-just-a-family-chat/amp/

