2025-04-09 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)11/24 Report--
Meta's LLaMA model is out in the open, and large language models have had their Stable Diffusion moment. No one expected that an "epic" leak of LLaMA would produce a series of stunning ChatGPT "substitutes".
Who would have thought that an accidental LLaMA leak would ignite the biggest spark of innovation in the open-source LLM field?
A series of outstanding open-source ChatGPT alternatives, the "alpaca family", then made a dazzling debut.
The friction between open-source and API-based distribution is one of the most pressing tensions in the generative AI ecosystem.
In the text-to-image domain, the release of Stable Diffusion made it clear that open source is a viable distribution mechanism for foundation models.
However, this has not been the case for large language models, where the biggest breakthroughs, such as GPT-4, Claude, and Cohere's models, are available only through an API.
Open-source alternatives to these models have not shown the same level of performance, especially in their ability to follow human instructions. However, an unexpected leak completely changed the situation.
LLaMA's "epic" leak

A few weeks ago, Meta AI launched the large language model LLaMA.
LLaMA comes in several sizes, with 7B, 13B, 33B, and 65B parameters; although smaller than GPT-3, it is comparable to GPT-3 on many tasks.
LLaMA was not open source at first, but a week after its release, the model was suddenly leaked on 4chan, triggering thousands of downloads.
This event can be called an "epic leak", because it has become an inexhaustible source of innovation in the field of large language models.
In just a few weeks, innovation in LLM projects built on it has exploded.
Alpaca, Vicuna, Koala, ChatLLaMA, FreedomGPT, ColossalChat... Let's review how the big bang of the "alpaca family" came about.
Alpaca

In mid-March, Alpaca, a large model released by Stanford, went viral.
Alpaca is fine-tuned from Meta's LLaMA 7B on just 52K instruction-following examples, and its performance is roughly on par with GPT-3.5.
Crucially, the training cost is extremely low: less than $600.
Stanford researchers compared GPT-3.5 (text-davinci-003) with Alpaca 7B and found the two models performed very similarly: in head-to-head comparisons, Alpaca won 90 times to GPT-3.5's 89.
For the Stanford team, training a high-quality instruction-following model within budget meant facing two important challenges: obtaining a strong pre-trained language model, and obtaining high-quality instruction-following data.
The LLaMA model, made available to academic researchers, solved the first problem.
For the second challenge, the paper "Self-Instruct: Aligning Language Models with Self-Generated Instructions" offered a good answer: use an existing strong language model to automatically generate instruction data.
The LLaMA model's biggest weakness is its lack of instruction tuning; one of OpenAI's biggest innovations was applying instruction tuning to GPT-3.
Accordingly, Stanford used an existing large language model to automatically generate instruction-following demonstrations.
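The generated (instruction, input, output) triples are rendered into supervised fine-tuning text with a fixed prompt template. A minimal Python sketch of that formatting step; the template wording follows the published Stanford Alpaca code, while `format_example` is an illustrative helper name:

```python
# Sketch of Alpaca-style prompt formatting: turn a self-instruct example
# (instruction, optional input, output) into a training prompt string.
# Template text as published in the Stanford Alpaca repository.

PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:\n"
)

PROMPT_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

def format_example(example: dict) -> str:
    """Render one example into a training prompt (response is appended
    after the '### Response:' marker during actual training)."""
    if example.get("input"):
        return PROMPT_WITH_INPUT.format(**example)
    return PROMPT_NO_INPUT.format(instruction=example["instruction"])

example = {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
prompt = format_example(example)
```

At training time, the gold `output` is concatenated after the prompt and the loss is computed only on the response tokens.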
Now, netizens regard Alpaca as "the Stable Diffusion of text models".
Vicuna

At the end of March, researchers from UC Berkeley, Carnegie Mellon, Stanford, and UC San Diego open-sourced Vicuna, a fine-tuned version of LLaMA whose quality, as judged by GPT-4, approaches that of ChatGPT.
Vicuna, with 13 billion parameters, comes from fine-tuning LLaMA on user-shared conversations collected from ShareGPT; training cost about $300.
The results show that Vicuna-13B achieves quality comparable to ChatGPT and Bard in more than 90% of cases.
The Vicuna-13B training process, in detail:
First, the researchers collected about 70K conversations from ShareGPT, a site where users share their ChatGPT conversations.
Next, they improved the training scripts provided by Alpaca so the model could better handle multi-turn conversations and long sequences, then trained for one day on 8 A100 GPUs using PyTorch FSDP.
For quality evaluation, the researchers created 80 diverse questions and used GPT-4 to judge the model outputs.
To compare different models, they combined each model's outputs into a single prompt and asked GPT-4 to evaluate which model gave the better answer.
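The GPT-4-as-judge step can be sketched as follows. The judge-prompt wording and the two-number scoring format here are illustrative assumptions, not the Vicuna authors' exact prompt:

```python
# Simplified sketch of pairwise LLM-as-judge evaluation, Vicuna-style:
# pack a question plus two candidate answers into one judge prompt, then
# parse the judge's numeric scores. Prompt wording is illustrative.
import re

def build_judge_prompt(question: str, answer_a: str, answer_b: str) -> str:
    return (
        "You are a helpful and precise assistant for checking the quality "
        "of two answers.\n"
        f"[Question]\n{question}\n\n"
        f"[Assistant A]\n{answer_a}\n\n"
        f"[Assistant B]\n{answer_b}\n\n"
        "Rate each assistant on a scale of 1 to 10. Reply with two numbers "
        "separated by a space, e.g. '8 6'."
    )

def parse_scores(judge_reply: str) -> tuple[float, float]:
    """Extract the two numeric scores from the judge's reply."""
    nums = re.findall(r"\d+(?:\.\d+)?", judge_reply)
    return float(nums[0]), float(nums[1])

# In practice the prompt would be sent to GPT-4 via an API call;
# here we only build the prompt and parse a sample reply.
prompt = build_judge_prompt("What is FSDP?", "A sharding strategy.", "No idea.")
scores = parse_scores("8 3")
```

Tallying these per-question scores across the 80 evaluation questions yields the head-to-head win rates reported above.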
Comparison of LLaMA, Alpaca, Vicuna, and ChatGPT

Koala

Recently, the Berkeley AI Research lab (BAIR) released a new model, Koala. Unlike earlier models that used data from OpenAI's GPT models for instruction fine-tuning, Koala is trained on high-quality data obtained from the web.
The results show that Koala can effectively answer a wide variety of user queries; its answers are often preferred over Alpaca's, and in at least half of the cases they are comparable to ChatGPT's.
The researchers hope these results will further the discussion about the relative performance of large closed-source models versus smaller public models, especially models small enough to run locally: if training data is carefully collected, they too can approach the performance of large models.
In fact, the experimental results of Stanford's earlier Alpaca model, which fine-tuned LLaMA on data from OpenAI's GPT models, had already shown that the right data can significantly improve smaller open-source models.
This is also why the Berkeley researchers developed and released Koala: to provide another experimental data point for this discussion.
Koala is fine-tuned on freely available interaction data from the web, with special attention to data from interactions with high-performance closed-source models such as ChatGPT.
Rather than crawling as much web data as possible to maximize quantity, the researchers focused on collecting a small, high-quality dataset, including ChatGPT distillation data, open-source data, and so on.
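The curation strategy described above can be sketched roughly as follows. The field names, source labels, and thresholds are hypothetical illustrations, not Koala's actual pipeline:

```python
# Hypothetical sketch of Koala-style data curation: keep a small set of
# high-quality dialogues, preferring those distilled from a strong closed
# model, and drop duplicates and trivial replies. Field names are illustrative.

def curate(dialogues: list[dict], max_examples: int = 60000) -> list[dict]:
    seen = set()
    kept = []
    # Stable sort: ChatGPT-distilled dialogues first, other sources after.
    ranked = sorted(dialogues, key=lambda d: d["source"] != "chatgpt_distilled")
    for d in ranked:
        key = d["prompt"].strip().lower()
        if key in seen or len(d["response"]) < 10:  # dedupe / drop trivial replies
            continue
        seen.add(key)
        kept.append(d)
        if len(kept) >= max_examples:
            break
    return kept

data = [
    {"source": "chatgpt_distilled", "prompt": "What is RLHF?",
     "response": "RLHF is reinforcement learning from human feedback."},
    {"source": "open_web", "prompt": "what is rlhf?",
     "response": "Duplicate question with different casing."},
]
curated = curate(data)
```

The point of the sketch is the priority ordering: when a prompt appears in both a distilled and a scraped dialogue, the distilled version survives deduplication.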
ChatLLaMA

Nebuly has open-sourced ChatLLaMA, a framework that lets us create conversational assistants using our own data.
ChatLLaMA lets us build hyper-personalized ChatGPT-like assistants with our own data and as little compute as possible.
Imagine a future in which, instead of relying on one large assistant that "rules them all", everyone can create their own personalized version of a ChatGPT-like assistant, supporting all kinds of human needs.
However, creating such personalized assistants requires effort on many fronts: dataset creation, efficient training with RLHF, and inference optimization.
The purpose of the library is to give developers peace of mind by abstracting away the work required to optimize compute and collect large amounts of data.
ChatLLaMA is designed to help developers with a variety of use cases, all related to RLHF training and optimized inference. Some example use cases:
Create a ChatGPT-like personalized assistant for vertical tasks (legal, medical, gaming, academic research, etc.)
Train an efficient ChatGPT-like assistant on local hardware infrastructure using limited data
Create your own personalized version of a ChatGPT-like assistant while avoiding runaway costs
Find out which model architecture (LLaMA, OPT, GPT-J, etc.) best fits your hardware, compute budget, and performance requirements
Align the assistant with your personal or corporate values, culture, brand, and manifesto
FreedomGPT

Built with Electron and React, FreedomGPT is a desktop application that lets users run LLaMA on their local machines.
FreedomGPT's defining characteristic is evident in its name: the questions it answers are not subject to any censorship or safety filtering.
The program was developed by Age of AI, an AI venture capital firm.
FreedomGPT is built on top of Alpaca, chosen because Alpaca is relatively easier to access and customize than other models.
ChatGPT follows OpenAI's usage policy, which restricts content involving hate, self-harm, threats, violence, and sexual material.
Unlike ChatGPT, FreedomGPT answers questions without bias or favoritism and does not hesitate to address controversial or sensitive topics.
FreedomGPT even answered "how to make a bomb at home", a behavior OpenAI specifically removed from GPT-4.
FreedomGPT is unique in that it bypasses censorship restrictions and takes on controversial topics without any guardrails. Its symbol is the Statue of Liberty, because this distinctive and bold language model symbolizes freedom.
FreedomGPT can even run locally on a computer without an Internet connection.
In addition, an open-source version will be released soon, allowing users and organizations to customize it fully.
ColossalChat

ColossalChat, built by the Colossal-AI team, needs fewer than 10 billion parameters to achieve Chinese-English bilingual ability comparable to ChatGPT and GPT-3.5.
In addition, ColossalChat, which is based on the LLaMA model, replicates the complete RLHF pipeline, making it the open-source project closest to ChatGPT's original technical route.
Bilingual Chinese-English training dataset

ColossalChat released a bilingual dataset containing about 100,000 Chinese and English question-answer pairs.
The dataset was collected and cleaned from real question scenarios on social media platforms and serves as a seed dataset, which was expanded using self-instruct at a cost of about $900.
Compared with datasets generated by other self-instruct methods, this dataset contains more realistic and diverse seed data and covers a wider range of topics.
The dataset is suitable for both fine-tuning and RLHF training. Given this high-quality data, ColossalChat achieves better conversational interaction and also supports Chinese.
Complete RLHF pipeline

The RLHF algorithm reproduction consists of three stages:
In RLHF-Stage1, supervised instruction fine-tuning is performed using the bilingual dataset described above.
In RLHF-Stage2, a reward model is trained: different outputs for the same prompt are manually ranked, and the reward model is then trained under supervision to assign corresponding scores.
In RLHF-Stage3, a reinforcement learning algorithm is applied; this is the most complex part of the training process.
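The Stage-2 ranking objective can be illustrated numerically. This is the generic pairwise reward-model loss used across the RLHF literature, shown here as a sketch rather than ColossalChat's exact implementation:

```python
# Numeric sketch of the RLHF Stage-2 reward-model objective: for a pair of
# responses to the same prompt where humans ranked one above the other,
# maximize sigmoid(r_chosen - r_rejected), i.e. minimize the loss below.
import math

def reward_pair_loss(r_chosen: float, r_rejected: float) -> float:
    """-log sigmoid(r_chosen - r_rejected): near zero when the reward model
    already scores the human-preferred response higher, large otherwise."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

well_ordered = reward_pair_loss(2.0, -1.0)   # chosen scored higher -> small loss
mis_ordered = reward_pair_loss(-1.0, 2.0)    # chosen scored lower  -> large loss
```

Minimizing this loss over many ranked pairs yields the scalar reward signal that Stage-3 then optimizes with reinforcement learning.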
It is believed that more projects will be released soon.
No one would have thought that this accidental leak of LLaMA would ignite such a spark of innovation in the open-source LLM field.
Reference:
https://thesequence.substack.com/p/the-llama-effect-how-an-accidental
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).