
Zero-threshold ChatGPT clone: 30 minutes of training, and a 6-billion-parameter model performs comparably to GPT-3.5

2025-04-05 Update From: SLTechnology News&Howtos


Shulou (Shulou.com) 11/24 report

Countering "CloseAI", a ChatGPT "cloned sheep" has arrived! With the barrier to entry at zero, anyone can "roll their own", and large language models are no longer the exclusive trump card of a handful of big companies.

Before this, OpenAI's refusal to be open had already stirred considerable controversy in the community.

Releasing only benchmarks and test results, without the training data, costs, or methods, looked very much like a winner-take-all play.

Just as it seemed large language models were about to be monopolized by the giants, a startup suddenly appeared and fired a shot across OpenAI's bow: with the 6-billion-parameter "Dolly", it achieved capabilities similar to ChatGPT's.

Yes: prepare some high-quality training data, take more or less any large open-source language model, and after 30 minutes of training you get a ChatGPT "substitute"!
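Mechanically, that training step is ordinary supervised fine-tuning on instruction/response pairs. A minimal sketch of the data-preparation side, assuming already-tokenized ids and the common convention (not something this article specifies) of masking prompt positions with -100 so the loss is computed only on the response tokens:

```python
def build_labels(prompt_ids, response_ids, ignore_index=-100):
    """Concatenate prompt and response token ids into one training
    sequence, masking the prompt positions so cross-entropy loss is
    computed only on the response tokens. The -100 value is what
    PyTorch's CrossEntropyLoss ignores by default."""
    input_ids = list(prompt_ids) + list(response_ids)
    labels = [ignore_index] * len(prompt_ids) + list(response_ids)
    return input_ids, labels

# Toy example with made-up token ids:
inp, lab = build_labels([10, 11, 12], [20, 21])
# inp == [10, 11, 12, 20, 21]; lab == [-100, -100, -100, 20, 21]
```

With pairs prepared this way, a short run of a standard causal-LM training loop is all the "30 minutes of training" amounts to.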

Databricks proudly declared that the release of Dolly is the first shot on the road to democratizing AI technology.

6 billion parameters rival ChatGPT, 30 minutes of training

Because ChatGPT consumes enormous amounts of data and computing resources (it was trained on a vast corpus of text and burned through large amounts of GPU time), large language models of this kind seemed destined to remain in the hands of a small number of giants.

In contrast to "CloseAI", in March Meta released LLaMA, a set of high-quality (though not instruction-following) language models, to academia; each took more than 80,000 GPU-hours to train.

Stanford University then built Alpaca on top of LLaMA, with one difference: it was fine-tuned on a small dataset of 50,000 question-answer pairs. Surprisingly, this gave Alpaca ChatGPT-like interactivity.
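Alpaca's recipe renders each question-answer pair into a single training string through a fixed prompt template. A sketch of that step; the template wording below is paraphrased in the style of Alpaca's published format, so treat the exact text as an assumption:

```python
# Instruction-tuning prompt template in the style of Stanford Alpaca
# (the exact wording Alpaca uses is an assumption here).
TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n{response}"
)

def format_example(instruction: str, response: str) -> str:
    """Render one question-answer pair as a single fine-tuning sample."""
    return TEMPLATE.format(instruction=instruction, response=response)

sample = format_example(
    "Explain the difference between fission and fusion.",
    "Fission splits heavy nuclei; fusion joins light nuclei.",
)
```

Fifty thousand such strings are a tiny corpus by pre-training standards, which is why the fine-tuning run is so cheap.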

Dolly, in turn, was inspired by Alpaca.

More interestingly, instead of reaching for the latest model, the 6-billion-parameter Dolly is built on GPT-J, an open-source model released back in 2021.
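GPT-J's weights are publicly available on the Hugging Face Hub, so standing up a baseline takes only a few lines. A hedged sketch: the model id `EleutherAI/gpt-j-6B` and the prompt wrapper are assumptions, and calling `generate_reply` downloads roughly 24 GB of weights on first use:

```python
def make_prompt(instruction: str) -> str:
    """Wrap an instruction in a minimal prompt template (the exact
    template Dolly uses is an assumption here)."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

def generate_reply(instruction: str, model_id: str = "EleutherAI/gpt-j-6B") -> str:
    """Load GPT-J and sample a response to one instruction.
    The import is kept local because loading the checkpoint is heavy
    (~24 GB of weights on first download)."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tok = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    ids = tok(make_prompt(instruction), return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=128, do_sample=True, top_p=0.9)
    # Decode only the newly generated tokens, not the echoed prompt.
    return tok.decode(out[0][ids.shape[1]:], skip_special_tokens=True)
```

Before fine-tuning, this raw GPT-J is the baseline whose rambling answers the article compares against Dolly's below.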

Because Dolly itself is a "clone" of another model, the team settled on the name "Dolly", after Dolly the sheep, the first cloned mammal.

Compared with current large language models such as GPT-3, Dolly lets users work with smaller, more specialized models that "replicate" ChatGPT's abilities.

After all, for users in niche domains, models fine-tuned for their own industry can greatly improve performance and accuracy.

Although Databricks does not compete directly with OpenAI, it seems intent on stealing some of OpenAI's thunder by proving that building a ChatGPT-like service is not as hard as it looks.

Especially as OpenAI has taken a "bigger is better" approach to developing language models while becoming increasingly secretive about its work.

Besides releasing Dolly as open-source software, Databricks emphasizes that Dolly has only 6 billion parameters (the weights of a language model that are adjusted during training), whereas OpenAI's GPT-3 has 175 billion. (OpenAI has not disclosed the parameter count of GPT-4.)

An old model, reborn

The team evaluated Dolly on the instruction-following capabilities described in the InstructGPT paper and found that it behaves much like ChatGPT across many of them, including text generation, brainstorming, and open-ended question answering.

In these examples, what is worth noting is not the quality of the generated text, but the huge gain in instruction-following ability that comes from fine-tuning an old open-source model on a small, high-quality dataset.

Take content generation first. The prompt: write a tweet officially announcing Databricks' large language model Dolly.

As you can see, what the original 6-billion-parameter model (GPT-J) generated was incoherent and off-topic, whereas Dolly produced a perfectly usable tweet:

Not only does the content meet the brief; it even adds hashtags and remembers to remind you to include the announcement link.

ChatGPT's answer to the same prompt also meets expectations: its tweet packs in more key terms than Dolly's, and its hashtags are more precise and specific, but the overall gap is not large.

Asked to write an advertisement for selling a Nikon D-750 camera, GPT-J basically made things up, inventing a story about buying and selling cameras as if writing a novel.

Dolly, on the other hand, produced an appealing resale ad based on the Nikon D-750's features and strengths, though unfortunately it got the megapixel spec wrong.

ChatGPT also completed this task successfully, highlighting the camera's strengths in the ad and, once again, appending hashtags at the end.

Last question: write a love letter to Edgar Allan Poe.

Here the venerable GPT-J flatly refused, on the grounds that Poe has passed away and you cannot write love letters to the dead.

Dolly, however, completed the task, and by comparison the result amounts to a rebirth.

"Creative" prompts like this are of course ChatGPT's strength: it wrote more than 300 words.

For the factual question-answering test, the team chose: "Explain to me the difference between fission and fusion."

Right or wrong, GPT-J rambled on about the sun; although it mentioned "fusion", it ignored "fission" entirely.

Dolly's first sentence goes straight to the point: fission and fusion differ in how they release energy. It then briefly explains the difference between the two.

By contrast, ChatGPT's answer is significantly more detailed.

For brainstorming, the models were asked to come up with a list of five science-fiction novels worth reading. GPT-J just mumbled to itself, as if lost in a procrastinator's guilt, dodging the question entirely.

Dolly was as steady as ever, listing the titles and authors of five science-fiction novels as instructed.

ChatGPT's answer is richer still, giving not only each book's title and author but also a brief comment on its content and genre.

You go Close, I go Open

For many companies, it is preferable to build a weaker model of their own than to hand data over to large-language-model vendors that offer only an API.

One important reason is that these questions and datasets are a company's most sensitive and proprietary intellectual property, and handing them straight to a third party is plainly risky.

Moreover, companies differ in their trade-offs between model quality, cost, and desired behavior, so a customizable language model better fits their needs.

Now, Dolly's release gives them hope: even an "outdated" open-source large language model (LLM) can acquire magical, ChatGPT-like instruction-following ability with 30 minutes of training.

It is not hard to imagine that large language models may soon cease to be a game exclusive to the AI giants.

As company CEO Ali Ghodsi put it: "Our belief is to make these technologies available to every organization in the world."

Reference:

https://www.databricks.com/blog/2023/03/24/hello-dolly-democratizing-magic-chatgpt-open-models.html

https://venturebeat.com/ai/databricks-debuts-chatgpt-like-dolly-a-clone-any-enterprise-can-own/

This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).
