It's not Versailles. ChatGPT is so successful that OpenAI doesn't understand. 07/02 Update SLTechnology News&Howtos

It's not Versailles. ChatGPT is so successful that OpenAI doesn't understand.

2025-07-02 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > IT Information >

Shulou(Shulou.com)11/24 Report--

The explosion of ChatGPT was something that OpenAI had never thought of. Now, they are still slowly adapting to the popularity and various problems of their models.

The company made products that might detonate the fourth industrial revolution, but they were puzzled: why are their own products so popular?

Just, it's really not Versailles.

Recently, MIT Technology Review interviewed several developers of ChatGPT to get a closer look at the story behind this blockbuster AI product.

When OpenAI quietly launched ChatGPT in late November 2022, the startup didn't expect much.

OpenAI employees did not think that their model is about to embark on a top-of-the-line road of popularity.

ChatGPT seems to have become a hit overnight and triggered a global gold rush about big language models, while OpenAI is unprepared and has no choice but to catch up with his top model and try to seize business opportunities.

Sandhini Agarwal, who works on policy at OpenAI, says that within OpenAI, ChatGPT has always been seen as a "research preview"-a more sophisticated version of the technology two years ago, and more importantly, the company is trying to eliminate some of the flaws in the model through public feedback.

Who would have thought that such a "preview" product would become popular after making its debut by mistake?

In this regard, OpenAI scientists are very confused, to the outside flowers and applause, they are also very awake.

"We don't want to exaggerate it as a huge fundamental step forward," said Liam Fedus, an OpenAI scientist involved in the development of ChatGPT.

Among the members of the ChatGPT team, five were named AI 2000 global artificial intelligence scholars in 2023. MIT Technology Review reporter Will Douglas Heaven interviewed OpenAI co-founder John Schulman, developers Agarwal and Fedus, and alignment team leader Jan Leike.

Why ChatGPT is so popular, we don't even understand why founder John Schulman said that a few days after ChatGPT's release, he used to browse Twitter from time to time. There was a crazy time when the Twitter stream was full of screenshots of ChatGPT.

He thought it was a product that was intuitive to users and that it would have some fans, but he didn't expect it to become so mainstream.

Jan Leike said it was all so sudden that everyone was surprised and tried to keep up with the pace of the ChatGPT explosion. He wondered what was driving its surge in popularity, and was there any driving force behind it? After all, OpenAI doesn't even know why ChatGPT is so popular.

Liam Fedus explained why they were so surprised, because ChatGPT is not the first general-purpose chatbot, and many people have tried it before, so Liam Fedus doesn't think they have a good chance. However, the private beta version also gave him confidence-maybe this An is something that users will really like.

Sandhini Agarwal concluded that for everyone, ChatGPT's sudden popularity was a surprise. Previously, people have done so much work on these models that they have forgotten how amazing it is to the general public outside the company.

Indeed, most of the technologies in ChatGPT are not new. It is a fine-tuned version of GPT-3.5, and OpenAI released GPT-3.5 a few months ago in ChatGPT. GPT-3.5 itself is an updated version of GPT-3, and GPT-3 appeared in 2020.

ChatGPT team participated in the previous seven major technology developers on the website, OpenAI provides these models in the form of API or API, and other developers can easily insert the models into their own code.

In January 2022, OpenAI also released InstructGPT, the previous fine-tuned version of GPT-3.5. It's just that these technologies are not introduced to the public.

Fine-tuning process according to Liam Fedus, the ChatGPT model is fine-tuned from the same language model as InstructGPT, using a similar fine-tuning method. The researchers added some dialogue data and made some adjustments to the training process. So they don't want to exaggerate it as a huge fundamental progress.

It turns out that it is the dialogue data that has played a big role in ChatGPT.

According to the assessment of the standard benchmark, in fact, there is not much difference in the original technical capability between the two models, and the biggest difference of ChatGPT is that it is easier to obtain and use.

Jan Leike explained that in a sense, ChatGPT can be understood as a version of the AI system that OpenAI has been around for some time. ChatGPT is no more capable. Before ChatGPT came out, the same basic model had been in use on API for nearly a year.

The researchers' improvement can be summed up as, in a sense, making it more in line with what humans want to do with it. It talks to users in conversations and is a chat interface that is easy to access. It is easier to infer intentions, and users can test back and forth to achieve what they want.

The secret is human feedback reinforcement learning (RLHF), which is very similar to the way InstructGPT is trained-to teach it what human users actually like.

According to Jan Leike, they asked a large group of people to read ChatGPT's tips and responses, and then chose one of the two responses to see which one they thought was better. All of these data are then merged into one training session.

Most of it is the same as what they did on InstructGPT. For example, you want it to be helpful, hope it's real, and hope it's not vicious.

There are also some details, such as if the user's question is not clear, it should ask follow-up questions to refine. It should also clarify that it is an artificial intelligence system and should not assume the identity it does not have or claim to have the capabilities it does not have. When the user asks it to do a task that it should not do, it must explicitly refuse.

That is, there is a list of various criteria that human raters must rank the model, such as authenticity. But they also prefer certain practices, such as AI not pretending to be human.

Preparing for release overall, ChatGPT uses technologies that OpenAI has already used, so the team didn't do anything special when preparing to release the model to the public. In their view, the standards set for the previous model are sufficient, and GPT-3.5 is safe enough.

In the training of human preference, ChatGPT taught itself the behavior of refusal and rejected a lot of requests.

OpenAI created some "good cop" for ChatGPT: everyone in the company sat down and tried to break the model. There are outside groups that do the same thing. Trusted early users will also provide feedback.

According to Sandhini Agarwal, they did find that it produces some unwanted output, but these are also things that GPT-3.5 produces. So, if you only look at the risks, ChatGPT is good enough as a "research preview".

John Schulman also says it's impossible to wait for a system to be 100% perfect before releasing it. Over the past few months, they have conducted beta tests on earlier versions, and beta testers have a good impression of ChatGPT.

What OpenAI is most worried about is the fact, because ChatGPT is so fond of making things up. But these problems exist in InstructGPT and other large language models, so in the eyes of researchers, as long as ChatGPT is better than those models on factual and other security issues, it is sufficient.

According to a limited assessment, before the release, it was confirmed that ChatGPT was more realistic and more secure than other models, so OpenAI decided to continue with the release.

Feedback after the release of ChatGPT, OpenAI has been watching how users use it.

For the first time in history, a large language model is in the hands of tens of millions of users.

Users are also crazy, trying to test where the limits of ChatGPT are and where is bug.

Of course, there are a lot of problems, such as ChatGPT opens the door for hackers to help steal malware code for credit card numbers, and OpenAI is constantly improving on these problems.

The popularity of ChaatGPT has also led to the emergence of many problems, such as biases, such as those induced by hackers through prompt.

Jan Leike said that some of the things that have gone viral on Twitter have already been quietly sold by someone on OpenAI.

For example, the problem of prison break is definitely something they need to solve. Users just like to try to use some bends to make the model say bad things, which is the only way to be expected by OpenAI.

When a jailbreak is found, OpenAI will add these situations to the training and test data, and all the data will be incorporated into the future model.

Jan Leike says that whenever there is a better model, they want to test it.

They are very optimistic that some targeted confrontational training can greatly improve the situation of prison break. Although it is not clear whether these problems will disappear completely, they believe they can make many prison breaks difficult.

When a system "officially debut", it is difficult to foresee all the things that will actually happen.

Therefore, they can only focus on the purpose of monitoring people's use of the system, see what happens, and then respond to it.

Today, Microsoft has launched Bing Chat, which is considered by many to be a version of OpenAI's unannounced GPT-4.

On this premise, Sandhini Agarwal says the stakes they face now are certainly much higher than they were six months ago, but still lower than they were a year later.

It is of great significance in what context these models are used.

For big companies like Google and Microsoft, even if one thing is not true, it can be a huge problem because they are search engines themselves.

Google's 23rd employee, Paul Buchheit, who founded Gmail, is pessimistic about Google as a big language model of a search engine, which is completely different from a chat robot just for fun. OpenAI researchers are also trying to figure out how to move between different uses to create something that is really useful to users.

John Schulman admits that OpenAI underestimated the level of concern about ChatGPT's political issues. To this end, when collecting training data, they hope to make some better decisions to reduce these problems.

Jan Leike says that from his point of view, ChatGPT often fails. There are too many problems that need to be solved, but OpenAI has not solved them. He admitted this frankly.

Although the language model has been around for some time, it is still in its early stages.

Next, OpenAI needs to do more.

Reference:

Https://futurism.com/the-byte/openai-confused-people-impressed-chatgpt

Https://www.technologyreview.com/2023/03/03/1069311/inside-story-oral-history-how-chatgpt-built-openai/

This article comes from the official account of Wechat: Xin Zhiyuan (ID:AI_era)

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.