DeepMind co-founder: OpenAI is secretly training GPT-5, and models 100 times larger than today's are coming

Shulou (Shulou.com) 11/24 Report --

GPT-5 is in secret training! DeepMind's co-founder revealed in a recent interview that within three years, Inflection's models will be 1,000 times larger than the current GPT-4.

DeepMind co-founder Mustafa Suleyman, now CEO of Inflection AI, recently dropped a bombshell in an interview:

OpenAI is secretly training GPT-5.

I think it's better that we're all just straight about it. That's why we disclose the total amount of compute we have.

Over the next 18 months, the models Inflection AI trains will be 100 times larger than today's cutting-edge models. Within three years, Inflection's models will be 1,000 times larger than they are now.

In fact, Sam Altman has previously denied that OpenAI is training GPT-5. In response, netizens speculated that OpenAI may simply have given the model a new name, which is why they can say they are not training "GPT-5."

Much as when Code Interpreter launched, many people felt its capability had gone beyond the GPT-4 model and deserved to be called GPT-4.5.

In the interview, Suleyman also revealed plenty of inside information from his time at DeepMind and Inflection AI, including Google's acquisition of DeepMind and what came after it, which to some extent explains why DeepMind started so early yet still fell behind OpenAI.

He also believes that open-source models may increase the instability and harm AI brings to humanity, and that the biggest threat to AI safety is not large language models themselves but the autonomous agents that may emerge in the future.

When asked whether AI could one day become an agent capable of evolving on its own, Suleyman said:

In the short term, it is unlikely that such an agent could run on its own: setting its own goals, identifying new information and reward signals in its environment, using them as a self-supervision signal, and updating its own weights over time.

But self-evolving AI is something no one should ignore, because if some AI technology really does demonstrate this ability, it could carry very large potential risks.

At least as far as he knows, neither Inflection AI nor DeepMind is moving in this direction.

Inflection AI is not an AGI company; what they want to build is a genuinely useful personal assistant, one that provides highly customized AI services on the premise of full access to the user's personal information.

Will the model-training arms race increase AI risk?

His company, Inflection AI, is building one of the world's largest supercomputers, and he thinks that within the next 18 months they may run a training run 10 or 100 times larger than the one that produced GPT-4.

When asked whether this arms-race style of training might increase the risks of AI, he replied:

A 100x training run still produces a chatbot, which you can think of as a better GPT-4. It will be a more impressive model, but it is not dangerous, because it lacks autonomy and has no access to the things that would make a model dangerous, such as the ability to act in the physical world.

Merely producing a very good, better GPT-4 is not dangerous; to make it dangerous, we would need to add other capabilities, such as the abilities mentioned above for the model to iterate on itself and set its own goals.

That is something like five, ten, fifteen, or twenty years away.

Suleyman believes Sam Altman may not have been telling the truth when he recently said they are not training GPT-5. ("Come on. I don't know. I think it's better that we're all just straight about it.")

He wants every company with large-scale computing power to be as transparent as possible, which is why Inflection discloses the total amount of compute it has.

They are training models bigger than GPT-4. At present, they have 6,000 H100s training models.

By December, 22,000 H100s will be fully operational, and from then on another 1,000 to 2,000 H100s will come online every month.
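Taken at face value, those numbers imply steady fleet growth. The sketch below is a quick back-of-the-envelope projection based only on the figures above; the 12-month horizon and the assumption of strictly linear growth are illustrative, not from the interview:

```python
# Illustrative projection of Inflection's H100 fleet from the figures
# quoted above. The 22,000 December baseline and the 1,000-2,000/month
# growth rate come from the interview; the 12-month horizon and the
# assumption of linear growth are assumptions for this sketch.
DECEMBER_BASELINE = 22_000
MONTHLY_GROWTH_LOW, MONTHLY_GROWTH_HIGH = 1_000, 2_000

for months_after_december in range(0, 13, 3):
    low = DECEMBER_BASELINE + MONTHLY_GROWTH_LOW * months_after_december
    high = DECEMBER_BASELINE + MONTHLY_GROWTH_HIGH * months_after_december
    print(f"+{months_after_december:2d} months: {low:,} to {high:,} H100s")
```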

He thinks Google DeepMind should do the same and disclose how many FLOPs of training Gemini has received.

How will the cost of AI training change?

In terms of compute cost, he thinks future AI training is unlikely to reach $10 billion for a single model, unless someone is genuinely willing to spend three years training one, because the more compute you stack to train a larger model, the longer the run takes.

Higher cost may buy stronger capability, but this is not an unbounded math problem; many practical constraints have to be considered.

Still, because the cost of compute keeps falling as chips iterate, a future training run may well deliver the equivalent of what $10 billion of training would have bought in 2022.

And because chip efficiency improves by 2-3x per generation, training a model at that scale will cost far less than it appears.
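As a rough illustration of that point (a sketch, not a calculation from the interview): if per-generation efficiency improves 2-3x, the dollar cost of a fixed amount of training compute shrinks geometrically. The $10 billion baseline echoes the figure above; the number of generations shown is arbitrary:

```python
# Rough sketch: how the cost of a fixed-compute training run shrinks as
# chip efficiency improves 2-3x per generation (the figure from the text).
# The $10B baseline echoes the 2022 figure above; everything else here is
# assumed for illustration.
BASELINE_COST_USD = 10e9          # the run's cost on 2022-era chips
GAIN_LOW, GAIN_HIGH = 2.0, 3.0    # efficiency gain per chip generation

for gen in range(4):
    cheapest = BASELINE_COST_USD / GAIN_HIGH ** gen   # with 3x gains
    priciest = BASELINE_COST_USD / GAIN_LOW ** gen    # with 2x gains
    print(f"generation +{gen}: ${cheapest/1e9:.2f}B to ${priciest/1e9:.2f}B")
```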

Efficiency gains show up on the model side too: open-source models such as Llama 2 or Falcon can now, he says, match the capability of the 175-billion-parameter GPT-3 with only 1.5 to 2 billion parameters.

His view of open source

Suleyman, who has spent his career at closed-source technology companies, has a very different view of the value and potential risks of open-source models.

First, he believes that over the next five years, open-source models will consistently lag the cutting-edge closed-source models by 3-5 years. Moreover, open-source models will increase the social risks that AI brings.

If everyone has unrestricted access to the latest models, a phenomenon will emerge: a "rapid proliferation of power."

For example, new media platforms already let a single individual play the role of an entire newspaper, with millions of followers and even worldwide influence.

Unrestricted access to cutting-edge models will amplify that power, because over the next three years, humans will be able to train models 1,000 times larger than existing ones.

Even Inflection AI alone will, within the next 18 months, command 100 times more compute than was used for today's cutting-edge models.

Large open-source models would put this power into everyone's hands, handing everyone a potentially large-scale destabilizing and destructive tool.

Trying at that point to head off the destructive consequences of these tools would be, as someone once put it in a very clever analogy, like trying to stop the rain by catching the raindrops with your hands.

He has explained to regulators that AI will lower the barrier to developing many potentially dangerous compounds and weapons.

AI can offer real practical help in making such things, for example telling you where to obtain equipment when you hit a technical obstacle in the lab. That said, removing such content from pre-training data, aligning models, and similar measures can effectively reduce this risk.

In short, for people who would use large models' capabilities to do harm, those things need to be made as difficult as possible.

If, however, every model is open-sourced, then as models grow more and more capable, more and more risks of this kind will be exposed.

So although open-sourcing models is genuinely a good thing for many people, letting everyone obtain a model and experiment with it in ways that drive technological innovation and improvement, the risks of open source matter just as much, because not everyone is kind and friendly.

He also stressed that he is not making these remarks to attack the open-source community:

"although what I say may be understood by many people as a conflict of interest between what I do and the open source community, so many people may be angry, I still want to express my views and hope to get people's support. "

The days at Google and DeepMind

During his ten years at DeepMind, he spent a great deal of time trying to build more external oversight into the process of developing AI technology.

It was a very painful process. Although he thinks Google started from a good place, it still operates like a traditional bureaucracy.

When we set up Google's ethics committee, we planned for nine independent members; it was an important measure for external oversight of the development of sensitive technologies.

But one appointee was a conservative who had made controversial comments in the past, so many netizens boycotted her on Twitter and elsewhere, along with several other members who supported her, demanding that they withdraw from the committee.

It was a complete tragedy, deeply frustrating. It had taken us two years to set up that committee, the first step toward external review of the very sensitive technology we were developing.

Unfortunately, within a week, three of the nine members resigned, then she resigned as well, and we had lost half the committee.

Then the company turned around and said: "Why should we hire people to limit ourselves? This is a waste of time."

In fact, when DeepMind was acquired, we made it a condition of the acquisition that there be an ethics and safety board.

Beyond the ethics and safety board, we planned to turn DeepMind into a global interest company: one in which all stakeholders would have a voice in decision-making.

It would have been structured as a company limited by guarantee. We then planned to draft a charter setting ethics and safety goals for AGI development, which would have allowed us to spend most of our income on scientific and social missions.

It was a very creative, experimental structure. But when Alphabet saw what happened with the ethics committee, they got cold feet. They said: "This is completely crazy. The same thing will happen with your global interest company. Why would you do that?"

In the end, we merged DeepMind into Google. In a sense, DeepMind was never independent, and now, of course, it is entirely part of Google.

Google's next-generation large model: Gemini

The Information exclusively reported that Google's multimodal AI model Gemini will arrive soon, taking direct aim at OpenAI's GPT-4.

In fact, at this year's Google I/O conference, Sundar Pichai announced publicly that Google is working on its next-generation model, Gemini.

The model is rumored to have at least 1 trillion parameters, with training running on tens of thousands of Google TPU AI chips.

Similar to OpenAI's reported approach with GPT-4, Google is said to be building the model out of several expert sub-models, each with specific capabilities.

In short, Gemini is also a mixture-of-experts (MoE) model.
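For context, a mixture-of-experts layer routes each token through only a few "expert" sub-networks rather than the whole model, which is how such models keep inference cost manageable at a huge parameter count. The sketch below is a generic top-2 MoE layer in NumPy, purely illustrative: neither Gemini's nor GPT-4's actual architecture is public, and every dimension here is invented.

```python
import numpy as np

# Generic top-2 mixture-of-experts layer, for illustration only.
# Neither Gemini nor GPT-4 has a published architecture; all sizes
# and weights here are invented.
rng = np.random.default_rng(0)
D_MODEL, N_EXPERTS, TOP_K = 64, 8, 2

# Each "expert" is reduced to a single weight matrix here; in a real
# model it would be a full feed-forward block.
experts = [rng.normal(0, 0.02, (D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
router = rng.normal(0, 0.02, (D_MODEL, N_EXPERTS))  # token -> expert scores

def moe_layer(x: np.ndarray) -> np.ndarray:
    """Route each token (row of x) to its top-k experts and mix the outputs."""
    scores = x @ router                              # (n_tokens, N_EXPERTS)
    top_k = np.argsort(scores, axis=-1)[:, -TOP_K:]  # best experts per token
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        chosen = scores[t, top_k[t]]
        weights = np.exp(chosen) / np.exp(chosen).sum()  # softmax over top-k
        for w, e in zip(weights, top_k[t]):
            out[t] += w * (x[t] @ experts[e])        # only k experts run per token
    return out

tokens = rng.normal(size=(4, D_MODEL))  # a batch of 4 token vectors
print(moe_layer(tokens).shape)          # -> (4, 64)
```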

This may also mean Google intends to offer Gemini in different parameter sizes, a sensible choice in terms of cost-effectiveness.

Besides generating images and text, Gemini has been trained on YouTube video transcripts and can also generate simple videos, similar to RunwayML's Gen-2.

In addition, Gemini's coding ability is said to be significantly improved over Bard's.

After Gemini launches, Google plans to integrate it gradually across its product line, including upgrades to Bard, the Google Workspace suite, Google Cloud, and more.

In fact, before Gemini, DeepMind had a model code-named "Goodall," based on an unannounced model called Chipmunk, that could rival ChatGPT.

After GPT-4 arrived, however, Google ultimately decided to abandon that model's development.

At least 20 executives are reportedly involved in Gemini's development, led by DeepMind founder Demis Hassabis and Google founder Sergey Brin.

Hundreds of Google DeepMind employees are involved as well, including Jeff Dean, former head of Google Brain.

Demis Hassabis said in an earlier interview that Gemini will combine some of the strengths of AlphaGo-type systems with the astonishing language capabilities of large models.

It is clear that Google has prepared thoroughly for this war and is waiting for Gemini to open the road to a counterattack.

References:

https://80000hours.org/podcast/episodes/mustafa-suleyman-getting-washington-and-silicon-valley-to-tame-ai/

https://twitter.com/AISafetyMemes/status/1697960264740606331
