[Xin Zhiyuan Guide] Tesla has brought a 10,000-GPU H100 cluster online to accelerate the rollout of FSD V12. With both its in-house Dojo supercomputer and the new H100 cluster, Tesla has officially entered the compute arms race!
According to Tesla tipster Sawyer Merritt, Tesla will bring its much-anticipated supercomputer, built from 10,000 H100 chips, online on Monday, US time.
The GPU cluster will be used to train a range of AI workloads, including Tesla's FSD (Full Self-Driving) system.
The cluster consists of 10,000 Nvidia H100 GPUs, delivering a peak of 340 PFLOPS of FP64 compute and 39.58 ExaFLOPS of INT8 compute for AI workloads.
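Those headline figures line up with a simple back-of-the-envelope calculation. The per-GPU numbers below are assumed from Nvidia's published H100 SXM spec sheet (roughly 34 TFLOPS FP64 and 3,958 TOPS INT8 with sparsity); they are not taken from the article itself.

```python
# Rough sanity check of the quoted cluster figures (assumed per-GPU spec values).
NUM_GPUS = 10_000
FP64_TFLOPS_PER_GPU = 34        # assumed H100 SXM peak, non-tensor FP64
INT8_TOPS_PER_GPU = 3_958       # assumed H100 SXM peak, INT8 with sparsity

fp64_pflops = NUM_GPUS * FP64_TFLOPS_PER_GPU / 1_000      # TFLOPS -> PFLOPS
int8_exaops = NUM_GPUS * INT8_TOPS_PER_GPU / 1_000_000    # TOPS   -> ExaOPS

print(f"Aggregate FP64 peak: {fp64_pflops:.0f} PFLOPS")   # ~340 PFLOPS
print(f"Aggregate INT8 peak: {int8_exaops:.2f} ExaOPS")   # ~39.58 ExaOPS
```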
That peak computing power exceeds what Leonardo, previously the world's fourth-ranked supercomputer, can deliver.
With this supercomputer, Tesla can rapidly train and update its Full Self-Driving (FSD) technology.
The H100 cluster will not only make Tesla more competitive than other automakers, it also gives the company a staggering reserve of computing power.
Musk even tweeted last month: "Frankly speaking, if Nvidia could supply us with enough GPUs, we might not even need Dojo."
So what is this computing reserve for? For Tesla, the H100 cluster represents more than unmatched raw compute: it is what allows a company sitting on enormous amounts of driving data to actually put that data to work.
Tesla engineering director Tim Zaman tweeted that the newly launched H100 cluster will be used for training on video data.
Tesla probably has the largest training dataset in the world: its hot tier cache alone exceeds 200 PB, several orders of magnitude more data than is used to train large language models.
He also pointed out that Tesla "physically" owns these GPU clusters and this computing power, whereas many other companies that claim to "have" such compute are in fact only renting it.
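To put the 200 PB figure in perspective, here is a rough, purely illustrative comparison; the LLM corpus size is an assumption of ours, not a number from the article.

```python
# Illustrative scale comparison (assumed figures, not from the article):
# a large text corpus for LLM pretraining is on the order of a few terabytes,
# while Tesla's hot tier cache is reported to exceed 200 petabytes.
llm_corpus_tb = 5                   # assumed: trillions of tokens of text
tesla_hot_tier_tb = 200 * 1_000     # 200 PB expressed in TB

ratio = tesla_hot_tier_tb / llm_corpus_tb
print(f"~{ratio:,.0f}x larger")     # ~40,000x, i.e. 4-5 orders of magnitude
```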
For Tesla right now, the significance of bringing the H100 cluster online is that it can greatly accelerate the release of the latest FSD version, V12.
Musk said two months ago that the FSD V12 update will no longer be labeled a "beta" autopilot technology, suggesting that this release may indeed deliver full self-driving capability.
Just a few days ago, Musk personally livestreamed himself driving a Tesla running FSD V12, a demo that drew attention across the internet. In the broadcast, the new FSD V12 showed "silky" self-driving performance and an excellent driving experience.
The technical principle behind FSD V12 is to train a neural network on large volumes of real-world video from skilled drivers, producing a new self-driving AI that drives the car end to end.
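To make that end-to-end, imitation-style idea concrete, here is a minimal behavior-cloning sketch in PyTorch. Everything in it (the toy model, inputs, and loss) is a hypothetical illustration of the general technique, not Tesla's actual FSD V12 architecture or training code.

```python
# Minimal behavior-cloning sketch: map camera frames to driving controls by
# imitating logged human driving. Purely illustrative; not Tesla's FSD code.
import torch
import torch.nn as nn

class TinyDrivingPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        # Toy vision backbone over a single RGB frame (real systems use
        # multi-camera video clips and far larger networks).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Predict two controls: steering angle and acceleration.
        self.head = nn.Linear(32, 2)

    def forward(self, frames):          # frames: (batch, 3, H, W)
        return self.head(self.backbone(frames))

policy = TinyDrivingPolicy()
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)

# Stand-in for a batch of (video frame, human control) pairs from driving logs.
frames = torch.randn(8, 3, 128, 128)
human_controls = torch.randn(8, 2)

optimizer.zero_grad()
pred = policy(frames)
loss = nn.functional.mse_loss(pred, human_controls)  # imitate the human driver
loss.backward()
optimizer.step()
```

The point of the sketch is only the shape of the training signal: frames of human driving go in, and the human's control outputs serve as the regression target.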
Once Tesla brings this H100 cluster online, it will greatly speed up FSD V12 training, which the Tesla engineering director's post above also confirms.
Musk tweeted that the V12 update could arrive in less than six months!
Beyond FSD V12, Tesla's humanoid robot Optimus will also benefit from the company's huge reserve of computing power.
Netizens observed that an intelligent robot works, at its core, by understanding the world around it from video signals; this is essentially the same problem as self-driving, differing only in form factor and the controls being output.
What about Tesla's own supercomputer, Dojo? While bringing the H100 GPU cluster online, Tesla is also ramping up Dojo, the supercomputer it developed and built in-house. The figure below shows Tesla's internal projection of Dojo's computing power.
Dojo's computing power is expected to reach 100 ExaFLOPS by October 2024.
Tesla first announced its supercomputer, Dojo, at AI Day in 2021.
Nearly two years later, in July of this year, the Twitter (now X) account Whole Mars Catalog, which tracks Tesla news, revealed that Dojo had officially come online.
Musk appeared to confirm the news by liking the post.
The whole supercomputer is built up from compute modules like the one shown below:
Each module carries 25 SoCs, and the modules are connected by high-bandwidth links.
The modules are then mounted in a host enclosure, with all interfaces integrated onto a system tray.
Two such system trays, together with their host components, are installed into a single Dojo cabinet.
The illustration at the top shows the load on each SoC.
Now, with Dojo and the 10,000-GPU H100 cluster, Tesla has officially joined the computing arms race.
References:
https://www.tomshardware.com/news/teslas-dollar300-million-ai-cluster-is-going-live-today
https://twitter.com/SawyerMerritt/status/1696011140508045660