Thanks to CTOnews.com reader Sancu for the tip! Jensen Huang wins again: in the latest MLPerf benchmark, the H100 set records in all eight tests, and foreign media report that the next generation of consumer graphics cards may not arrive until 2025.
In the latest MLPerf training benchmark, the H100 GPU set new records in all eight tests!
Today, the NVIDIA H100 dominates almost every category and was the only GPU used in the new LLM benchmark.
A cluster of 3,584 H100 GPUs completed the large-scale GPT-3-based benchmark in just 11 minutes.
The MLPerf LLM benchmark is based on OpenAI's GPT-3 model, which contains 175 billion parameters.
Lambda Labs estimates that training a model of this size requires about 3.14e+23 FLOPs (3.14e+11 TFLOPs) of computation.
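The Lambda Labs figure can be roughly reproduced with the common "6 × parameters × tokens" rule of thumb for dense Transformer training compute. The sketch below assumes the widely reported ~300 billion training tokens for GPT-3; Lambda Labs' exact methodology may differ.

```python
# Rough estimate of total GPT-3 training compute using the
# 6 * N * D rule of thumb (N = parameters, D = training tokens).
PARAMS = 175e9   # 175 billion parameters
TOKENS = 300e9   # ~300 billion training tokens (assumed, as reported for GPT-3)

total_flops = 6 * PARAMS * TOKENS    # total floating-point operations
total_tflops = total_flops / 1e12    # expressed in trillions of FLOPs

print(f"{total_flops:.2e} FLOPs = {total_tflops:.2e} TFLOPs")
```

The result, about 3.15e+23 FLOPs, lines up with the 3.14e+11 TFLOPs estimate quoted above.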
The top-ranked system in the LLM and BERT natural language processing (NLP) benchmarks was jointly developed by NVIDIA and Inflection AI, and hosted by CoreWeave, a cloud service provider specializing in enterprise GPU-accelerated workloads.
The system combines 3,584 NVIDIA H100 accelerators with 896 Intel Xeon Platinum 8462Y+ processors.
The H100 introduces a new Transformer Engine designed to accelerate Transformer model training and inference, which speeds up training by as much as six times.
The performance CoreWeave delivered from the cloud came very close to what Nvidia's AI supercomputer achieves from a local data center, thanks to the low-latency NVIDIA Quantum-2 InfiniBand networking that CoreWeave uses.
As the number of H100 GPUs in the training run scaled from hundreds to more than 3,000, careful optimization allowed the entire technology stack to achieve near-linear performance scaling in the demanding LLM test.
If the number of GPUs is halved, the training time for the same model rises to 24 minutes, showing that the system's efficiency actually scales superlinearly as GPUs are added.
The main reason is that Nvidia designed for this problem from the start, using NVLink technology to achieve efficient communication between GPUs.
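The superlinear-scaling claim can be checked with a quick calculation from the two data points above (3,584 GPUs finish in 11 minutes; half as many GPUs take 24 minutes):

```python
# Scaling efficiency of the MLPerf GPT-3 run, from the figures above.
full_gpus, full_minutes = 3584, 11
half_gpus, half_minutes = full_gpus // 2, 24

speedup = half_minutes / full_minutes  # time ratio when doubling GPU count
efficiency = speedup / 2               # 1.0 would be perfectly linear scaling

print(f"Doubling GPUs gives a {speedup:.2f}x speedup "
      f"({efficiency:.0%} scaling efficiency)")
```

Doubling the GPU count yields a roughly 2.18x speedup, i.e. about 109% scaling efficiency, which is slightly better than linear.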
Of the 90 systems tested, 82 were accelerated with Nvidia GPUs.
(Chart: single-card training efficiency)
(Chart: cluster training time compared with Intel's systems)
Intel's submitted systems used 64 to 96 Intel Xeon Platinum 8380 processors and 256 to 389 Intel Habana Gaudi2 accelerators. Intel submitted a GPT-3 training time of 311 minutes, results somewhat behind Nvidia's.
Analysts: Nvidia's lead is too great
Industry analysts believe Nvidia's technological advantage in GPUs is obvious, and its leading position as an AI infrastructure provider is reinforced by the stickiness of the ecosystem it has built over many years.
The AI community is also heavily dependent on Nvidia's software: almost all AI frameworks are built on the underlying CUDA libraries and tools that Nvidia provides.
Nvidia also offers full-stack AI tools and solutions, and beyond supporting AI developers, it continues to invest in enterprise-grade tools for managing workloads and models.
Nvidia's leading position in the industry will remain very solid for the foreseeable future.
Analysts further pointed out that, as the MLPerf test results show, the power and efficiency of Nvidia's systems for AI training in the cloud are its greatest asset in the battle for the future.
Next generation of GeForce GPUs to be released in 2025
Tom's Hardware freelance writer Zhiye Liu also recently published an article about Nvidia's plans for the successor to its Ada Lovelace graphics cards.
There is no doubt about the H100's ability to train large models: a GPT-3-class model can be trained in just 11 minutes with 3,584 H100s.
At a recent press conference, Nvidia shared a new roadmap detailing next-generation products, including the successor to the GeForce RTX 40 series of Ada Lovelace GPUs, today's best gaming graphics cards.
According to the roadmap, Nvidia plans to launch an "Ada Lovelace-Next" graphics card in 2025.
If the current naming scheme continues, the next generation of GeForce products should arrive as the GeForce RTX 50 series.
According to information obtained by the South American hacker group LAPSUS$, Hopper's successor ("Hopper Next") is likely to be named Blackwell.
On consumer graphics cards, Nvidia has maintained a two-year update cadence: Pascal launched in 2016, Turing in 2018, Ampere in 2020, and Ada Lovelace in 2022.
If Ada Lovelace's successor does not launch until 2025, Nvidia will clearly break that cadence.
The recent AI boom has created huge demand for Nvidia GPUs, whether the latest H100 or the previous-generation A100.
According to reports, one major company ordered $1 billion worth of Nvidia GPUs this year.
Despite export restrictions, China remains one of Nvidia's largest markets worldwide.
(A small number of Nvidia A100s are reportedly available at the Huaqiangbei electronics market in Shenzhen, selling for $20,000 each, twice the usual price.)
In response, Nvidia has tweaked some AI products, releasing specific SKUs such as the H800 and A800 to meet export requirements.
Zhiye Liu argues that, seen another way, export regulations actually benefit Nvidia, because customers must buy more of the cut-down GPU variants to get the same performance as the originals.
That makes it understandable why Nvidia prioritizes compute GPUs over gaming GPUs.
Recent reports indicate Nvidia has increased production of compute-grade GPUs.
With no fierce competition from AMD's RDNA 3 product stack, and with Intel posing no serious threat to the GPU duopoly, Nvidia can afford to take its time on the consumer side.
Nvidia recently expanded its GeForce RTX 40 series product stack with the GeForce RTX 4060 and GeForce RTX 4060 Ti, and there is still room for a GeForce RTX 4050 at the low end and an RTX 4080 Ti or GeForce RTX 4090 Ti at the top.
If pressed, Nvidia could also borrow a page from the old Turing playbook and give Ada Lovelace a "Super" refresh, further expanding the Ada lineup.
Finally, Zhiye Liu says that, at least this year and next, the Lovelace architecture will not see a real replacement.
Reference:
https://blogs.nvidia.com/blog/2023/06/27/generative-ai-debut-mlperf/
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era).