2025-01-15 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)12/24 Report--
Thanks to CTOnews.com netizen Mr. Air for the tip! CTOnews.com reported on December 7 that AMD held its "Advancing AI" event at 2:00 a.m. today, officially announcing the flagship AI GPU accelerator MI300X, whose performance AMD claims is up to 60% higher than NVIDIA's H100.
Performance: During the presentation, AMD shared performance figures comparing the MI300X with NVIDIA's H100 accelerator. The figures, as relayed by CTOnews.com, are as follows:
Memory capacity is 2.4 times that of the H100.
Memory bandwidth is 1.6 times that of the H100.
FP8 TFLOPS throughput is 1.3 times that of the H100.
FP16 TFLOPS throughput is 1.3 times that of the H100.
In a 1v1 comparison, training speed on the Llama 2 70B model is 20% faster than on the H100.
In a 1v1 comparison, training speed with FlashAttention 2 is 20% faster than on the H100.
In an 8v8 server comparison, training speed on the Llama 2 70B model is 40% faster than on H100 systems.
In an 8v8 server comparison, training speed on the BLOOM 176B model is 60% faster than on H100 systems.
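The headline memory ratios above can be sanity-checked against spec-sheet numbers. The H100 figures below are assumptions (80 GB of HBM3 at 3.35 TB/s for the SXM variant) and are not from the article:

```python
# Assumed datasheet values; the H100 SXM figures are not from the article.
h100_mem_gb, h100_bw_tbs = 80, 3.35
mi300x_mem_gb, mi300x_bw_tbs = 192, 5.3

print(f"memory capacity ratio:  {mi300x_mem_gb / h100_mem_gb:.1f}x")   # 2.4x
print(f"memory bandwidth ratio: {mi300x_bw_tbs / h100_bw_tbs:.1f}x")   # 1.6x
```

Both results line up with AMD's claimed 2.4x capacity and 1.6x bandwidth advantages.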
AMD noted that the MI300X is on par with its competitor (the H100) in training performance, offers competitive price/performance, and performs better in inference workloads.
The MI300X's software stack has been upgraded to ROCm 6.0, with improved support for generative AI and large language models.
The new stack supports the latest compute formats, including FP16, BF16, and FP8 (with sparsity).
Architecture: The AMD Instinct MI300X is the most closely watched chip of the lineup because it targets NVIDIA's Hopper in the AI field, as well as Intel's Gaudi accelerators.
The chip is based entirely on the CDNA 3 architecture, mixing 5nm and 6nm IP; AMD combines these to reach a total of 153 billion transistors.
In terms of design, the main interposer uses a passive die layout that hosts the interconnect layer via AMD's fourth-generation Infinity Fabric. The interposer carries 28 dies in total: 8 HBM3 packages, 16 dummy dies placed between the HBM packages, and 4 active dies, each of which carries 2 compute dies.
Each GCD, based on the CDNA 3 GPU architecture, has 40 compute units, equivalent to 2,560 cores. With eight compute dies (GCDs) in total, that makes 320 compute units and 20,480 cores. For yield reasons, AMD disables a small portion of these cores, so the shipping part has 304 compute units (38 CUs per GPU chiplet) and 19,456 stream processors in total.
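The compute-unit arithmetic above works out as follows (a quick sketch; 64 stream processors per compute unit is the usual CDNA convention):

```python
GCDS = 8                     # compute dies
CUS_PER_GCD_FULL = 40        # compute units per GCD, full die
CUS_PER_GCD_SHIPPING = 38    # 2 CUs per GCD disabled for yield
SPS_PER_CU = 64              # stream processors per compute unit

full_cus = GCDS * CUS_PER_GCD_FULL        # 320 CUs
full_sps = full_cus * SPS_PER_CU          # 20,480 stream processors
ship_cus = GCDS * CUS_PER_GCD_SHIPPING    # 304 CUs
ship_sps = ship_cus * SPS_PER_CU          # 19,456 stream processors
print(full_cus, full_sps, ship_cus, ship_sps)
```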
In terms of memory, the MI300X carries up to 192 GB of HBM3, 50% more than the previous MI250X (128 GB). The memory provides up to 5.3 TB/s of bandwidth, alongside 896 GB/s of Infinity Fabric bandwidth.
AMD equips the MI300X with eight HBM3 stacks, each 12-Hi, built from 16 Gbit ICs, for 2 GB per IC or 24 GB per stack.
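The stack arithmetic works out exactly to the stated 192 GB total:

```python
STACKS = 8            # HBM3 stacks on the package
DIES_PER_STACK = 12   # 12-Hi stacks
GBIT_PER_DIE = 16     # 16 Gbit ICs

gb_per_die = GBIT_PER_DIE // 8               # 2 GB per IC
gb_per_stack = DIES_PER_STACK * gb_per_die   # 24 GB per stack
total_gb = STACKS * gb_per_stack             # 192 GB total
print(total_gb)
```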
By contrast, NVIDIA's upcoming H200 AI accelerator will offer 141 GB, while Intel's Gaudi 3 will offer 144 GB.
In terms of power consumption, the AMD Instinct MI300X is rated at 750W, a 50% increase over the Instinct MI250X's 500W and 50W more than the NVIDIA H200.
One of the available configurations is Gigabyte's G593-ZX1/ZX2 series servers, which provide up to 8 MI300X GPU accelerators and two AMD EPYC 9004 CPUs. These systems can be equipped with up to eight 3,000W power supplies, with a quoted total of 18,000W.
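As a rough sketch of where the board-level power goes in such an 8-GPU system (the 400W EPYC figure is an assumed cTDP for a high-end 9004-series part, not from the article):

```python
GPUS, GPU_TDP_W = 8, 750    # rated MI300X board power, per the article
CPUS, CPU_TDP_W = 2, 400    # assumed cTDP for a high-end EPYC 9004 part

accelerator_w = GPUS * GPU_TDP_W   # 6000 W for the GPUs alone
cpu_w = CPUS * CPU_TDP_W           # 800 W
print(accelerator_w + cpu_w)       # roughly 6800 W before memory, fans, losses
```

The GPUs alone account for 6,000W, which explains why these chassis ship with multiple 3,000W supplies.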