2025-01-15 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)12/24 Report--
Thanks to CTOnews.com netizen Mr. Air for the tip! CTOnews.com reported on December 7 that AMD held its "Advancing AI" event at 2:00 a.m. today, officially announcing the flagship AI GPU accelerator MI300X, whose performance AMD claims is up to 60% higher than NVIDIA's H100.
Performance: During the presentation, AMD shared performance figures comparing the MI300X with NVIDIA's H100 accelerator. The figures, as relayed by CTOnews.com, are as follows:
Memory capacity is 2.4 times that of the H100.
Memory bandwidth is 1.6 times that of the H100.
FP8 TFLOPS throughput is 1.3 times that of the H100.
FP16 TFLOPS throughput is 1.3 times that of the H100.
In a 1v1 comparison, training speed on the Llama 2 70B model is 20% faster than on the H100.
In a 1v1 comparison, training speed with FlashAttention 2 is 20% faster than on the H100.
In an 8v8 server comparison, training speed on the Llama 2 70B model is 40% faster than on H100 systems.
In an 8v8 server comparison, training speed on the BLOOM 176B model is 60% faster than on H100 systems.
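The headline memory ratios above can be sanity-checked against spec-sheet numbers. The H100 figures below are assumptions (80 GB of HBM3 at 3.35 TB/s for the SXM variant) and are not from the article:

```python
# Assumed datasheet values; the H100 SXM figures are not from the article.
h100_mem_gb, h100_bw_tbs = 80, 3.35
mi300x_mem_gb, mi300x_bw_tbs = 192, 5.3

print(f"memory capacity ratio:  {mi300x_mem_gb / h100_mem_gb:.1f}x")   # 2.4x
print(f"memory bandwidth ratio: {mi300x_bw_tbs / h100_bw_tbs:.1f}x")   # 1.6x
```

Both results line up with AMD's claimed 2.4x capacity and 1.6x bandwidth advantages.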
AMD noted that the MI300X is on par with its competitor (the H100) in training performance, offers competitive price/performance, and performs better in inference workloads.
The MI300X's software stack has been upgraded to ROCm 6.0, with improved support for generative AI and large language models.
The new stack supports the latest compute formats, including FP16, BF16, and FP8 (with sparsity).
Architecture: The AMD Instinct MI300X is the most closely watched chip of the lineup because it targets NVIDIA's Hopper in the AI field, as well as Intel's Gaudi accelerators.
The chip is based entirely on the CDNA 3 architecture, mixing 5nm and 6nm IP; AMD combines these to reach a total of 153 billion transistors.
In terms of design, the main interposer uses a passive die layout that hosts the interconnect layer via AMD's fourth-generation Infinity Fabric. The interposer carries 28 dies in total: 8 HBM3 packages, 16 dummy dies placed between the HBM packages, and 4 active dies, each of which carries 2 compute dies.
Each GCD, based on the CDNA 3 GPU architecture, has 40 compute units, equivalent to 2,560 cores. With eight compute dies (GCDs) in total, that makes 320 compute units and 20,480 cores. For yield reasons, AMD disables a small portion of these cores, so the shipping part has 304 compute units (38 CUs per GPU chiplet) and 19,456 stream processors in total.
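The compute-unit arithmetic above works out as follows (a quick sketch; 64 stream processors per compute unit is the usual CDNA convention):

```python
GCDS = 8                     # compute dies
CUS_PER_GCD_FULL = 40        # compute units per GCD, full die
CUS_PER_GCD_SHIPPING = 38    # 2 CUs per GCD disabled for yield
SPS_PER_CU = 64              # stream processors per compute unit

full_cus = GCDS * CUS_PER_GCD_FULL        # 320 CUs
full_sps = full_cus * SPS_PER_CU          # 20,480 stream processors
ship_cus = GCDS * CUS_PER_GCD_SHIPPING    # 304 CUs
ship_sps = ship_cus * SPS_PER_CU          # 19,456 stream processors
print(full_cus, full_sps, ship_cus, ship_sps)
```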
In terms of memory, the MI300X carries up to 192 GB of HBM3, 50% more than the previous MI250X (128 GB). The memory provides up to 5.3 TB/s of bandwidth, alongside 896 GB/s of Infinity Fabric bandwidth.
AMD equips the MI300X with eight HBM3 stacks, each 12-Hi, built from 16 Gbit ICs, for 2 GB per IC or 24 GB per stack.
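The stack arithmetic works out exactly to the stated 192 GB total:

```python
STACKS = 8            # HBM3 stacks on the package
DIES_PER_STACK = 12   # 12-Hi stacks
GBIT_PER_DIE = 16     # 16 Gbit ICs

gb_per_die = GBIT_PER_DIE // 8               # 2 GB per IC
gb_per_stack = DIES_PER_STACK * gb_per_die   # 24 GB per stack
total_gb = STACKS * gb_per_stack             # 192 GB total
print(total_gb)
```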
By contrast, NVIDIA's upcoming H200 AI accelerator will offer 141 GB, while Intel's Gaudi 3 will offer 144 GB.
In terms of power consumption, the AMD Instinct MI300X is rated at 750W, a 50% increase over the Instinct MI250X's 500W and 50W more than the NVIDIA H200.
One of the available configurations is Gigabyte's G593-ZX1/ZX2 series servers, which provide up to 8 MI300X GPU accelerators and two AMD EPYC 9004 CPUs. These systems can be equipped with up to eight 3,000W power supplies, with a quoted total of 18,000W.
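As a rough sketch of where the board-level power goes in such an 8-GPU system (the 400W EPYC figure is an assumed cTDP for a high-end 9004-series part, not from the article):

```python
GPUS, GPU_TDP_W = 8, 750    # rated MI300X board power, per the article
CPUS, CPU_TDP_W = 2, 400    # assumed cTDP for a high-end EPYC 9004 part

accelerator_w = GPUS * GPU_TDP_W   # 6000 W for the GPUs alone
cpu_w = CPUS * CPU_TDP_W           # 800 W
print(accelerator_w + cpu_w)       # roughly 6800 W before memory, fans, losses
```

The GPUs alone account for 6,000W, which explains why these chassis ship with multiple 3,000W supplies.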