Intel CPU instruction set and operation acceleration

2025-03-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

Preface

AI workloads depend on the computing power of the CPU and the graphics card. Acceleration currently relies on the AVX2 instruction set, so the first step is to find out which instruction sets the CPU supports. How do you check whether AVX2 is available?

Tool download

https://www.cpuid.com/softwares/cpu-z.html

AMD Ryzen Threadripper 3000 preliminary support

Intel Ice-Lake preliminary support

NVIDIA RTX 2070 and 2080 Super

Download ZIP (Chinese version)

https://www.cpuid.com/downloads/cpu-z/cpu-z_1.90-cn.zip

On Linux, query the supported instruction sets with: cat /proc/cpuinfo
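The same check can be scripted. The following is a minimal sketch, assuming an x86 Linux system where /proc/cpuinfo contains a "flags" line; the helper names cpu_flags and supports are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text):
    """Return the feature flags from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()  # no 'flags' line found (for example, on a non-x86 system)


def supports(feature, cpuinfo_text):
    """True if the given feature name (e.g. 'avx2') appears in the flags."""
    return feature.lower() in cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")
    if path.exists():
        text = path.read_text()
        for feature in ("avx", "avx2", "fma"):
            status = "supported" if supports(feature, text) else "not supported"
            print(f"{feature}: {status}")
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```

On Windows, CPU-Z shows the same information in its "Instructions" field.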

Background knowledge

AVX2 instruction set: an enhanced version of the AVX instruction set

Readers who are not familiar with instruction sets in general or with AVX in particular can refer to "What is AVX? AVX Instruction Set Technology and Application Analysis". AVX extended the SSE4 instruction set with 256-bit vector width, enhanced data rearrangement, 3- and 4-operand instruction forms, unaligned memory access and VEX encoding. The most direct benefit was a doubling of theoretical floating-point performance.

What's new in AVX2

AVX2, which Intel announced in 2011 and first shipped with the Haswell architecture in 2013, adds the following:

-Integer SIMD instructions are extended to 256 bits

-2 new FMA (fused multiply-add, that is, ±(a × b) ± c) units and floating-point FMA instructions

-"gather" instructions for loading non-contiguous data, plus new shift and broadcast instructions
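To make the list above concrete, here is a scalar model of what these operations compute. It is only a sketch of the semantics, not the actual instructions: the function names fma, gather and broadcast are hypothetical, and plain Python floats round after the multiply and again after the add, whereas a hardware FMA rounds once.

```python
def fma(a, b, c, negate_product=False, subtract_c=False):
    """Scalar model of the four FMA variants: +/-(a * b) +/- c.

    Assumption for illustration: this rounds twice (multiply, then add),
    while a real fused multiply-add rounds only once at the end.
    """
    product = -(a * b) if negate_product else a * b
    return product - c if subtract_c else product + c


def gather(base, indices):
    """Load one element per lane from arbitrary (non-contiguous) positions."""
    return [base[i] for i in indices]


def broadcast(value, lanes):
    """Copy a single value into every lane of a vector."""
    return [value] * lanes


if __name__ == "__main__":
    table = [10, 20, 30, 40, 50, 60, 70, 80]
    print(gather(table, [7, 0, 3, 3]))  # elements picked from scattered positions
    print(broadcast(1.5, 8))            # eight single-precision lanes, all 1.5
    print(fma(2.0, 3.0, 1.0))           # (2 * 3) + 1
```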

Theoretical performance: doubling integer and floating-point performance

The most important addition is support for 256-bit integer SIMD. On top of that, 60 new 256-bit floating-point SIMD instructions round out the 256-bit extension that began with AVX. In theory, this raises the speed of both integer and floating-point operations once again.
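The integer side of this claim comes down to how many elements fit in one register. A small sketch of the arithmetic follows; the function name lanes is hypothetical, and the 128-bit baseline assumes integer SIMD was limited to SSE-width registers before AVX2.

```python
def lanes(register_bits, element_bits):
    """Number of elements that fit in one SIMD register."""
    return register_bits // element_bits


if __name__ == "__main__":
    for element_bits in (8, 16, 32, 64):
        before = lanes(128, element_bits)  # 128-bit integer SIMD (assumed baseline)
        after = lanes(256, element_bits)   # 256-bit integer SIMD with AVX2
        print(f"int{element_bits}: {before} -> {after} elements per instruction")
```

Twice as many elements per instruction is where the theoretical doubling comes from; real speedups depend on memory bandwidth and on how well the code vectorizes.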

New FMA units and instructions bring better performance

FMA determines floating-point computing power. The Haswell architecture has two new FMA units (Intel's FMA3 instructions). Each FMA unit operates on 8 single-precision or 4 double-precision floating-point numbers at a time, so the single- and double-precision FLOPs per cycle are twice those of AVX.
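The doubling can be checked with a short calculation. This is a sketch under stated assumptions: each FMA counts as two floating-point operations per lane (one multiply, one add), and the AVX baseline is taken to be one 256-bit multiply plus one 256-bit add per cycle. The function names are hypothetical.

```python
def fma_flops_per_cycle(fma_units, register_bits, element_bits):
    """Peak FLOPs per cycle when every FMA unit issues one FMA per cycle."""
    lanes = register_bits // element_bits
    return fma_units * lanes * 2  # each FMA = one multiply + one add per lane


def avx_flops_per_cycle(register_bits, element_bits):
    """Assumed AVX baseline: one vector multiply and one vector add per cycle."""
    lanes = register_bits // element_bits
    return lanes + lanes


if __name__ == "__main__":
    for name, element_bits in (("single", 32), ("double", 64)):
        avx = avx_flops_per_cycle(256, element_bits)
        avx2 = fma_flops_per_cycle(2, 256, element_bits)
        print(f"{name} precision: AVX {avx} -> AVX2 + FMA {avx2} FLOPs per cycle")
```

These are per-core peak figures; sustained throughput is lower whenever the code cannot keep both FMA units busy.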

FMA has 20 instruction types, each available in 3 operand orderings, for a total of 60 new instructions. This gives great flexibility in choosing which operand comes from memory and which is the destination. In addition, because a fused multiply-add performs the multiplication and the addition as a single operation, it shortens calculations such as polynomial evaluation and reduces latency.
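The count of 60 can be reproduced by enumerating the FMA3 mnemonics. The sketch below assumes the usual breakdown: four operations available in packed and scalar forms, two alternating add/subtract operations available in packed form only, and the three operand orderings 132, 213 and 231. The function name fma3_mnemonics is hypothetical.

```python
# Operations that exist in packed (PS, PD) and scalar (SS, SD) forms.
PACKED_AND_SCALAR = ("VFMADD", "VFMSUB", "VFNMADD", "VFNMSUB")
# Alternating add/subtract operations, packed forms only.
PACKED_ONLY = ("VFMADDSUB", "VFMSUBADD")
# Operand orderings: which operands are multiplied and which one is added.
ORDERS = ("132", "213", "231")


def fma3_mnemonics():
    """Build the list of FMA3 mnemonics, e.g. 'VFMADD231PS'."""
    forms = [(op, t) for op in PACKED_AND_SCALAR for t in ("PS", "PD", "SS", "SD")]
    forms += [(op, t) for op in PACKED_ONLY for t in ("PS", "PD")]
    return [f"{op}{order}{t}" for op, t in forms for order in ORDERS]


if __name__ == "__main__":
    names = fma3_mnemonics()
    print(len(names), "mnemonics, for example:", ", ".join(names[:3]))
```

The ordering digits say which operands are multiplied and which is added, which is what lets the programmer or compiler choose which operand is overwritten and which may come from memory.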
