Apple releases 29-minute video: details of GPU technology for A17 Pro and M3 series chips


Updated 2026-02-13. Source: SLTechnology News&Howtos (Shulou.com), 11/24 report.

CTOnews.com, November 10: Apple recently released a developer video, nearly half an hour long, that covers the GPU technology in its M3 series and A17 Pro chips and explains in basic terms how the improvements work.

The video shows that developers building applications with the Metal API can see performance improvements on M3 and A17 Pro without changing existing application code. These chips use Dynamic Caching, hardware-accelerated ray tracing, and hardware-accelerated mesh shading to greatly improve GPU performance.

Dynamic Caching

Apple introduces a next-generation shader core in M3 and A17 Pro. When applications run work on the GPU cores, these shader cores execute it more efficiently and greatly improve performance.

Traditionally, a GPU allocates register memory for an operation based on the highest register usage at any point during that operation. As a result, if one part of the operation needs far more register memory than the rest, that peak amount stays reserved for the operation's entire lifetime.

Dynamic Caching lets the GPU allocate the right amount of register memory for each operation as it runs. This frees register memory that was previously reserved but unused, and allows more shader tasks to run in parallel.
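The effect on parallelism can be illustrated with a toy occupancy model. The sketch below is not Apple's allocator: the register file size, the per-phase register counts, and the function names are all hypothetical, and it assumes every running task is in the same phase at once.

```python
def static_occupancy(register_file, phase_registers):
    """Tasks that fit when each task reserves its peak need for its whole lifetime."""
    return register_file // max(phase_registers)


def dynamic_occupancy(register_file, phase_registers, active_phase):
    """Tasks that fit when each task holds only what its current phase needs."""
    return register_file // phase_registers[active_phase]


# Hypothetical shader with three phases; only the middle phase is register-hungry.
PHASES = [16, 64, 16]
REGISTER_FILE = 1024  # hypothetical number of registers in the register file

if __name__ == "__main__":
    print(static_occupancy(REGISTER_FILE, PHASES))       # 16
    print(dynamic_occupancy(REGISTER_FILE, PHASES, 0))   # 64
    print(dynamic_occupancy(REGISTER_FILE, PHASES, 1))   # 16
```

In this model, peak-based allocation caps the shader at 16 concurrent tasks even during the cheap phases, while per-phase allocation lets 64 tasks run whenever the register-hungry phase is not active.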

Before flexible on-chip memory, the on-chip memory was divided into fixed allocations for register, threadgroup, and tile memory, alongside the buffer cache. This means that if an operation used a lot of one memory type and little of the others, much of the on-chip memory sat idle.

Apple changed this so that all on-chip memory acts as a cache that can serve any memory type. Operations that rely heavily on threadgroup memory can use the entire span of on-chip memory and can even spill into main memory.
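A small calculation shows why a fixed split wastes capacity. This is a toy model under stated assumptions: the partition sizes, the demand figures (in KiB), and the function names are hypothetical and do not describe the real hardware layout.

```python
def idle_with_fixed_split(partitions, demand):
    """KiB left unused when every memory type has its own fixed partition."""
    return sum(max(size - demand.get(kind, 0), 0) for kind, size in partitions.items())


def shortfall_with_fixed_split(partitions, demand):
    """KiB of demand that does not fit inside its own fixed partition."""
    return sum(max(demand.get(kind, 0) - size, 0) for kind, size in partitions.items())


def shortfall_with_shared_pool(total_size, demand):
    """KiB that must spill to main memory when all types share one pool."""
    return max(sum(demand.values()) - total_size, 0)


# Hypothetical 128 KiB of on-chip memory and a threadgroup-heavy workload.
PARTITIONS = {"register": 64, "threadgroup": 32, "tile": 32}
DEMAND = {"register": 16, "threadgroup": 96, "tile": 0}

if __name__ == "__main__":
    print(idle_with_fixed_split(PARTITIONS, DEMAND))                    # 80
    print(shortfall_with_fixed_split(PARTITIONS, DEMAND))               # 64
    print(shortfall_with_shared_pool(sum(PARTITIONS.values()), DEMAND)) # 0
```

With the fixed split, 80 KiB sits idle while the threadgroup demand is 64 KiB short. With one shared pool, the same 112 KiB of demand fits entirely on chip.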

The shader core dynamically adjusts on-chip memory usage to maximize performance, which reduces the time developers need to spend optimizing their applications.

High-performance ALU pipelines in the shader core

Apple recommends that developers use FP16 math in their programs, but the high-performance ALUs can also execute different combinations of integer, FP32, and FP16 instructions in parallel.

Instructions from different operations that are running at the same time are issued in parallel, so ALU utilization rises as occupancy rises.

If different operations contain FP32 or FP16 instructions that would otherwise execute at different points in time, those instructions can overlap to increase parallelism.
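The overlap can be sketched with a toy issue model. The assumptions here are illustrative only: one pipeline per instruction type ("int", "fp32", "fp16"), one instruction per pipeline per cycle, in-order issue within each stream, and hypothetical function names. It is not a description of the actual scheduler.

```python
def cycles_serial(streams):
    """Cycles needed if only one instruction can issue per cycle overall."""
    return sum(len(stream) for stream in streams)


def cycles_parallel(streams):
    """Cycles needed if each instruction type has its own pipeline.

    Each cycle, every stream may issue its next instruction, provided no
    earlier stream already claimed that pipeline in the same cycle.
    """
    positions = [0] * len(streams)
    cycles = 0
    while any(pos < len(stream) for pos, stream in zip(positions, streams)):
        busy = set()
        for i, stream in enumerate(streams):
            if positions[i] < len(stream) and stream[positions[i]] not in busy:
                busy.add(stream[positions[i]])
                positions[i] += 1
        cycles += 1
    return cycles


# Two hypothetical instruction streams from different concurrent operations.
MIXED = [["fp32", "fp32", "fp16"], ["fp16", "fp16", "fp32"]]
SAME = [["fp32", "fp32"], ["fp32", "fp32"]]

if __name__ == "__main__":
    print(cycles_serial(MIXED), cycles_parallel(MIXED))  # 6 3
    print(cycles_serial(SAME), cycles_parallel(SAME))    # 4 4
```

In this model, streams that mix instruction types finish in half the cycles, while streams that all compete for the same pipeline gain nothing. More concurrent streams give the issue logic more chances to keep every pipeline busy.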

Hardware-accelerated graphics pipeline

Hardware acceleration greatly speeds up ray tracing: the key intersection calculations are moved out of the GPU function and into dedicated hardware. Because the hardware handles part of the calculation, more operations can run in parallel, which speeds up ray tracing as a whole.
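To give a sense of the kind of work an intersection calculation involves, here is a minimal software sketch of a ray-triangle test using the well-known Möller-Trumbore algorithm. It is an illustration only: the source does not say which method Apple's hardware uses, and the function names are hypothetical.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def ray_triangle_distance(origin, direction, v0, v1, v2, eps=1e-9):
    """Return the distance along the ray to the triangle, or None on a miss."""
    edge1 = sub(v1, v0)
    edge2 = sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # first barycentric coordinate
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det  # second barycentric coordinate
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t if t > eps else None  # ignore hits behind the ray origin


if __name__ == "__main__":
    triangle = ((0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0))
    print(ray_triangle_distance((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), *triangle))  # 1.0
    print(ray_triangle_distance((2.0, 2.0, 1.0), (0.0, 0.0, -1.0), *triangle))    # None
```

A ray tracer runs tests like this for many rays against many triangles, which is why moving them out of shader code and into a dedicated unit frees the shader cores for other work.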

Hardware-accelerated mesh shading uses a similar approach. It takes the middle of the geometry pipeline and hands it to a dedicated unit, which allows more operations to run in parallel.
