Thank you for waiting. Your Ali waiter will now serve a new dish: "GPU sharding virtualization".
I believe "slicing" needs no introduction. Sharding here is defined along two dimensions. The first is dividing the GPU into time slices, similar to CPU process scheduling: the compute engine of one physical GPU is shared among several vGPUs, with a scheduling time slice generally around 1-10 ms. The second is dividing GPU resources, which mainly means partitioning GPU video memory. Take NVIDIA as an example: a physical GPU with 16 GB of memory, divided into 16 vGPUs, gives each vGPU 1 GB of video memory. For security isolation, the memory allocated to each vGPU is exclusive and is never shared with other vGPUs.
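To make the memory arithmetic concrete, here is a trivial sketch (the 16 GB total is just the example from the text; real NVIDIA vGPU profiles vary by product):

```python
# Illustrative only: fixed, exclusive partitioning of GPU video memory.
# The 16 GB total matches the example above; real vGPU profiles differ.
TOTAL_VRAM_GB = 16

def vgpu_vram(num_vgpus: int) -> float:
    """Each vGPU receives an equal, exclusive share of video memory."""
    return TOTAL_VRAM_GB / num_vgpus

for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} vGPUs -> {vgpu_vram(n):.1f} GB each")
# 16 vGPUs -> 1.0 GB each, as in the NVIDIA example above.
```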
Technically speaking, GPU sharding virtualization refers to the GPU virtualization scheme built on the VFIO mediated-passthrough framework. The scheme was proposed by NVIDIA and, together with Intel, merged into the Linux 4.10 kernel; its kernel code is known as the mdev module. Red Hat then put considerable effort into backporting it to the 3.10.x kernel, so the latest Red Hat Enterprise Linux and CentOS releases (e.g., CentOS 7.x) already ship with the mdev module. If you use a recent Ubuntu 17.x, not only is mdev built in, but the Intel GPU driver (i915) has been updated to support vGPU, so you can try out vGPU virtual machines without compiling any code.
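A quick way to check whether your kernel already has the mdev framework is to look for the sysfs nodes the mdev core creates. A minimal sketch, with paths as documented in the kernel's vfio-mediated-device.txt (assuming a stock distro kernel):

```python
# Sketch: probe for mdev support via the sysfs nodes the mdev core registers.
import os

def has_mdev() -> bool:
    # Parent devices registered with the mdev core show up here.
    return os.path.isdir("/sys/class/mdev_bus")

def supported_types(parent: str) -> list:
    # Each parent advertises the vGPU types it can create.
    path = f"/sys/class/mdev_bus/{parent}/mdev_supported_types"
    return os.listdir(path) if os.path.isdir(path) else []

if has_mdev():
    for parent in os.listdir("/sys/class/mdev_bus"):
        print(parent, supported_types(parent))
else:
    print("no mdev framework found; check kernel version/config")
```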
So what is mediated passthrough, and how does it differ from plain passthrough? In short: accesses that affect performance are passed straight through to the virtual machine, while performance-independent, functional MMIO accesses are intercepted and emulated in the mdev module. The finer details are covered in the sections below.
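As a toy model of that split (all offsets and ranges invented for illustration): performance-critical ranges go straight to hardware, everything else traps into software emulation.

```python
# Toy model of mediated passthrough. Offsets/ranges are made up:
# accesses in PASSTHROUGH_RANGES go straight to the physical GPU,
# everything else is trapped and emulated in mdev-module state.
PASSTHROUGH_RANGES = [(0x100000, 0x200000)]   # e.g., aperture/graphics memory

emulated_regs = {}                            # software stand-in for real registers

def hardware_write(offset: int, value: int):
    print(f"HW write {hex(offset)} <- {hex(value)}")

def mmio_write(offset: int, value: int):
    if any(lo <= offset < hi for lo, hi in PASSTHROUGH_RANGES):
        hardware_write(offset, value)         # performance path: no emulation
    else:
        emulated_regs[offset] = value         # functional register: emulate

mmio_write(0x100100, 0xBEEF)   # passed through
mmio_write(0x000044, 0x1)      # trapped and emulated
```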
GPU Sharding Virtualization Framework
GPU sharding virtualization has been adopted by two GPU vendors: NVIDIA, with its GRID vGPU series, and Intel, with GVT-g (XenGT on Xen, KVMGT on KVM).
Of course, kernel support alone is not enough. Add QEMU v2.0 or later, plus the GPU mdev driver from Intel or NVIDIA (that is, the emulation of GPU MMIO accesses), and the whole GPU sharding virtualization path is complete. Whether a GPU vendor's mdev driver is open source is up to the vendor: in its usual style, Intel has opened most of its code, including the latest mdev-based GPU live migration, while NVIDIA keeps its driver private.
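For a feel of the user-visible end of that path, here is a sketch of a QEMU invocation that hands an mdev vGPU to a VM (the vfio-pci sysfsdev syntax is the documented way to attach mdev devices on KVM; the UUID, memory size, and disk image are placeholders):

```python
# Sketch: launch a KVM guest with an mdev vGPU attached via vfio-pci.
# UUID and image path are placeholders; flags vary by QEMU version.
import subprocess

vgpu_uuid = "a297db4a-f4c2-11e6-90f6-d3b88d6c9525"  # placeholder

subprocess.run([
    "qemu-system-x86_64",
    "-enable-kvm",
    "-m", "4096",
    "-device", f"vfio-pci,sysfsdev=/sys/bus/mdev/devices/{vgpu_uuid}",
    "-drive", "file=guest.img,format=raw",
], check=True)
```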
The GPU sharding virtualization stack looks like the figure below (taking KVMGT as an example):
(Image source: https://01.org/sites/default/files/documentation/an_introduction_to_intel_GVT-g_for_external.pdf)
As the figure shows, vGPU emulation is done by kvmgt (Intel) or nvidia-vgpu-vfio (NVIDIA). This module only emulates MMIO accesses, i.e., the functional GPU registers that do not affect performance. The GPU aperture and GPU graphics memory, on the other hand, are mapped directly into the VM through VFIO passthrough.
It is worth noting that ordinary passthrough relies on the IOMMU to do GPA-to-HPA address translation, whereas GPU sharding virtualization does not rely on the IOMMU at all. That means the command buffers a vGPU submits (which contain GPA addresses) cannot run directly on the GPU hardware; at least one GPA-to-HPA translation pass is required. This can be done by command scanning on the host side (KVMGT), or, as in NVIDIA GRID vGPU, by giving each context its own internal page table and modifying it accordingly.
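A heavily simplified sketch of the host-side command-scanning idea (purely illustrative; the real GVT-g scanner parses actual GPU command opcodes and uses the guest's graphics translation state):

```python
# Toy command scanner: walk a guest batch buffer and rewrite guest-physical
# addresses (GPA) into host-physical addresses (HPA) before the host submits
# it to the real GPU. Opcodes and the translation table are invented.
gpa_to_hpa = {0x1000: 0x74000, 0x2000: 0x9A000}

def scan_and_patch(batch):
    """batch: list of (opcode, operand) pairs captured from the guest."""
    patched = []
    for opcode, operand in batch:
        if opcode == "SURFACE_ADDR":          # operand is a GPA: translate it
            operand = gpa_to_hpa[operand]
        patched.append((opcode, operand))
    return patched                            # now safe for the physical GPU

print(scan_and_patch([("SURFACE_ADDR", 0x1000), ("DRAW", 4)]))
```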
Since the NVIDIA GRID vGPU code is closed source, we will focus on Intel's GVT-g scheme.
Introduction to Intel GVT-g
When it comes to GVT-g, I could probably talk for three days and three nights, though you may not want to hear all of it. So, briefly:
Kernel and mdev driver source code:
https://github.com/intel/GVT-linux
QEMU:
https://github.com/intel/IGVTg-qemu
Setup documentation:
https://github.com/intel/GVT-linux/wiki/GVTg_Setup_Guide
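Following the setup guide, creating a GVT-g vGPU boils down to writing a fresh UUID into the "create" node of one of the advertised mdev types. A minimal sketch (the PCI address 0000:00:02.0 is the typical integrated GPU; the type name i915-GVTg_V5_4 is only an example and varies by GPU generation, so list mdev_supported_types on your machine first):

```python
# Sketch: create a GVT-g vGPU by writing a UUID to the mdev 'create' node.
# Run as root; the type name below is an example and differs per platform.
import uuid

IGD = "/sys/bus/pci/devices/0000:00:02.0"   # typical integrated GPU address
VGPU_TYPE = "i915-GVTg_V5_4"                # example; check mdev_supported_types

vgpu_uuid = str(uuid.uuid4())
with open(f"{IGD}/mdev_supported_types/{VGPU_TYPE}/create", "w") as f:
    f.write(vgpu_uuid)
print("created vGPU", vgpu_uuid)
```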
We can run the GVT-g virtualization solution on any machine with an Intel integrated GPU (SKL/BDW). The GVT-g GPU virtualization scheme is also used in embedded systems, in-vehicle systems, and other fields (the ACRN hypervisor).
Now for the practical information: for those who want to understand how the GPU operates, Intel has actually opened up most of its software and hardware specifications.
https://01.org/linuxgraphics/documentation/hardware-specification-prms
Here is a screenshot of the document list; for anyone who wants to understand the design and inner workings of the GPU's various parts, just looking at this list is inexplicably exciting.
Because GVT-g is based on Intel integrated graphics, the hardware requirements are very low: any Intel Atom, mobile Core, or Xeon E3 CPU with an integrated GPU can support vGPU virtualization (HSW, BDW, and SKL series CPUs).
At the same time, GVT-g is completely free; users do not need to spend extra money to run vGPU applications.
These advantages allow GVT-g to be widely used in any scenario with both virtualization and local display requirements, such as XenClient and ACRN.
One of GVT-g's strengths is its good support for local (native) display.
GVT-g internally virtualizes a display-pipeline-like component that takes over the monitor connected to the GPU's display port, so a vGPU's framebuffer can be shown quickly on the display attached to the physical GPU, at an impressive 60 FPS, fully matching the effect of a physical display. Even better, by emulating the display ports and EDID, a vGPU can drive multi-screen output inside the virtual machine, with display quality indistinguishable from a physical machine.
The framebuffer's transmission path is admittedly circuitous, but it works out well in practice: a steady 60 FPS.
Here is a video showing two VMs sharing the same physical monitor and switching between them smoothly:
https://01.org/sites/default/files/downloads/iGVT-g/iGVT-g-demokvmgt.zip
Media transcoding capabilities of GVT-g
Hardware support for media encoding/decoding is one of the major features of Intel GPUs, and GVT-g carries that support into vGPU virtualization: the codec throughput of a virtualized vGPU could reach an astonishing 99% of the physical GPU's (HSW GPU, 2014). Riding on that 99%-of-physical transcoding performance, the GVT-g team presented Intel GVT-g's vision for the future of the media cloud at the IDF held in Shenzhen that year, and together with Huawei ran a GVT-g Media Cloud booth at Mobile World Congress in Barcelona in 2015. The envisioned architecture is as follows (the green squares in the picture are the starting point of Media Cloud; the screenshot is from the official GVT-g website):
https://01.org/sites/default/files/documentation/intel_graphics_virtualization_for_media_cloud.pdf
Subsequently, because Intel's GPU hardware and software designers did not fully consider the sharding virtualization scenario in the next GPU generation, GVT-g's advantage in media transcoding was partly destroyed. Today, vGPU encode/decode efficiency on BDW and SKL is unsatisfactory and the advantage is gone.
I can only sigh: "I once dreamed of roaming the world, sword in hand"... and now it has all gone cold.
Rendering capability of GVT-g
The data below is taken directly from the official Intel GVT website (https://01.org/sites/default/files/documentation/an_introduction_to_intel_GVT-g_for_external.pdf):
In short, a vGPU's Graphic rendering capability is more than 80% of the physical GPU's, generally about 90%. Recall from Chapter 3 that the rendering capability of a vGPU under AMD's SRIOV-type GPU virtualization reaches about 97%. At the same time, the raw rendering power of Intel GPUs is far behind contemporary AMD/NVIDIA GPUs, so for vGPU applications that emphasize compute-heavy 3D rendering, Intel GPUs are of relatively limited use.
From a technical point of view, the main source of GVT-g's vGPU performance loss is the emulation of interrupt-related MMIO. By contrast, in AMD's SRIOV scheme, vGPU MMIO accesses inside the VM carry no virtualization overhead and never trap. Even without SRIOV, hardware designs generally take virtualization requirements into account and make changes that favor the virtualization framework: performance-sensitive MMIO, such as these interrupt-related registers, needs special design to reduce losses under virtualization. With a hardware design like Intel's GPU, which did not consider virtualization overhead at all, GVT-g can never reach the level of its potential competitors no matter how much the software is optimized. Why "potential" competitors? Because in NVIDIA's eyes, GVT-g and Intel GPUs are no rivals at all.
GPGPU capabilities of GVT-g
GVT-g's vGPU only makes it possible to run OpenCL; there is no optimization for performance. Compute and deep learning are not the strong points of Intel GPU hardware anyway.
Live Migration of GVT-g
GVT-g's vGPU is as polished as software can make it. As early as the end of 2015 it began to support live migration of vGPUs, announced publicly in 2016. GRID vGPU has only recently revealed support for vGPU live migration, on some Citrix products and only for certain GPU models. As for AMD's SRIOV scheme, there is still no public information about live migration.
The details of vGPU live migration are too technical to cover here, but the effect of the first live migration of a VM with vGPU-accelerated rendering is impressive. The video shows the vGPU migration process on KVM.
As the video shows, the whole migration completes and the display comes up on the new machine in less than one second (the actual total migration time in the demo is about 300 ms).
https://www.youtube.com/watch?v=y2SkU5JODIY
Scheduling of GVT-g
To be honest, GVT-g's scheduling is not as good as AMD SRIOV vGPU's. Although the nominal granularity is a 1 ms time slice for scheduling and vGPU switching, GPU hardware and software support for preemption is incomplete, so a switch often has to wait until the current vGPU task finishes. When rendering large frames, general statistics of vGPU scheduling show that switches actually happen at intervals of roughly 5-10 ms. Recall that AMD's SRIOV switches strictly every 6 ms. NVIDIA GRID vGPU offers a variety of scheduling policies, but since it is closed source there is no further public information about its scheduling; interested readers can do some research by rebuilding the NVIDIA guest Linux driver inside a virtual machine.
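A toy simulation of that non-preemptive behavior (all numbers invented): the scheduler aims for a 1 ms slice but can only switch at task boundaries, so long render tasks stretch the effective slice toward the observed 5-10 ms.

```python
# Toy model: 1 ms target slice, no preemption. A switch happens only when
# the current vGPU task completes, so large frames stretch the real slice.
TARGET_SLICE_MS = 1.0

def simulate(task_lengths_ms):
    elapsed, vgpu = 0.0, 0
    for length in task_lengths_ms:
        elapsed += length                       # task always runs to completion
        if elapsed >= TARGET_SLICE_MS:          # slice expired: switch now
            print(f"vGPU{vgpu} held the GPU {elapsed:4.1f} ms before switching")
            elapsed, vgpu = 0.0, (vgpu + 1) % 2

simulate([0.3, 0.4, 6.0, 0.2, 8.5])  # big frames delay the switch
```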
Limitations of GVT-g
Sadly, because GVT-g is based on Intel integrated graphics, it is rarely adopted in data centers even with its free price tag. First, its GPU performance cannot compete with comparable AMD/NVIDIA products; second, data centers pursue high density, and an integrated GPU can never give you multiple cards in one machine. In a server rack where every inch of space counts, everyone weighs the cost. Intel did try putting several Xeon E3 CPUs on one PCIe board to increase compute density, but its power consumption and price could not compete, and the design also made the software stack complex and hard to maintain.
Introduction to NVIDIA GRID vGPU
As a closed-source system, there is not much to introduce. It is worth mentioning that NVIDIA GRID vGPU is a full commercial solution whose technology has been proven on major platforms such as VMware, XenServer, and Red Hat.
The Application of GRID vGPU in VDI
GRID vGPU was applied to VDI earlier than GVT-g and AMD's SRIOV. Early on, GRID worked with VMware to build up a series of remote display solutions. GRID vGPU is good at remote display; GVT-g is good at local display. Each has its own strengths.
GRID vGPU rendering capability
Compared with GVT-g, GRID vGPU's virtualization loss in graphics rendering is very small, almost the same as AMD SRIOV's: it reaches about 99% of the passthrough state, versus about 90% for GVT-g.
General computing power of GRID vGPU
Although few people run deep learning inside a sharding-virtualized GPU VM, GRID vGPU's compute performance reaches more than 80% of its passthrough state. With the current 1:1 sharding (one physical GPU divided into just one vGPU), the performance figures are almost on par with the passthrough GPU scheme, which it could completely replace. However, GRID vGPU does not currently support multi-vGPU configurations, which is its biggest disadvantage compared with GPU passthrough. Obviously NVIDIA will not sit idly by: GRID vGPU will surely address multi-vGPU scenarios and support P2P in the future. Another advantage of GRID vGPU over passthrough GPU is that key vGPU performance indicators can be monitored on the host side. Remember the inherent shortcoming of the GPU passthrough approach mentioned in Chapter 2 of this series: under passthrough, the host cannot effectively monitor the GPU. That is not a problem in the GRID vGPU scenario.
The GRID vGPU sharding virtualization solution is harder to deploy than GPU passthrough. Because it is closed source, it is not like Intel GVT-g, where everything is open source and merged into the kernel code base; vendors who want to use the technology must do varying degrees of kernel adaptation and debugging, with a release cycle of at least half a year.
Similarities and differences in the implementation details of the various GPU virtualization schemes
Mediated passthrough (mdev)
Let's review once more what mediated passthrough is. Start with the kernel documentation:
https://github.com/torvalds/linux/blob/master/Documentation/vfio-mediated-device.txt
As mentioned earlier, "mediated" refers to intercepting and emulating MMIO accesses, plus the GFN-to-PFN address translation needed for submitted DMA transfers.
NVIDIA described these details in great depth at the 2016 KVM Forum:
http://www.linux-kvm.org/images/5/59/02x03-Neo_Jia_and_Kirti_Wankhede-vGPU_on_KVM-A_VFIO_based_Framework.pdf
How GPU commands are submitted
The three GPU virtualization schemes differ fundamentally in how GPU commands (batch buffers) are submitted. GVT-g and GRID vGPU, the representatives of sharding virtualization, intercept every vGPU command submission on the host for emulation; after processing, the host, on behalf of the vGPU, submits the commands to the physical GPU. AMD's SRIOV scheme is in essence also a kind of GPU sharding virtualization; the difference lies in whether it is implemented through the SRIOV standard or through mdev software, and for SRIOV the vGPU emulation is completed by the GPU hardware and firmware together with the host-side GIM driver.
In GPU passthrough mode, commands are submitted directly by the virtual machine, with no detour through the host. That is also why, under passthrough, the host cannot monitor what the GPU is doing inside the virtual machine.
To put it simply: for command submission, GVT-g and GRID vGPU form one category, SRIOV and GPU passthrough the other.
IOMMU
Sharding virtualization does not require IOMMU hardware support. It only needs the VFIO module, with its type1 IOMMU driver, to tell the host the GFN, VA, and related information of buffers that will be used for DMA; the GFN-to-PFN translation is then completed in the host-side mdev device driver layer. You can think of it as an IOMMU implemented in software.
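A minimal sketch of that bookkeeping (illustrative only; the real path is the VFIO type1 DMA-map interface plus page pinning inside the vendor mdev driver):

```python
# Toy model of the software "IOMMU": userspace registers GFN -> host-VA
# mappings up front; the host mdev driver later resolves GFN -> PFN when
# the guest submits DMA. Real code uses VFIO type1 ioctls and pins pages.
dma_map = {}                                  # GFN -> (host VA, size)

def map_dma(gfn: int, host_va: int, size: int):
    dma_map[gfn] = (host_va, size)            # recorded at map time

def pfn_for_dma(gfn: int) -> int:
    host_va, _ = dma_map[gfn]                 # look up the registered mapping
    return host_va >> 12                      # stand-in for pinning + VA->PFN

map_dma(gfn=0x5, host_va=0x7f30_0000_0000, size=4096)
print(hex(pfn_for_dma(0x5)))
```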
In AMD SRIOV and GPU passthrough modes, the IOMMU is a required component; specifically, the IOMMU hardware completes the GFN-to-PFN address translation.
In short: GVT-g and GRID vGPU are one gang; SRIOV and GPU passthrough are the other.
With that, this series on GPU virtualization technology comes to an end. Thank you for reading, and look forward to a future "In-depth GPU Virtualization Technology" series... Ha.
Author: Zheng Xiao, Long Xin, Elastic Computing Heterogeneous Computing Project Group
Previous article in this series: http://cloud.it168.com/a2018/0520/3204/000003204070.shtml
For more information, please follow Zheng Xiao's personal blog.