2025-04-05 Update From: SLTechnology News&Howtos shulou
I. Brief introduction of QEMU and its relationship with KVM and other virtualization software
QEMU, short for "Quick Emulator", is open source virtualization software written in C. This article describes my understanding of QEMU's technical architecture and the lessons that can be drawn from it. Developer documentation for the QEMU source code is scarce, and documents describing its internal structure and working mechanisms are scarcer still. A developer who wants to work on QEMU usually has to start from the source code, which makes understanding QEMU a daunting task.
QEMU has several virtualization modes. First, it can use the Kernel-based Virtual Machine (KVM) to perform hardware virtualization on x86 processors, running computing tasks at close to native hardware speed. Second, it can emulate other processors by translating machine code at run time, so that virtual machines can run operating systems built for different platforms. Finally, it can use the same run-time translation to run individual programs built for other architectures, in a way loosely comparable to running Wine on Linux. Because QEMU does not ship a graphical management interface, and because the core capabilities it provides are critical, it is often used as one part of a more complex virtualization manager. For example, open source virtualization products such as VirtualBox and Xen integrate QEMU code in their underlying virtualization layer, and the mainstream KVM-based virtualization managers also integrate and use QEMU.
From the perspective of KVM: KVM (Kernel-based Virtual Machine) is a Linux kernel driver module that turns a Linux host into a hypervisor (virtual machine monitor). On x86 processors that support VMX (Virtual Machine Extensions), Linux adds a guest mode alongside the original user mode and kernel mode. Guest mode has its own kernel mode and user mode, and the virtual machine runs in guest mode. The KVM module is responsible for enabling and initializing VMX and for providing the interfaces that support running virtual machines. By calling functions of the Linux kernel itself, KVM implements the low-level virtualization of CPU and memory and turns the Linux kernel into a virtualization layer. KVM was merged into the Linux 2.6.20 kernel in February 2007. It takes the form of two kernel modules: kvm.ko and kvm_intel.ko (or kvm_amd.ko). In essence, KVM is a driver that manages virtualized hardware. The driver uses the character device /dev/kvm (created by KVM itself) as its management interface, which is mainly responsible for creating vCPUs, allocating virtual memory, reading and writing vCPU registers, and running vCPUs.
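To make the /dev/kvm management interface concrete, here is a minimal sketch of how a user-space program can talk to it. It only asks the driver for its API version. The helper name kvm_api_version is my own; the request number is the value of KVM_GET_API_VERSION from the Linux header linux/kvm.h. The sketch assumes a Linux host and simply reports None when KVM is missing or not accessible.

```python
import fcntl
import os

# Request number from <linux/kvm.h>: _IO(KVMIO, 0x00) with KVMIO = 0xAE.
KVM_GET_API_VERSION = 0xAE00


def kvm_api_version(path="/dev/kvm"):
    """Return the KVM API version, or None if KVM is not usable here.

    kvm_api_version is a hypothetical helper name used for illustration.
    """
    try:
        fd = os.open(path, os.O_RDWR | os.O_CLOEXEC)
    except OSError:
        return None  # module not loaded, device missing, or no permission
    try:
        # With an integer argument, fcntl.ioctl returns the ioctl's result.
        return fcntl.ioctl(fd, KVM_GET_API_VERSION, 0)
    except OSError:
        return None
    finally:
        os.close(fd)


if __name__ == "__main__":
    version = kvm_api_version()
    if version is None:
        print("KVM is not available on this host")
    else:
        print(f"KVM API version: {version}")
```

Creating a virtual machine, creating vCPUs and running them go through further ioctl calls on the same device and on the file descriptors it returns; those steps are left out here because they need guest memory and register setup.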
From the perspective of QEMU: QEMU (Quick Emulator) does not itself contain or depend on the KVM module. It is free machine-emulation software originally written by Fabrice Bellard. A QEMU virtual machine is a pure software implementation that can run on its own without the KVM module, although its performance is then relatively low. QEMU provides a complete virtual machine implementation, including processor virtualization, memory virtualization, and virtualization of I/O devices. When KVM acceleration is not used, QEMU translates the binary code of a particular processor through a dynamic "recompiler", which is what makes it portable across platforms. QEMU has two working modes: system mode, which emulates an entire computer system, and user mode, which runs programs built for a platform other than the current hardware (for example, running an ARM program on an x86 host). At the time of writing, the latest version was 4.x. While a virtual machine is running, QEMU configures the kernel through the system call interface provided by the KVM module, and the KVM module runs the virtual machine in the processor's VMX mode. In this way QEMU uses the virtualization capabilities of the KVM module to give its virtual machines hardware acceleration and better performance.
The now-popular KVM virtualization platform is built by modifying QEMU: the code that emulates the CPU and memory is replaced by KVM, while devices such as the network card and display are still emulated by QEMU, so QEMU+KVM forms a complete virtualization platform. KVM runs in kernel space and QEMU runs in user space, where QEMU does the actual emulation and management of the various virtual hardware devices (disks, network cards, graphics cards, and so on). From the perspective of KVM, users cannot interact with the kernel module directly and need a user-space management tool, which is the role QEMU plays. KVM and QEMU complement each other: QEMU reaches hardware-virtualization speed through KVM, while KVM relies on QEMU to emulate devices and to drive the kernel module from user space, although this role is not limited to QEMU. In addition, because QEMU is inefficient at emulating I/O devices, paravirtualized virtio is often used for I/O devices instead.
In summary, once the relationship between QEMU and KVM is clear, it is easier to understand how virtualization products such as VirtualBox and Xen integrate and use QEMU.
II. Structure and composition of QEMU
The architecture of QEMU is shown in the following figure.
Figure: QEMU architecture diagram
As shown in the figure, QEMU consists of the following parts:
- The hypervisor, which controls the emulation.
- The Tiny Code Generator (TCG), which translates guest machine code into host machine code.
- The software memory management unit (MMU), which handles memory access.
- The disk subsystem, which handles the different disk image formats.
- The device subsystem, which handles network cards and other hardware devices.
These components are described below.
2.1 Hypervisor
A hypervisor is a virtual machine monitor that creates and runs virtual machines. The hypervisor in QEMU loads binary machine code from the disk image, converts it to native machine code using TCG, connects virtual or real devices, starts the software MMU, and then begins to emulate the operating system in the disk image. Of these, TCG and the software MMU are the key to virtualizing the CPU and memory.
When KVM is integrated, QEMU uses the KVM facility of the Linux kernel to execute the virtual machine natively. KVM is essentially the hypervisor inside the Linux kernel, and it can run multiple operating systems in parallel. QEMU starts a thread that enters KVM to execute the guest operating system, and KVM then controls execution. In this arrangement, KVM's hypervisor takes over the role of QEMU's own hypervisor.
2.2 Tiny Code Generator (TCG)
In QEMU, the Tiny Code Generator (TCG) converts machine code written for the source (guest) processor into blocks of machine code that the processor actually running the virtual machine can execute (for example, x86 machine code blocks). At the hardware level, a processor cannot run machine code compiled for the instruction set architecture (ISA) of a different processor, such as ARM machine code on an x86 processor. Introducing an intermediate step that translates between processor ISAs is therefore the technical approach that makes virtualization general-purpose. In TCG, the translated code blocks are placed in a translation cache, and blocks are linked to one another by jump instructions. When the hypervisor executes code, the jumps stored in the translation cache lead directly to the next code block, so execution can move from one translated block to another until a block that has not yet been translated is needed. At that point execution pauses and control returns to the hypervisor, which uses TCG to translate the required source-processor instructions and stores the result in the translation cache.
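The translate-on-miss loop described above can be sketched with a toy model. Everything here is hypothetical: the three-instruction guest "ISA", the ToyTranslator class and the use of generated Python functions as stand-ins for host machine code. It is not how TCG is implemented, and direct block-to-block chaining is left out; it only shows that each basic block is translated once, cached by its start address, and then reused.

```python
class ToyTranslator:
    """Toy dynamic translator: guest blocks become cached Python functions."""

    def __init__(self, program):
        self.program = program   # list of guest instructions (tuples)
        self.cache = {}          # guest pc -> translated block
        self.translations = 0    # how many blocks were translated

    def translate(self, pc):
        """Translate the basic block starting at pc into a Python function."""
        lines = ["def block(regs):"]
        while True:
            op, *args = self.program[pc]
            pc += 1
            if op == "addi":     # addi reg, imm
                lines.append(f"    regs[{args[0]!r}] += {args[1]}")
            elif op == "jmp":    # jmp target: ends the block
                lines.append(f"    return {args[0]}")
                break
            elif op == "bnez":   # bnez reg, target: ends the block
                lines.append(
                    f"    return {args[1]} if regs[{args[0]!r}] != 0 else {pc}")
                break
            elif op == "halt":   # halt: ends the block and the program
                lines.append("    return None")
                break
            else:
                raise ValueError(f"unknown guest instruction {op!r}")
        namespace = {}
        exec("\n".join(lines), namespace)   # "emit" the host code
        self.translations += 1
        return namespace["block"]

    def run(self, regs, pc=0):
        """Execute from pc, translating blocks only on a cache miss."""
        while pc is not None:
            block = self.cache.get(pc)
            if block is None:
                block = self.translate(pc)
                self.cache[pc] = block
            pc = block(regs)     # each block returns the next guest pc
        return regs


PROGRAM = [
    ("addi", "r0", 5),    # 0: r0 = 5 (loop counter)
    ("jmp", 2),           # 1
    ("addi", "r1", 2),    # 2: loop body
    ("addi", "r0", -1),   # 3
    ("bnez", "r0", 2),    # 4: repeat while r0 != 0
    ("halt",),            # 5
]

translator = ToyTranslator(PROGRAM)
result = translator.run({"r0": 0, "r1": 0})
```

In this run the loop body executes five times but is translated only once, which is the point of keeping translated blocks in a cache.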
The following figure shows how TCG works for QEMU:
Figure: How the Tiny Code Generator works
One drawback of TCG concerns self-modifying code. Translated blocks are cached, so when guest code rewrites a page that has already been translated, the cached translation is stale, and the page has to be tracked as modified and retranslated before it runs again. This lowers QEMU's binary-translation efficiency, although it can also add a measure of safety. Self-modifying code is easy to exploit in software, especially through buffer overflows and other memory-corruption vulnerabilities, where code supplied by an attacker (such as a backdoor) overwrites the code of a vulnerable application. If the overwritten code has already been run (and therefore cached), the injected code may not execute as the attacker intends; more often, TCG fails to translate or run it, and the program raises an exception or crashes.
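The bookkeeping needed to keep cached translations consistent with modified code can be sketched as follows. This is a toy illustration of the general technique of tracking which pages each translated block covers and discarding those blocks when a page is written. The class name, method names and the tiny PAGE_SIZE are hypothetical and are not taken from the QEMU source.

```python
PAGE_SIZE = 16  # hypothetical, deliberately tiny page size for the example


class PageTrackedCache:
    """Translation cache that remembers which pages each block covers."""

    def __init__(self):
        self.blocks = {}   # block start address -> translated block
        self.pages = {}    # page number -> set of block start addresses

    def insert(self, start, length, block):
        """Cache a translated block covering [start, start + length)."""
        self.blocks[start] = block
        first = start // PAGE_SIZE
        last = (start + length - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.pages.setdefault(page, set()).add(start)

    def lookup(self, start):
        """Return the cached block for this start address, or None."""
        return self.blocks.get(start)

    def notify_write(self, address):
        """Drop every cached block on the written page; return the count."""
        stale = self.pages.pop(address // PAGE_SIZE, set())
        removed = 0
        for start in stale:
            if self.blocks.pop(start, None) is not None:
                removed += 1
        return removed


cache = PageTrackedCache()
cache.insert(0, 4, "block A")    # lies entirely in page 0
cache.insert(14, 4, "block B")   # spans pages 0 and 1
cache.insert(32, 4, "block C")   # lies entirely in page 2
dropped = cache.notify_write(17)  # guest writes into page 1
```

After the write to page 1 only block B is discarded, so the next attempt to run it misses the cache and forces a fresh translation of the modified code.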
In addition, if a new processor uses more registers than the x86 processor, or has many complex instructions to translate, programming TCG to support the new CPU may take a lot of work. Most of the processors QEMU currently supports share part of their instruction sets. For example, an instruction equivalent to "MOV" exists on almost all processors and can be translated almost one-to-one unless the CPU register widths differ. Emulating a 64-bit processor on a 32-bit processor, for instance, may require many additional instructions, which also means more programming effort in the TCG translator.
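A small worked example shows why a register-width mismatch costs extra instructions. A single 64-bit addition, when only 32-bit operations are available, becomes two additions plus carry handling. The function names below are my own, and Python integers are masked to 32 bits to imitate 32-bit registers.

```python
MASK32 = 0xFFFFFFFF


def split64(value):
    """Split a 64-bit value into (low, high) 32-bit halves."""
    return value & MASK32, (value >> 32) & MASK32


def join64(lo, hi):
    """Combine (low, high) 32-bit halves into one 64-bit value."""
    return (hi << 32) | lo


def add64_with_32bit_ops(a_lo, a_hi, b_lo, b_hi):
    """Add two 64-bit numbers using only 32-bit wide operations."""
    lo = (a_lo + b_lo) & MASK32           # 32-bit add of the low halves
    carry = 1 if lo < a_lo else 0         # wrap-around means a carry out
    hi = (a_hi + b_hi + carry) & MASK32   # 32-bit add of the high halves
    return lo, hi


a = 0x00000001FFFFFFFF
b = 0x0000000000000001
total = join64(*add64_with_32bit_ops(*split64(a), *split64(b)))
```

Here total equals a + b, but what would be one instruction on a 64-bit host has turned into an add, a compare and an add with carry, and the same expansion applies to most other 64-bit operations.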
The QEMU source tree has a subdirectory called 'tcg', which contains the code that converts machine instructions into the corresponding x86 machine instructions. This code is a simple translation state machine written in C. Memory accesses and jumps get special translations, because they can generate calls into the software memory management unit. CPU and memory virtualization tend to go together because, in essence, the job of the CPU is to move data between regions of memory. Memory accesses must reach areas of memory outside the code blocks protected by QEMU, and jumps and branches in the machine code must also reach the correct memory addresses.
With binary translation, CPU emulation and virtualization become conceptually simple. TCG and the hypervisor together implement CPU emulation; the process is shown in the following figure:
As the figure shows, CPU emulation and virtualization consist of translating the instruction set (ISA) of the source processor into the instruction set (ISA) of the target processor. Because CPU emulation is achieved through this intermediate translation step, it counts as the first fully implemented CPU virtualization technique. Binary translation is the earliest CPU virtualization technology; it gave rise to virtualization giants such as VMware as well as open source virtualization such as QEMU.
2.3 Hardware devices
The hardware a virtual machine needs can be provided either by attaching a real physical device of the host directly or by emulating the hardware device in QEMU. Most of the hardware-related QEMU code is located in the "hw" directory.
QEMU can use hardware devices in two ways: pass-through mode, which uses a real physical device of the host, and emulated virtual devices implemented by QEMU's device models. When a physical device is passed through, the virtual machine takes over the host's right to use that device, and other virtual machines cannot use it. In pass-through mode, the virtual machine can access the USB bus or PCI bus directly and communicate with the device directly. In general, pass-through is used for physical devices that are difficult to emulate in QEMU, such as webcams and serial and parallel ports. Devices that most virtual machines need and that cannot simply be handed over from the host, such as network devices, are mostly emulated by QEMU. A virtual machine's network device, for example, can be provided by emulating a network card, at the cost of adding extra layers to the network stack. In addition, QEMU can connect to the "virtio" paravirtualized drivers in the Linux kernel, in which case the Linux kernel handles input/output between the virtual machine and the hardware device and QEMU's emulated devices no longer sit in the data path as an intermediary.
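What "emulating a device" means can be sketched in a few lines: guest reads and writes to a range of addresses are routed to an object that behaves like the hardware. The ToyUart and ToyBus classes, the register layout and the addresses are all hypothetical and much simpler than the device models under "hw"; the sketch only shows the dispatch idea.

```python
class ToyUart:
    """Hypothetical serial port with a data register and a status register."""

    DATA = 0x0     # writing a byte here "transmits" it
    STATUS = 0x4   # bit 0 set means "ready to transmit"

    def __init__(self):
        self.output = bytearray()   # everything the guest has sent

    def read(self, offset):
        # This toy device is always ready; other registers read as zero.
        return 0x1 if offset == self.STATUS else 0

    def write(self, offset, value):
        if offset == self.DATA:
            self.output.append(value & 0xFF)


class ToyBus:
    """Routes guest memory-mapped I/O accesses to the device that owns them."""

    def __init__(self):
        self.regions = []   # list of (base, size, device)

    def map(self, base, size, device):
        self.regions.append((base, size, device))

    def _find(self, address):
        for base, size, device in self.regions:
            if base <= address < base + size:
                return device, address - base
        raise LookupError(f"no device mapped at {address:#x}")

    def read(self, address):
        device, offset = self._find(address)
        return device.read(offset)

    def write(self, address, value):
        device, offset = self._find(address)
        device.write(offset, value)


bus = ToyBus()
uart = ToyUart()
bus.map(0x1000, 8, uart)
for byte in b"hi":
    bus.write(0x1000, byte)   # the guest "prints" by writing the data register
```

Every guest access goes through this extra software layer, which is the overhead that pass-through and virtio are meant to reduce.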
2.4 Disk images
QEMU can handle several disk image formats. The preferred formats are raw and qcow2. Raw is a very simple format that stores the contents of the disk in a file byte for byte, and it is supported by most other emulators. qcow2 is QEMU's own image format and is useful for keeping images small. It also supports disk image compression and snapshots of disk image state. Two other formats are also supported: vdi, used by VirtualBox, and vmdk, used by VMware.
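The difference between a raw image and a structured format shows up in the first bytes of the file. The sketch below is a hypothetical helper, not QEMU's own probing code. It recognizes the qcow/qcow2 magic bytes and version field and the magic of a sparse VMDK extent, and treats everything else as raw, since a raw image has no header. VDI detection is left out.

```python
import struct

QCOW_MAGIC = b"QFI\xfb"   # first four bytes of qcow and qcow2 images
VMDK_MAGIC = b"KDMV"      # first four bytes of a sparse VMDK extent


def detect_image_format(header: bytes) -> str:
    """Guess the image format from the first bytes of an image file."""
    if len(header) >= 8 and header[:4] == QCOW_MAGIC:
        # The version is a big-endian 32-bit integer right after the magic.
        (version,) = struct.unpack(">I", header[4:8])
        if version == 1:
            return "qcow"
        if version in (2, 3):
            return "qcow2"
        # Unrecognized version: fall through and treat the file as raw.
    if header[:4] == VMDK_MAGIC:
        return "vmdk"
    return "raw"   # raw images carry no header at all


qcow2_header = QCOW_MAGIC + struct.pack(">I", 3) + bytes(8)
fmt = detect_image_format(qcow2_header)
```

To probe a real file, pass the result of reading its first few bytes, for example open(path, "rb").read(16).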
Disk images in QEMU are supported by its storage I/O stack, which is shown in the following figure:
Figure: QEMU storage protocol stack
Within QEMU's storage stack, the applications and the kernel inside the virtual machine work as they would on bare metal. The virtual machine interacts with QEMU through the emulated hardware and hands the control flow and data flow of each I/O operation to QEMU, which performs the I/O on the disk image file on behalf of the virtual machine. From the host kernel's point of view, virtual machine I/O is simply an I/O request from a user-space application and is executed as usual.
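The last step of that path, where a guest block request becomes ordinary file I/O on the host, can be sketched for a raw image. The RawImage class and the create_blank_image helper are hypothetical names, and the sketch assumes 512-byte sectors stored back to back, which is the layout of the raw format described above.

```python
import os
import tempfile

SECTOR_SIZE = 512


def create_blank_image(num_sectors):
    """Create a zero-filled raw image in a temporary file; return its path."""
    fd, path = tempfile.mkstemp(suffix=".img")
    os.close(fd)
    with open(path, "wb") as image:
        image.truncate(num_sectors * SECTOR_SIZE)
    return path


class RawImage:
    """Toy raw-image backend: sector N lives at byte offset N * 512."""

    def __init__(self, path):
        self.file = open(path, "r+b")

    def read_sectors(self, lba, count):
        """Serve a guest read request with a plain host file read."""
        self.file.seek(lba * SECTOR_SIZE)
        data = self.file.read(count * SECTOR_SIZE)
        # Reads past the end of the file come back zero-padded.
        return data.ljust(count * SECTOR_SIZE, b"\0")

    def write_sectors(self, lba, data):
        """Serve a guest write request with a plain host file write."""
        if len(data) % SECTOR_SIZE:
            raise ValueError("data must be a whole number of sectors")
        self.file.seek(lba * SECTOR_SIZE)
        self.file.write(data)
        self.file.flush()

    def close(self):
        self.file.close()


image_path = create_blank_image(4)
image = RawImage(image_path)
image.write_sectors(2, b"\x5a" * SECTOR_SIZE)   # guest writes sector 2
readback = image.read_sectors(2, 1)             # guest reads it back
```

From the host kernel's side these are just seek, read and write calls from a user-space process, which matches the description above. Formats such as qcow2 add a mapping layer between the guest sector number and the file offset.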
2.5 Software MMU
The memory management unit (MMU) in a conventional processor handles access to the computer's memory. When the processor wants to access a memory address, the MMU resolves that address so that its contents can be fetched. The contents can come from a fast cache on the processor chip, from random access memory (RAM), or from disk. The MMU can even take part in control decisions about whether certain memory locations are cached.
QEMU has a software-based MMU that works much like a hardware MMU. It uses an address translation cache that holds guest addresses, host addresses, and offset values to speed up translation. It also allows code blocks to be linked intelligently for faster execution, avoiding memory faults that would force memory blocks to be reloaded and retranslated.
When looking for vulnerabilities in virtual machines running under QEMU, whether the software MMU translates and places blocks correctly is a natural focus for testing and fuzzing.
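The address translation cache can be illustrated with a toy software MMU. The SoftMMU class, its single-level page table and the GuestPageFault exception are all hypothetical simplifications and do not mirror QEMU's data structures. The sketch shows only the idea that a translation is looked up in a small cache first and the slower page-table lookup happens on a miss.

```python
PAGE_BITS = 12
PAGE_SIZE = 1 << PAGE_BITS


class GuestPageFault(Exception):
    """Raised when the guest touches an address that has no mapping."""


class SoftMMU:
    """Toy software MMU with a translation cache in front of a page table."""

    def __init__(self, page_table, ram):
        self.page_table = page_table   # guest page number -> offset into ram
        self.ram = ram                 # bytearray standing in for host memory
        self.tlb = {}                  # cached translations
        self.hits = 0
        self.misses = 0

    def translate(self, guest_addr):
        """Turn a guest address into an index into host memory."""
        page = guest_addr >> PAGE_BITS
        offset = guest_addr & (PAGE_SIZE - 1)
        base = self.tlb.get(page)
        if base is None:
            self.misses += 1           # slow path: consult the page table
            if page not in self.page_table:
                raise GuestPageFault(f"no mapping for {guest_addr:#x}")
            base = self.page_table[page]
            self.tlb[page] = base      # remember it for the next access
        else:
            self.hits += 1             # fast path: translation was cached
        return base + offset

    def load8(self, guest_addr):
        return self.ram[self.translate(guest_addr)]

    def store8(self, guest_addr, value):
        self.ram[self.translate(guest_addr)] = value & 0xFF


ram = bytearray(2 * PAGE_SIZE)
mmu = SoftMMU({0x10: PAGE_SIZE, 0x11: 0}, ram)
mmu.store8(0x10004, 0xAB)      # first access to guest page 0x10: a miss
value = mmu.load8(0x10004)     # second access to the same page: a hit
```

A bug in this kind of lookup, such as returning the wrong host offset, would let a guest read or write memory it should not reach, which is why this layer deserves careful testing.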
III. Summary
To work out the technical architecture and implementation details of QEMU, we need to understand its structure and composition as well as the role and operating mechanism of each component. We also need to understand the interactions between components: in terms of flow, these are mainly control flow and data flow; in terms of I/O, mainly network I/O and storage I/O; and in terms of implementation mechanism, mainly the virtualization of CPU and memory and the storage and network protocol stacks. This article leaves many questions open, to be filled in by follow-up articles.