This article explains how to partition x86 server virtualization resources and optimize their performance. The approach is practical, so it is shared here in the hope that readers will take something useful away from it.
Overview: Virtualization is a broad term that generally refers to computing elements running on a virtual rather than a physical basis; it is a solution for simplifying management and optimizing resources. Server virtualization is a technology for consolidating x86-based servers to improve resource efficiency and performance. From the perspective of enterprise business systems and their management, this article analyzes the characteristics of virtual NICs and SR-IOV, NUMA, and virtual disk formats on the x86 architecture, and explores resource partitioning and performance optimization schemes for different application scenarios, with the goal of improving the performance and resource utilization of x86 servers through practical, well-chosen configurations across multiple application systems.
1 Two common x86 virtualization architectures
For x86 virtualization there are two common architectures: hosted and bare metal. In a hosted architecture, the virtualization layer runs on top of an operating system as an application and therefore enjoys broad hardware support. A bare-metal architecture instead runs the virtualization layer directly on the x86 hardware, giving it direct access to hardware resources without going through an operating system, so it is more efficient. VMware Workstation and VMware Server are implemented on the hosted architecture, while VMware ESX Server was the industry's first bare-metal virtualization product and is now in its fifth generation. ESX Server must run on VMware-certified hardware platforms and provides excellent performance, fully meeting the requirements of large data centers. This article focuses on resource partitioning and performance optimization of servers based on the x86 bare-metal architecture.
2 Three levels of x86 virtualization resource partitioning
Server resources are divided into three levels: network, compute and storage. Each virtual machine performs its computing tasks on the network it is connected to and stores the resulting data for business use.
2.1 Network level
At the network level, an x86 physical machine uses physical NICs and connects to a physical switch. Once an x86 host is divided into multiple VMs, virtual NICs and virtual switches come into play, which creates traffic and interaction between the virtual and physical networks, as shown in Figure 1.
VMs on the same physical machine may sit on the same network segment or on different segments. Depending on whether the network traffic between VMs passes through a physical NIC, four different situations can be distinguished:
In the first case, the VMs of a business system are on the same network segment of the same host. Network traffic between the virtual machines does not pass through the host's physical NIC, and the maximum throughput measured was 7.6 Gbps. (Test method: on testvm1, run jperf in server mode as the receiver; on testvm2, run jperf in client mode to connect to the server, send packets and sustain the network load. The x86 host has dual 10-gigabit NICs.)
In the second case, the VMs of a business system are on different network segments of the same host. Traffic between the virtual machines passes through the host's physical NIC, and the maximum throughput measured was 5.6 Gbps. The test method is the same as above.
In the third case, the VMs of a business system are on the same network segment of different hosts. Traffic between the virtual machines passes through the hosts' physical NICs, and the maximum throughput measured was 6.5 Gbps. The test method is the same as above.
In the fourth case, the VMs of a business system are on different network segments of different hosts. Traffic between the virtual machines passes through the hosts' physical NICs, and the maximum throughput measured was 4.6 Gbps. The test method is the same as above.
A comparison table of several scenarios tested is shown in Table 1.
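The throughput figures above come from a jperf test. As a rough, hedged illustration of that methodology, the following sketch drives the equivalent command-line tool iperf3 from Python; the hostname testvm1, the duration and the stream count are assumptions, not values from the original test.

```python
# A minimal sketch of the throughput test described above, assuming iperf3 is
# installed on both VMs and that "testvm1" resolves to the receiving VM.
# The article used jperf (a GUI front-end for iperf); the command-line tool is
# equivalent for this purpose.
import subprocess

SERVER_VM = "testvm1"   # hypothetical hostname of the receiving VM
DURATION = 60           # seconds of sustained load
STREAMS = 4             # parallel TCP streams to saturate the link

# On testvm1 (receiver):  iperf3 -s
# On testvm2 (sender), run something like the following:
result = subprocess.run(
    ["iperf3", "-c", SERVER_VM, "-t", str(DURATION), "-P", str(STREAMS)],
    capture_output=True, text=True, check=True,
)
print(result.stdout)   # the summary line reports the aggregate Gbits/sec figure
```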
SR-IOV is a hardware-based virtualization solution proposed by Intel to improve performance and scalability. The SR-IOV standard allows a PCIe (Peripheral Component Interconnect Express) device to be shared efficiently among virtual machines, and because it is implemented in hardware it can achieve network I/O performance comparable to native. For example, if the 10-gigabit NIC on an x86 physical server is divided through SR-IOV into four virtual NICs used by four VMs, its network transfer performance is far higher than that of an ordinary virtualized NIC.
Test method: on one x86 physical server, four VMs run jperf in server mode as the receivers; on another x86 physical server, four VMs run jperf in client mode to connect to the servers, send packets and sustain the network load. Both x86 hosts have dual 10-gigabit Ethernet cards.
The SR-IOV virtualization test architecture is shown in Figure 2.
The comparison of data volume transmitted by network is shown in Table 2.
Ordinary virtualization transfers a maximum of 4.6 Gbps, while SR-IOV's direct hardware virtualization can reach 9.4 Gbps.
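For readers reproducing this on a Linux/KVM host, the following is a minimal sketch of how virtual functions can be carved out of a physical NIC through the kernel's standard sysfs interface. The interface name is an assumption, and on ESXi SR-IOV is enabled through the host's NIC driver settings in vSphere rather than through sysfs.

```python
# A minimal sketch of creating SR-IOV virtual functions (VFs) on a Linux host
# via the kernel's sysfs interface. The interface name "ens1f0" is an assumption.
from pathlib import Path

IFACE = "ens1f0"                      # hypothetical 10GbE interface name
dev = Path(f"/sys/class/net/{IFACE}/device")

total_vfs = int((dev / "sriov_totalvfs").read_text())
print(f"{IFACE} supports up to {total_vfs} virtual functions")

# Create 4 VFs, one per VM, as in the test above (requires root).
wanted = 4
if wanted <= total_vfs:
    (dev / "sriov_numvfs").write_text(str(wanted))
```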
SR-IOV devices also bring further advantages: lower power consumption, fewer adapters, simpler cabling and fewer switch ports. SR-IOV has many advantages, but also significant limitations: many of VMware's native features are unavailable to SR-IOV virtual machines, including vMotion, Storage vMotion, vShield, NetFlow, High Availability, FT, DRS, DPM, suspend and resume, snapshots, hot add and removal of virtual devices, and joining a cluster environment.
Therefore, when considering x86 network virtualization, the decision needs to weigh performance, the characteristics of the service and the existing infrastructure. If a service demands higher performance and does not need much flexibility, SR-IOV is worth considering; otherwise, choose ordinary x86 network virtualization deployed together with VMware's features.
2.2 Compute level
At the compute level, the CPU and memory resources of an x86 physical server are provided to virtual machines. Today's high-performance x86 servers are typically multi-CPU, multi-core systems, and the NUMA architecture is becoming increasingly common because it addresses the resource allocation challenges created by combining multiple processors and cores with non-uniform memory access, and it improves performance for memory-intensive workloads. The NUMA architecture is shown in Figure 3.
A traditional server architecture places all memory in a single pool, which works well for single-processor or single-core systems. However, this uniform access model leads to resource contention and performance problems when many cores access the memory space simultaneously. NUMA is a server CPU and memory design that changes the way memory is presented to the CPUs by partitioning the memory associated with each CPU. Each partition (or memory block) is called a NUMA node, and the processor associated with that partition can access its NUMA memory faster, without competing with other NUMA nodes for resources (the other memory partitions are allocated to other processors). NUMA still allows any processor to access any memory region on the server: a processor can read memory located in a different region, but that requires extra transfers outside the local NUMA node and acknowledgement from the target node, which increases overall overhead and affects CPU and memory subsystem performance.
Consider, for example, a server configured with two eight-core processors and 128GB of memory. In the NUMA architecture, each processor controls 64GB of physical memory, and each of its eight cores corresponds to an 8GB NUMA node. How does this affect VM performance? Because each processor core accesses memory within its NUMA node faster than memory in other nodes, a virtual machine theoretically achieves its best performance when its memory size is less than or equal to the size of a NUMA node. So when allocating VMs on this physical server, do not give any single VM more than 8GB of memory. If a virtual machine is given more, it must access some of its memory outside its NUMA node, which degrades performance to a greater or lesser extent. It is even better if the application itself is NUMA-aware. vSphere uses vNUMA to create NUMA-aware virtual machines: the VM is partitioned into virtual NUMA nodes, and each vNUMA node is placed on a different physical NUMA node. Although the VM still spans two NUMA nodes, the OS and applications inside it are NUMA-aware, so resource usage is optimized.
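As a small illustration of the sizing rule above, the sketch below reads the NUMA topology that Linux exposes in sysfs so a planned VM memory size can be compared against the size of a single node. The paths are the standard Linux layout; on an ESXi host the same information is available through esxtop or the vSphere client instead.

```python
# A minimal sketch that inspects a Linux host's NUMA topology from sysfs so a
# VM's memory allocation can be compared against the size of a single node.
from pathlib import Path

for node in sorted(Path("/sys/devices/system/node").glob("node[0-9]*")):
    cpus = (node / "cpulist").read_text().strip()
    memtotal_kb = 0
    for line in (node / "meminfo").read_text().splitlines():
        if "MemTotal" in line:
            memtotal_kb = int(line.split()[-2])   # value is reported in kB
            break
    print(f"{node.name}: cpus {cpus}, memory {memtotal_kb / 1024 / 1024:.1f} GB")

# Rule of thumb from the text: keep each VM's memory at or below one node's
# size so its pages stay local to the cores it runs on.
```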
NUMA also changes how memory is installed and chosen for data center servers. When adding physical memory to a server, take care that the added memory is balanced and matched across NUMA nodes, so that every processor on the motherboard ends up with the same amount. If more memory is added to the server in our example, the memory modules must be balanced between the processors: adding 64GB means each processor receives 32GB (raising the memory per processor to 96GB and the server total to 192GB), and the size of each NUMA node grows from 8GB to 12GB.
Combining this with VMware best practices: VMware supports up to 64 vCPUs per VM, but generally no more than 32 should be used, and over-provisioning is best avoided. For memory there is no general recommendation, since different services have different requirements, but it is best not to make memory calls across NUMA nodes. Note also that the NUMA architecture applies to physical CPUs (sockets), not cores. Since each socket controls its own memory slots, make sure the memory is evenly distributed: for example, if 128GB of memory consists of eight 16GB DIMMs, four should go into the slots of one socket and the other four into the slots of the other socket. When allocating vCPU resources to a virtual machine, try to use socket x core combinations such as 1x1, 1x2, 1x4, 1x8, 2x1, 2x2, 2x4 or 2x8, and avoid combinations such as 2x3, 2x5 or 2x7. The latter combinations cause memory access across sockets and easily lead to performance degradation; a simple check is sketched below.
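Here is a minimal sketch of that vCPU sizing guidance expressed as a simple check. The thresholds follow this article's recommendations rather than any VMware-published formula, and the host topology is the two-socket, eight-core example used throughout.

```python
# A minimal sketch of the vCPU sizing rule described above: prefer
# sockets x cores-per-socket combinations such as 1x2, 2x4 or 2x8, and avoid
# odd core counts like 2x3 or 2x5 that do not pack evenly into a NUMA node.

def is_reasonable_vcpu_layout(sockets: int, cores_per_socket: int,
                              host_sockets: int = 2, host_cores: int = 8) -> bool:
    """Return True if the layout fits the host and uses 1/2/4/8 cores per socket."""
    if sockets > host_sockets or cores_per_socket > host_cores:
        return False                      # cannot exceed the physical topology
    if sockets * cores_per_socket > 32:
        return False                      # article suggests staying at or below 32 vCPUs
    return cores_per_socket in (1, 2, 4, 8)

for layout in [(1, 4), (2, 4), (2, 8), (2, 3), (2, 5)]:
    print(layout, is_reasonable_vcpu_layout(*layout))
```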
2.3 Storage level
At the storage level, VMs on x86 physical servers connect to LUNs presented from back-end storage. There are three ways to create virtual disks on a LUN: thick provision lazy zeroed, thick provision eager zeroed, and thin provision, as shown in Figure 4.
Thick provision lazy zeroed creates a virtual disk in the default thick format: all of the required space is allocated at creation time, but any data remaining on the physical device is not erased then; it is zeroed out on demand the first time the virtual machine writes to each block. In short, the full space is allocated immediately and cleared later as needed.
Thick provision eager zeroed creates a thick disk that supports clustering features such as Fault Tolerance: the required space is allocated at creation time and, unlike the lazy zeroed (flat) format, the data remaining on the physical device is zeroed out during creation, so creating a disk in this format may take longer than the other types. In short, the full space is allocated immediately and all of it is cleared up front.
Thin provision uses the thin-provisioned format: initially the disk uses only as much datastore space as it actually needs, and if it needs more later it can grow up to the maximum capacity allocated to it. In short, only the maximum size the disk file may grow to is specified, and the limit is checked whenever it needs to grow.
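For reference, the three formats can also be created from an ESXi shell with vmkfstools; the sketch below wraps those commands in Python, and the datastore path and disk size are assumptions.

```python
# A minimal sketch of creating the three disk formats with vmkfstools on ESXi:
# "zeroedthick" is lazy zeroed, "eagerzeroedthick" is eager zeroed, and "thin"
# is thin provisioned. The VM directory on the datastore is hypothetical.
import subprocess

DATASTORE = "/vmfs/volumes/datastore1/testvm"   # hypothetical VM directory

for fmt, name in [("zeroedthick", "lazy.vmdk"),
                  ("eagerzeroedthick", "eager.vmdk"),
                  ("thin", "thin.vmdk")]:
    subprocess.run(
        ["vmkfstools", "-c", "40G", "-d", fmt, f"{DATASTORE}/{name}"],
        check=True,
    )
```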
In addition, the thin provision format can have a negative performance impact compared with the thick formats when used by a VM. Because a thin-provisioned disk expands dynamically, its multi-gigabyte vmdk file is not laid down on the datastore in one pass and therefore does not occupy contiguous disk space the way a thick-provisioned disk does. When a thin-provisioned disk is accessed, seek times are inevitably longer because the heads move between non-contiguous disk blocks, which hurts disk I/O performance.
In summary, whether during deployment or in operation, the thin provision format does not perform as well as thick provisioning, so thick-provisioned virtual disks are recommended whenever space is not tight.
3 How to optimize x86 virtualization performance for specific business workloads
Take, for example, a Linux Postfix mail system comprising mail servers, a database and the network. Looked at from the disk side, the defining characteristic of a mail system is not a small number of large-file reads and writes but a large number of small-file reads and writes, with the requests coming from many processes or threads at the same time. For this kind of small-file workload, it is recommended to use the thin provision format when allocating the disks that hold the mail users' data: this avoids a large initial space commitment while still allowing the disks to grow with demand.
From the memory point of view, each Postfix process does not consume much memory; what we really want is for spare memory to be used automatically as disk cache to speed up disk I/O, and we do not have to do anything for that, because Linux does it for us. Linux virtual memory management treats all free memory as disk cache by default, which is why a production Linux system with several gigabytes of memory often shows only about 20MB of free memory. From the processor point of view, neither SMTP nor IMAP consumes much CPU. So when allocating CPU and memory resources we can configure fixed-size units along NUMA boundaries: for example, a server with two eight-core processors and 128GB of memory can be virtualized into four mail servers, each allocated 4 cores and 32GB.
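To make the page-cache point concrete, here is a minimal sketch that reads /proc/meminfo and reports how much of the "missing" memory is actually sitting in buffers and cache; the field names are the standard Linux ones.

```python
# A minimal sketch that reads /proc/meminfo to show that a Linux mail server
# with very little MemFree is usually not short of memory at all: most of the
# "used" memory is page cache that the kernel will reclaim on demand.
fields = {}
with open("/proc/meminfo") as f:
    for line in f:
        key, value = line.split(":")
        fields[key] = int(value.split()[0])   # values are reported in kB

free_mb = fields["MemFree"] / 1024
cache_mb = (fields.get("Cached", 0) + fields.get("Buffers", 0)) / 1024
print(f"MemFree: {free_mb:.0f} MB, page cache + buffers: {cache_mb:.0f} MB")
```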
From the network point of view, the mail system uses the network subsystem heavily, but the bottleneck of a mail system is disk throughput rather than network throughput. This kind of application does not require highly interactive, low-latency traffic, so whether the NIC is an ordinary virtual NIC or an SR-IOV one makes little difference.
For the mail system's database server, because small files are read and written randomly, the database disks can use the thick provision format to improve I/O.
Different business systems must each be analyzed on their own terms. Performance optimization is not achieved overnight; as the business develops and changes, the optimization techniques and methods will change with it.
4 x86 server virtualization from an enterprise day-to-day use and management perspective
Different enterprise applications make very different use of CPU, memory and storage space. How to use the NUMA architecture to optimize resource allocation and improve performance is therefore also highly relevant to enterprise data center management; see Table 3.
Database servers place high demands on CPU and memory, so they are not well suited to sharing resources with other machines; well-configured physical machines should be used for them whenever possible. VDI desktops and file servers are well suited to fixed CPU and memory units allocated under the NUMA architecture. Mail systems need their resources allocated under NUMA according to the specific circumstances. Websites that change with demand may not all suit NUMA; for example, the cache servers of a website are better served by memory allocation outside NUMA constraints. When allocating disk space, thick provisioning suits business systems with relatively high I/O performance requirements, while thin provisioning suits systems with lower I/O requirements and limited growth.
x86 server virtualization is a technology for consolidating server resources and improving efficiency. It delivers higher utilization of server hardware and system resources; a highly reliable application environment with transparent load balancing, live migration, automatic fault isolation and automatic system reconfiguration; and a simpler, unified way of allocating and managing server resources. Performance optimization after resource partitioning also greatly improves the overall resource utilization of the data center, in line with today's emphasis on green, energy-efficient operation.
The above describes how to partition resources and optimize performance for x86 server virtualization. Some of these points will likely come up in day-to-day work, and we hope you have learned something useful from this article.