
How to Analyze the Architecture and Performance of NUMA

2025-02-24 · SLTechnology News&Howtos · Servers


Shulou (Shulou.com), 05/31 report

How should we analyze the architecture and performance of NUMA? This article walks through the question in detail, in the hope of helping readers find a simple, workable approach.

If you work with servers, you have probably heard of the NUMA architecture. NUMA has long been popular in medium and large systems, and it is a high-performance design, especially where system latency matters. But what exactly is the impact of NUMA on server performance, and how can it be configured well? This article analyzes both questions.

1. What is NUMA

NUMA (Non-Uniform Memory Access) is a computer memory design for multiprocessors in which memory access time depends on where the memory sits relative to the processor. Under NUMA, a processor accesses its own local memory faster than non-local memory (memory local to another processor, or memory shared between processors).

The NUMA architecture is a logical successor to the symmetric multiprocessing (SMP) architecture. It was developed commercially in the 1990s by companies including Burroughs (later Unisys), Convex Computer (later Hewlett-Packard), Honeywell Information Systems Italy (later Groupe Bull), Silicon Graphics (later SGI), Sequent Computer Systems (later IBM), Data General (later EMC), and Digital Equipment Corporation (later Compaq, then HP). The techniques these companies developed went on to shine in Unix-like operating systems and were applied to Windows NT to some extent.

The main advantage of NUMA is scalability. The NUMA architecture was designed to surpass the scalability limits of the SMP architecture. With SMP, all memory access goes over the same shared memory bus. This works well when the number of CPUs is relatively small, but not with dozens or even hundreds of CPUs, because they compete with each other for access to the shared bus. NUMA alleviates this bottleneck by limiting the number of CPUs on any one memory bus and relying on high-speed interconnects to link the nodes.
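The contention argument above can be sketched numerically. The model below is purely illustrative (the bus bandwidth and CPU counts are assumed numbers, not real hardware specs): under SMP every CPU shares one bus, so per-CPU bandwidth shrinks as CPUs are added, while under NUMA each node's bus serves only the CPUs on that node.

```python
def per_cpu_bandwidth_smp(bus_gbps: float, n_cpus: int) -> float:
    """All CPUs contend for a single shared memory bus."""
    return bus_gbps / n_cpus

def per_cpu_bandwidth_numa(bus_gbps: float, n_cpus: int, n_nodes: int) -> float:
    """Each node has its own bus; CPUs contend only within their node."""
    cpus_per_node = n_cpus // n_nodes
    return bus_gbps / cpus_per_node

if __name__ == "__main__":
    # 64 CPUs on one 100 GB/s bus vs. 8 nodes of 8 CPUs each
    print(per_cpu_bandwidth_smp(100.0, 64))      # 1.5625 GB/s per CPU
    print(per_cpu_bandwidth_numa(100.0, 64, 8))  # 12.5 GB/s per CPU
```

The model ignores interconnect traffic between nodes, which is exactly the cost that careful NUMA configuration tries to minimize.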

2. Several architecture schemes of NUMA

NUMA systems are generally more economical and higher-performing than uniform memory access (UMA) systems. A UMA system must provide equal-cost memory access to every CPU, while a NUMA system can give each CPU a fast path to its directly attached memory and use cheaper, higher-latency links to memory farther away.

When using NUMA you will often encounter the following scenarios; SQL Server is used here as the example (see the TechNet material).

a. No port-to-NUMA affinity

This is the default setting on computers with hardware NUMA running a single instance of SQL Server. All traffic enters through a single port and is distributed round-robin across the available NUMA nodes. NUMA improves the locality of memory and CPU access and increases the number of I/O and lazy-writer threads. Once a connection is established, its scope is limited to the node it landed on. This provides automatic load balancing across NUMA nodes. Client applications can connect to a single port and are easy to deploy.
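The round-robin distribution described above can be sketched in a few lines. This is an illustrative toy, not SQL Server internals; the connection names and node IDs are made up.

```python
from itertools import cycle

def assign_round_robin(connections: list[str], nodes: list[int]) -> dict[str, int]:
    """Distribute incoming connections across NUMA nodes in a circular manner."""
    node_cycle = cycle(nodes)
    return {conn: next(node_cycle) for conn in connections}

if __name__ == "__main__":
    placement = assign_round_robin(["c1", "c2", "c3", "c4", "c5"], [0, 1])
    print(placement)  # {'c1': 0, 'c2': 1, 'c3': 0, 'c4': 1, 'c5': 0}
```

Once a connection is assigned, it stays scoped to that node, which is why the load ends up balanced on average without any per-connection tuning.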

b. Affinitize a single port to multiple nodes to favor a primary application

Affinitize one port to several hardware NUMA nodes for the primary application, and a second port to another hardware NUMA node for a secondary application. The memory and CPU resources available to the two applications are deliberately uneven: the primary application gets three times the local memory and CPU of the secondary one. The secondary application can be a second instance of the Database Engine, a less important workload in the same instance, or even the same database. By giving preferred connections extra resources, this approach provides a form of thread prioritization.

c. Affinitize multiple ports to multiple nodes

Multiple ports can also be mapped to the same NUMA node, which lets you configure different permissions for different ports. For example, you can tightly restrict the access a port provides by controlling permissions on its TCP endpoint. In this example, port 1450 is generally available on the intranet, while port 1433 is exposed to the Internet through a firewall and its access is strictly restricted. Both ports can make full, equal, and secure use of NUMA.
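For reference, SQL Server expresses this port-to-node affinity in SQL Server Configuration Manager by appending a NUMA node bitmask in square brackets to the TCP port value. A hedged sketch, reusing the two ports from the example (the masks below are assumptions chosen for illustration: 0x1 selects node 0, 0x2 selects node 1):

```
1450[0x1],1433[0x2]
```

Here intranet connections on 1450 would be served by node 0 and firewalled Internet connections on 1433 by node 1; check the SQL Server documentation for the exact mask semantics on your version before relying on this format.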

3. How to configure NUMA, and what principles to follow

So how do you configure NUMA in a virtualization scenario, and what principles apply?

Take, for example, a server with two eight-core processors and 128 GB of memory, whose CPU and memory resources must be allocated and divided among virtual machines.

First, under NUMA each processor in this server controls 64 GB of physical memory, and each of its eight cores corresponds to an 8 GB NUMA node. How does this affect virtual machine performance? Because each core accesses memory within its own NUMA node faster than memory in other nodes, a virtual machine theoretically gets the best performance when its memory size is less than or equal to the size of a NUMA node. If a virtual machine is allocated more memory than that, it is bound to access some memory outside its NUMA node, which will hurt performance to some degree. It is all the better if the application is NUMA-aware: vSphere uses vNUMA to create NUMA-aware virtual machines, splitting the VM into virtual NUMA nodes and placing each vNUMA node on a different physical NUMA node. Although the virtual machine still spans two physical NUMA nodes, the operating system and applications inside it are NUMA-aware, so resource usage is optimized.
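The sizing rule above can be written as a small check. These are hypothetical helper functions, not a vSphere API; they assume memory is split evenly across nodes, as in the example server.

```python
def node_size_gb(total_gb: float, sockets: int, nodes_per_socket: int) -> float:
    """Memory per NUMA node, assuming an even split across all nodes."""
    return total_gb / (sockets * nodes_per_socket)

def fits_in_one_node(vm_gb: float, total_gb: float,
                     sockets: int, nodes_per_socket: int) -> bool:
    """True if the VM's memory can be served entirely from one node."""
    return vm_gb <= node_size_gb(total_gb, sockets, nodes_per_socket)

if __name__ == "__main__":
    # The example server: 2 sockets x 8 nodes, 128 GB total -> 8 GB per node
    print(node_size_gb(128, 2, 8))          # 8.0
    print(fits_in_one_node(8, 128, 2, 8))   # True: all accesses stay local
    print(fits_in_one_node(12, 128, 2, 8))  # False: some accesses go remote
```

A VM that fails the check is not broken, but some fraction of its memory traffic will cross to a remote node, which is the latency cost the text describes.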

NUMA has changed how memory is installed and selected on data center servers. When adding physical memory to a server, balance the added modules across the NUMA nodes so that every processor on the motherboard has the same amount of memory. If you add memory to the server in our example, you must balance the modules between the processors: adding 64 GB means giving each processor 32 GB more (raising the memory available to each processor to 96 GB and the server's total to 192 GB), which grows each NUMA node from 8 GB to 12 GB. Because each socket controls its own memory slots, keep the slot population even: if the 192 GB is supplied as twelve 16 GB modules, install six in each socket's slots. When allocating vCPU resources to virtual machines, use socket-by-core combinations such as 1X1, 1X2, 1X4, 1X8, 2X1, 2X2, 2X4, or 2X8, and avoid combinations such as 2X3, 2X5, and 2X7; the latter force memory calls across sockets, which easily degrades performance.
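The two rules in that paragraph reduce to simple arithmetic, sketched below with illustrative helpers (not vendor tools): keep memory balanced across nodes, and prefer power-of-two cores per socket over odd counts like 2X3 or 2X5.

```python
def mem_per_node_gb(total_gb: float, sockets: int, nodes_per_socket: int) -> float:
    """Memory per NUMA node when modules are balanced across sockets."""
    return total_gb / (sockets * nodes_per_socket)

def is_power_of_two(n: int) -> bool:
    return n > 0 and (n & (n - 1)) == 0

def vcpu_topology_ok(sockets: int, cores_per_socket: int) -> bool:
    """Rule of thumb from the text: cores per socket should be 1, 2, 4, 8, ..."""
    return is_power_of_two(cores_per_socket)

if __name__ == "__main__":
    print(mem_per_node_gb(128, 2, 8))  # 8.0 GB per node before the upgrade
    print(mem_per_node_gb(192, 2, 8))  # 12.0 GB per node after adding 64 GB
    print(vcpu_topology_ok(2, 4))      # True  (2X4)
    print(vcpu_topology_ok(2, 3))      # False (2X3 forces cross-socket calls)
```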

In practice, different workloads have different memory requirements, but it is best to avoid calls across NUMA nodes and to have each CPU access its directly attached memory as much as possible. Following these simple principles will yield better performance.

That is how to analyze the architecture and performance of NUMA. I hope the content above has been of some help; if you still have questions, follow the industry information channel for more related knowledge.
