MongoDB NUMA problem

2025-02-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

When migrating to MongoDB, you usually need to import old data. Everything was fine at first, but a few hours later I noticed that the import had slowed down and my script had started throwing exceptions. I repeated the migration several times, naturally with the same result, but I noticed that whenever something went wrong, a process named irqbalance was using a lot of CPU. I searched for it and found that many articles about irqbalance mention NUMA, which reminded me of a warning I had seen in the MongoDB log. Perhaps the problem really was related to NUMA.

Warning message in the mongodb log:

Fri May 10 16:17:48 [initandlisten] MongoDB starting: pid=18486 port=27017 dbpath=/backup/mongodbData 64-bit host=sqcache02

Fri May 10 16:17:48 [initandlisten]

Fri May 10 16:17:48 [initandlisten] ** WARNING: You are running on a NUMA machine.

Fri May 10 16:17:48 [initandlisten] ** We suggest launching mongod like this to avoid performance problems:

Fri May 10 16:17:48 [initandlisten] ** numactl --interleave=all mongod [other options]

So what exactly is NUMA? Let's find out.

Reposted from the web:

I. NUMA and SMP

NUMA and SMP are two CPU-related hardware architectures. In the SMP architecture, all CPUs compete for a single bus to access all memory. The advantage is resource sharing; the disadvantage is fierce bus contention. As PC servers gain more physical CPUs (not just more CPU cores), bus contention becomes an increasingly serious drawback, so Intel introduced the NUMA architecture with its Nehalem CPUs, and AMD introduced Opteron CPUs based on the same architecture.

The most important feature of NUMA is that it introduces the concepts of node and distance. CPU and memory are the two most valuable hardware resources, and NUMA partitions them almost strictly into resource groups (nodes), each holding a roughly equal share of CPU and memory. The number of nodes depends on the number of physical CPUs (most existing PC servers have two physical CPUs with 4 cores each). Distance defines the cost of using resources across nodes and provides the data that resource scheduling algorithms need for optimization.
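
To make the node and distance idea concrete, here is a minimal Python sketch. The two-node distance table is made up for illustration (on Linux, `numactl --hardware` prints a table of this shape, where 10 conventionally means local access), and the function names are hypothetical.

```python
# Illustrative distance table for a hypothetical two-node machine.
# Row i, column j is the relative cost for a CPU on node i to reach memory
# on node j; 10 means local access. The numbers are made up for the example.
DISTANCE = [
    [10, 21],
    [21, 10],
]


def nodes_by_cost(distance, src):
    """Return node ids ordered from cheapest to most expensive for `src`."""
    return sorted(range(len(distance)), key=lambda node: distance[src][node])


def remote_penalty(distance, src, dst):
    """Relative cost of reaching `dst` from `src`, with local access as 1.0."""
    return distance[src][dst] / distance[src][src]


if __name__ == "__main__":
    print(nodes_by_cost(DISTANCE, 1))      # node 1 is cheapest for itself
    print(remote_penalty(DISTANCE, 0, 1))  # cost of remote vs. local access
```

A scheduler that wants to keep memory close to a process would walk the list returned by `nodes_by_cost` in order, which is roughly the role the distance data plays in the description above.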

II. NUMA policies

1. Each process (or thread) inherits its NUMA policy from its parent process and is assigned a preferred node. If the NUMA policy allows it, the process can use resources on other nodes.

2. NUMA's CPU allocation policies are cpunodebind and physcpubind. cpunodebind restricts a process to the CPUs of the specified nodes, while physcpubind specifies more precisely which cores it runs on.

3. NUMA's memory allocation policies are localalloc, preferred, membind and interleave. localalloc makes the process request memory from the node it is currently running on. preferred loosely names a recommended node; if that node does not have enough memory, the process can try other nodes. membind names one or more nodes, and the process can request memory only from those nodes. interleave makes the process spread its memory requests across the specified nodes in round-robin (RR) order.
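
The four memory policies can be compared with a toy model. The Python sketch below is only an illustration of the descriptions above, not of the kernel's implementation: nodes are plain counters of free pages, all function names are hypothetical, and what the kernel really does when the local node is full is not modelled.

```python
def allocate(free, candidates):
    """Take one page from the first candidate node that still has free pages.

    `free` is a list of free-page counts, one entry per node.
    Returns the node used, or None if every candidate is exhausted.
    """
    for node in candidates:
        if free[node] > 0:
            free[node] -= 1
            return node
    return None


def localalloc(free, current_node):
    # Request memory from the node the process is currently running on.
    return allocate(free, [current_node])


def preferred(free, node):
    # Try the recommended node first, then fall back to any other node.
    others = [n for n in range(len(free)) if n != node]
    return allocate(free, [node] + others)


def membind(free, nodes):
    # Only the listed nodes may be used; there is no fallback elsewhere.
    return allocate(free, nodes)


def interleave(free, nodes, count):
    # Spread `count` page requests over `nodes` in round-robin order.
    return [allocate(free, [nodes[i % len(nodes)]]) for i in range(count)]


if __name__ == "__main__":
    free = [1, 4]                       # node 0 is nearly full
    print(localalloc(free, 0))          # uses the last page on node 0
    print(localalloc(free, 0))          # node 0 exhausted in this toy model
    print(preferred(free, 0))           # falls back to node 1
    print(interleave(free, [0, 1], 2))  # node 0 has nothing left to give
    print(interleave([4, 4], [0, 1], 4))
```

The last call shows why interleaving helps a process that needs more memory than one node holds: requests alternate between nodes, so no single node is drained first.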

III. The relationship between NUMA and swap

NUMA's memory allocation policy is not fair across processes (or threads). On current Red Hat Linux releases, localalloc is the default NUMA memory allocation policy, which makes it easy for a single memory-hungry process to use up all the memory of one node. If a node has run out of memory and Linux happens to place a process (or thread) that needs a lot of memory on that node, swapping follows, even though plenty of page cache could still be released and there may even be plenty of free memory.
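
One way to look for this situation is to compare free memory per node. The Python sketch below parses text in the layout of the Linux per-node files `/sys/devices/system/node/node*/meminfo`. The sample values, the 5% threshold and the function names are assumptions chosen for illustration.

```python
import re

# Made-up sample in the layout of /sys/devices/system/node/nodeN/meminfo.
SAMPLE = """\
Node 0 MemTotal:       16777216 kB
Node 0 MemFree:          102400 kB
Node 1 MemTotal:       16777216 kB
Node 1 MemFree:         8388608 kB
"""

LINE = re.compile(r"Node\s+(\d+)\s+(MemTotal|MemFree):\s+(\d+)\s+kB")


def free_ratio_per_node(text):
    """Map node id -> MemFree / MemTotal from per-node meminfo text."""
    stats = {}
    for node, key, value in LINE.findall(text):
        stats.setdefault(int(node), {})[key] = int(value)
    return {n: s["MemFree"] / s["MemTotal"] for n, s in stats.items()}


def starved_nodes(text, threshold=0.05):
    """Nodes below `threshold` free while at least one other node is not."""
    ratios = free_ratio_per_node(text)
    low = sorted(n for n, r in ratios.items() if r < threshold)
    has_room_elsewhere = any(r >= threshold for r in ratios.values())
    return low if has_room_elsewhere else []


if __name__ == "__main__":
    print(free_ratio_per_node(SAMPLE))
    print(starved_nodes(SAMPLE))
```

In the sample, node 0 is almost full while node 1 is half empty, which is the imbalance described above: the machine as a whole has memory to spare, yet a process tied to node 0 cannot get any locally.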

IV. What NUMA means

Put simply, on a machine with multiple physical CPUs, NUMA divides memory into local and remote. Each physical CPU has its own local memory, and accessing local memory is faster than accessing remote memory. By default, a process allocates memory from the local node of the CPU it runs on. For a service like MongoDB that needs a lot of memory, this can leave it short of memory on that node, which in turn hurts performance.

Finally, let's take a look at how to solve this problem:

# echo 0 > /proc/sys/vm/zone_reclaim_mode

Then add the numactl parameters when starting mongod:

# numactl --interleave=all /data/mongodb/bin/mongod -f /app/mongodb.conf --dbpath=/data/db --fork --logpath=/data/logs/mongodb.log
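
For a start-up wrapper, the two steps can be checked from a script. The Python sketch below only reads the kernel setting and builds the command line; it does not write to /proc or launch anything. The helper names are hypothetical, and the mongod paths are the ones used in this article.

```python
from pathlib import Path

ZONE_RECLAIM = Path("/proc/sys/vm/zone_reclaim_mode")


def zone_reclaim_enabled(path=ZONE_RECLAIM):
    """Return True/False for the current setting, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip()) != 0
    except (OSError, ValueError):
        return None


def interleaved_command(mongod_argv):
    """Prefix a mongod argument list with numactl --interleave=all."""
    return ["numactl", "--interleave=all", *mongod_argv]


if __name__ == "__main__":
    print("zone_reclaim_mode enabled:", zone_reclaim_enabled())
    print(" ".join(interleaved_command(
        ["/data/mongodb/bin/mongod", "-f", "/app/mongodb.conf"])))
```

The returned list can be passed to `subprocess.run` by a wrapper that actually starts the server; keeping it as a list avoids shell quoting problems.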

With this in place, the warning message no longer appears when MongoDB starts.

(# numactl --interleave=all mongod -f /app/mongodb.conf

Forked process: 18523

All output going to: /app/mongodb/log/mongodb.log

Child process started successfully, parent exiting)

***** SERVER RESTARTED *****

Fri May 10 16:33:15 [initandlisten] MongoDB starting: pid=18523 port=27017 dbpath=/backup/mongodbData 64-bit host=sqcache02

Fri May 10 16:33:15 [initandlisten] db version v2.2.6, pdfile version 4.5

Fri May 10 16:33:15 [initandlisten] git version: d626379119a6de9f2fb390780cf2fc336dfd540d

Fri May 10 16:33:15 [initandlisten] build info: Linux ip-10-2-29-40 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri

For details on the zone_reclaim_mode kernel parameter, refer to the official documentation.

Note: starting with MongoDB 1.9.2, MongoDB sets zone_reclaim_mode automatically at startup, so no manual change is needed.

I have not verified how much performance suffers when mongod is started without these settings on NUMA hardware; other people's experiences can be found online. The official documentation covers this (see MongoDB Documentation, Release 2.4.1, section 12.8.1, "MongoDB on NUMA Hardware").

To explain briefly: under the NUMA architecture, each core accesses the memory attached to its own node faster than the memory attached to other nodes. There are several memory allocation policies:

Default (default): always allocate on the local node (the node on which the current process is running)

Bind (bind): force allocation on the specified nodes

Interleave (interleave): interleave allocations across all nodes or across the specified nodes

Preferred (preferred): allocate on the specified node, and fall back to other nodes if that fails.

At present, however, MongoDB does not work very well under this architecture. Running it with numactl --interleave=all effectively turns off NUMA's per-node allocation by spreading memory across all nodes.
