2025-01-18 Update From: SLTechnology News&Howtos
This article explains how to diagnose and fix high soft-interrupt (softirq) load on an Nginx server, walking through the investigation step by step.
A few days ago we found a problem with an Nginx server running on a XEN virtual machine: soft-interrupt load was too high, and most of it was concentrated on a single CPU. Once the system got busy, that CPU became the bottleneck.
Running "top" on the problem server showed that the "si" (soft interrupt) column was abnormal: most soft interrupts were concentrated on CPU 1, while the other CPUs barely participated:
Shell> top
Cpu0: 11.3%us,  4.7%sy, 0.0%ni, 82.5%id, ...,  0.8%si, 0.8%st
Cpu1: 21.3%us,  7.4%sy, 0.0%ni, 51.5%id, ..., 17.8%si, 2.0%st
Cpu2: 16.6%us,  4.5%sy, 0.0%ni, 77.7%id, ...,  0.8%si, 0.4%st
Cpu3: 15.9%us,  3.6%sy, 0.0%ni, 79.3%id, ...,  0.8%si, 0.4%st
Cpu4: 17.7%us,  4.9%sy, 0.0%ni, 75.3%id, ...,  1.2%si, 0.8%st
Cpu5: 23.6%us,  6.6%sy, 0.0%ni, 68.1%id, ...,  0.9%si, 0.9%st
Cpu6: 18.1%us,  4.9%sy, 0.0%ni, 75.7%id, ...,  0.4%si, 0.8%st
Cpu7: 21.1%us,  5.8%sy, 0.0%ni, 71.4%id, ...,  1.2%si, 0.4%st
Reading up on soft interrupts, we found the load was concentrated in NET_RX, which suggested a network-card problem:
Shell> watch -d -n 1 'cat /proc/softirqs'
                    CPU0        CPU1        CPU2  ...        CPU7
          HI:          0           0           0  ...           0
       TIMER: 3692566284  3692960089  3692546970  ...  3693032995
      NET_TX:  130800410   652649368   154773818  ...   308945843
      NET_RX:  443627492  3802219918   792341500  ...  2546517156
       BLOCK:          0           0           0  ...           0
BLOCK_IOPOLL:          0           0           0  ...           0
     TASKLET:          0           0           0  ...           0
       SCHED: 1518716295   335629521  1520873304  ...  1444792018
     HRTIMER:        160        1351         131  ...         196
         RCU: 4201292019  3982761151  4184401659  ...  4039269755
Note: there is a top-like tool called itop that shows interrupt activity in real time; it is worth trying.
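As an aside, the per-CPU NET_RX distribution can be pulled out of /proc/softirqs mechanically. Here is a minimal awk sketch; the sample data is inlined so it runs anywhere, but on a real host you would replace `echo "$softirqs"` with `cat /proc/softirqs`:

```shell
# Find which CPU has accumulated the most NET_RX softirqs.
# Sample /proc/softirqs-style data (abridged) inlined for illustration.
softirqs='          CPU0        CPU1        CPU2
NET_TX:  130800410   652649368   154773818
NET_RX:  443627492  3802219918   792341500'

busiest=$(echo "$softirqs" | awk '/^NET_RX:/ {
    max = -1
    for (i = 2; i <= NF; i++)            # field 1 is the "NET_RX:" label
        if ($i + 0 > max) { max = $i + 0; cpu = i - 2; raw = $i }
    print "busiest CPU: " cpu " (" raw " NET_RX softirqs)"
}')
echo "$busiest"
```

With the sample data above it reports CPU 1 as the hot spot, matching what "top" showed.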
Checking the NIC information on the host confirmed that it was running in single-queue mode:
Shell> grep -A 10 -i network /var/log/dmesg
Initalizing network drop monitor service
Intel(R) Gigabit Ethernet Network Driver - version 3.0.19
igb 0000:05:00.0: Intel(R) Gigabit Ethernet Network Connection
igb 0000:05:00.0: eth0: (PCIe:2.5GT/s:Width x4) 00:1b:21:bf:b3:2c
igb 0000:05:00.0: eth0: PBA No: G18758-002
igb 0000:05:00.0: Using MSI-X ... 1 rx queue(s), 1 tx queue(s)
igb 0000:05:00.1: Intel(R) Gigabit Ethernet Network Connection
igb 0000:05:00.1: eth2: (PCIe:2.5GT/s:Width x4) 00:1b:21:bf:b3:2d
igb 0000:05:00.1: eth2: PBA No: G18758-002
igb 0000:05:00.1: Using MSI-X ... 1 rx queue(s), 1 tx queue(s)
Next, confirm the NIC's interrupt number. Since it is a single queue, there is only one interrupt, number 45:
Shell> grep eth /proc/interrupts | awk '{print $1, $NF}'
45: eth0
Knowing the interrupt number of the network card, you can query its interrupt affinity configuration "smp_affinity":
Shell> cat /proc/irq/45/smp_affinity
02
The 02 here is hexadecimal and represents CPU 1. The mask is calculated as follows:

          Binary  Hex
  CPU 0     0001    1
  CPU 1     0010    2
  CPU 2     0100    4
+ CPU 3     1000    8
---------------------
  all       1111    f
Note: if all 4 CPUs should participate in interrupt handling, set the mask to f; by the same logic, for 8 CPUs set it to ff:
Shell> echo ff > /proc/irq/45/smp_affinity
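The mask arithmetic above is plain bit shifting: CPU n contributes bit 1 << n, and the bits are OR-ed together. A small shell sketch of the calculation:

```shell
# Build an smp_affinity hex mask from a set of CPU numbers:
# each CPU n contributes bit (1 << n).
mask=0
for cpu in 0 1 2 3 4 5 6 7; do
    mask=$(( mask | (1 << cpu) ))
done
printf '%x\n' "$mask"            # CPUs 0-7 together give ff

# A single CPU, e.g. CPU 1, yields the 02 value seen above:
printf '%02x\n' $(( 1 << 1 ))
```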
There is also a similar configuration "smp_affinity_list":
Shell> cat /proc/irq/45/smp_affinity_list
1
The two settings are linked: modify one and the other changes accordingly. However, "smp_affinity_list" uses decimal CPU numbers, which is more readable than the hexadecimal "smp_affinity".
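To cross-check the two files without touching /proc, here is a hypothetical helper (cpulist_to_mask is my name, not a system utility) that converts "smp_affinity_list" syntax, e.g. "0-2,5", into the hex mask that "smp_affinity" would show:

```shell
# Convert a cpulist string (smp_affinity_list syntax: comma-separated
# CPU numbers and ranges) into the smp_affinity hex bitmask.
cpulist_to_mask() {
    local mask=0 parts part lo hi cpu
    IFS=',' read -ra parts <<< "$1"
    for part in "${parts[@]}"; do
        lo=${part%-*}                      # "3" -> 3, "0-7" -> 0
        hi=${part#*-}                      # "3" -> 3, "0-7" -> 7
        for (( cpu = lo; cpu <= hi; cpu++ )); do
            mask=$(( mask | (1 << cpu) ))
        done
    done
    printf '%x\n' "$mask"
}

cpulist_to_mask 1        # CPU 1 only
cpulist_to_mask 0-7      # all eight CPUs
```

Running it on the values from this article, "1" gives 2 and "0-7" gives ff, matching the smp_affinity readings.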
With these basics in mind, we can try a different CPU and see what happens:
Shell> echo 0 > /proc/irq/45/smp_affinity_list
Observing with "top" again, you will find that soft interrupts are now handled by CPU 0.
Note: if you want multiple CPUs to participate in interrupt handling, you can use syntax like the following:
Shell> echo 3,5 > /proc/irq/45/smp_affinity_list
Shell> echo 0-7 > /proc/irq/45/smp_affinity_list
The bad news: for a single-queue NIC, "smp_affinity" and "smp_affinity_list" settings that name multiple CPUs have no effect.
The good news: Linux supports RPS (Receive Packet Steering), which, put simply, simulates a hardware multi-queue NIC in software.
First, let's configure RPS. If the number of CPUs is 8, set the mask to ff:
Shell> echo ff > /sys/class/net/eth0/queues/rx-0/rps_cpus
Then configure the kernel parameter rps_sock_flow_entries (the kernel documentation recommends 32768):
Shell> sysctl net.core.rps_sock_flow_entries=32768
Finally, configure rps_flow_cnt. For a single-queue NIC, set it to the same value as rps_sock_flow_entries:
Shell> echo 32768 > /sys/class/net/eth0/queues/rx-0/rps_flow_cnt
Note: for a multi-queue NIC, set each queue's rps_flow_cnt to rps_sock_flow_entries / N, where N is the number of queues.
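The relationship between the two RPS knobs is simple division, which can be sanity-checked with shell arithmetic. This sketch only prints the per-queue value for a few hypothetical queue counts; it does not touch /sys:

```shell
# Per-queue rps_flow_cnt = rps_sock_flow_entries / number of rx queues.
entries=32768
for nq in 1 2 4 8; do
    printf 'rx queues=%d -> rps_flow_cnt=%d\n' "$nq" $(( entries / nq ))
done
```

For the single-queue NIC in this article (nq=1), the per-queue value equals rps_sock_flow_entries itself, which is why 32768 was written above.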
After these optimizations, "top" shows the soft interrupts spread across two CPUs:
Shell> top
Cpu0: 24.8%us,  9.7%sy, 0.0%ni, 52.2%id, ..., 11.5%si, 1.8%st
Cpu1:  8.8%us,  5.1%sy, 0.0%ni, 76.5%id, ...,  7.4%si, 2.2%st
Cpu2: 17.6%us,  5.1%sy, 0.0%ni, 75.7%id, ...,  0.7%si, 0.7%st
Cpu3: 11.9%us,  7.0%sy, 0.0%ni, 80.4%id, ...,  0.7%si, 0.0%st
Cpu4: 15.4%us,  6.6%sy, 0.0%ni, 75.7%id, ...,  1.5%si, 0.7%st
Cpu5: 20.6%us,  6.9%sy, 0.0%ni, 70.2%id, ...,  1.5%si, 0.8%st
Cpu6: 12.9%us,  5.7%sy, 0.0%ni, 80.0%id, ...,  0.7%si, 0.7%st
Cpu7: 15.9%us,  5.1%sy, 0.0%ni, 77.5%id, ...,  0.7%si, 0.7%st
One open question: in theory, with the RPS mask set to ff, all 8 CPUs should share the soft interrupts, yet in practice only two do. If you know why, please let me know; in any case, two is better than one.
In addition, since this is an Nginx server, you can use the "worker_cpu_affinity" directive to control which CPUs the Nginx workers run on, steering them away from the heavily loaded CPUs; this can also help performance.
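As an illustration only (the exact worker count and masks depend on your host), a hypothetical nginx.conf fragment that pins six workers to CPUs 2-7, keeping them off CPU 0 and CPU 1 where the NET_RX softirqs landed after enabling RPS:

```nginx
# One binary mask per worker, lowest bit = CPU 0.
worker_processes     6;
worker_cpu_affinity  00000100 00001000 00010000 00100000 01000000 10000000;
```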
Note: if the server uses a NUMA architecture, "numactl --cpubind" may also be useful.
© 2024 shulou.com SLNews company. All rights reserved.