Example Analysis of tracking High IO waiting in CentOS system

2025-01-17 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article walks through an example of tracking down high IO wait on a CentOS system. The content is straightforward and well organized; follow along to learn how to identify which devices and processes are responsible.

The first symptom of a high IO wait problem is usually a high load average. The load average is calculated from CPU demand, i.e. the number of processes running on or waiting for the CPU; on Linux it also counts processes in uninterruptible sleep, which are typically blocked on IO. The baseline is one unit of load per CPU core at full utilization. So on a 4-core machine, a load average of 4 means the machine has just enough resources to handle its work, but only barely. On the same 4-core system, a load average of 8 means the workload would need 8 cores, but only 4 are available, so the system is overloaded.
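The per-core arithmetic above can be sketched as a quick check. The load readings below are hypothetical, matching the 4-core example in the text:

```python
def overloaded(load_avg: float, cores: int) -> bool:
    """A machine is saturated once the load average exceeds the
    number of CPU cores (one unit of load per core at baseline)."""
    return load_avg > cores

cores = 4
print(overloaded(4.0, cores))  # fully utilized but coping -> False
print(overloaded(8.0, cores))  # needs twice the cores it has -> True
```

On a live system the current values can be read from /proc/loadavg (or the output of uptime) and compared against os.cpu_count() the same way.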

If the system shows a high load average but CPU system and user utilization are low, then IO wait needs to be examined. On Linux systems, IO wait has a large impact on the load average, because one or more cores may be blocked waiting on disk or network IO.
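System-wide IO wait can be measured directly from the aggregate "cpu" line in /proc/stat, whose fifth counter is iowait jiffies. A minimal sketch, using two hypothetical samples taken a second apart (the numbers are made up for illustration):

```python
def iowait_pct(before: str, after: str) -> float:
    """Percent of elapsed time spent in iowait between two samples of
    the 'cpu' line from /proc/stat.
    Field order: user nice system idle iowait irq softirq steal ..."""
    b = [int(x) for x in before.split()[1:]]
    a = [int(x) for x in after.split()[1:]]
    delta = [x - y for x, y in zip(a, b)]
    total = sum(delta)
    return 100.0 * delta[4] / total if total else 0.0

# Hypothetical /proc/stat samples, one second apart:
s1 = "cpu 1000 0 500 8000 200 0 0 0 0 0"
s2 = "cpu 1050 0 520 8500 600 0 0 0 0 0"
print(round(iowait_pct(s1, s2), 1))  # -> 41.2
```

A value this high with low user/system time is exactly the signature described above: the CPUs are idle only because they are waiting on IO.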

It's one thing to discover that a process is waiting for IO to complete; it's quite another to verify the cause of the high IO wait. Use "iostat -x 1" to display IO statistics for the physical storage devices in use:

[username@server~]$ iostat -x 1
Device:      rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
cciss/c0d0     0.08   5.94 1.28 2.75  17.34  69.52    21.60     0.11 26.82  4.12  1.66
cciss/c0d0p1   0.00   0.00 0.00 0.00   0.00   0.00     5.30     0.00  8.76  5.98  0.00
cciss/c0d0p2   0.00   0.00 0.00 0.00   0.00   0.00    58.45     0.00  7.79  3.21  0.00
cciss/c0d0p3   0.08   5.94 1.28 2.75  17.34  69.52    21.60     0.11 26.82  4.12  1.66

From the above, it is obvious that the device /dev/cciss/c0d0p3 has a long wait time. However, we never mounted that device directly; it is actually an LVM physical volume. If you use LVM for storage, you may find iostat a bit confusing: LVM uses the device mapper subsystem to map file systems to physical devices, so iostat may display devices such as /dev/dm-0 and /dev/dm-1, while the output of "df -h" shows the LVM paths rather than the device mapper paths. The easiest fix is to add the "-N" option to iostat:

[username@server~]$ iostat -xN 1
Device:    rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
vg1-root     0.00   0.00 0.09 3.01   0.85  24.08     8.05     0.08 24.69  1.79  0.55
vg1-home     0.00   0.00 0.05 1.46   0.97  11.69     8.36     0.03 19.89  3.76  0.57
vg1-opt      0.00   0.00 0.03 1.56   0.46  12.48     8.12     0.05 29.89  3.53  0.56
vg1-tmp      0.00   0.00 0.00 0.06   0.00   0.45     8.00     0.00 24.85  4.90  0.03
vg1-usr      0.00   0.00 0.63 1.41   5.85  11.28     8.38     0.07 32.48  3.11  0.63
vg1-var      0.00   0.00 0.55 1.19   9.21   9.54    10.74     0.04 24.10  4.24  0.74
vg1-swaplv   0.00   0.00 0.00 0.00   0.00   0.00     8.00     0.00  3.98  1.88  0.00
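A quick way to spot the worst offender in output like the above is to sort the rows by the await column. A minimal sketch, assuming the 12-token "iostat -x" row layout shown here (the sample embeds two lines of the output above):

```python
def worst_await(iostat_output: str):
    """Return (device, await) pairs sorted highest-await first.
    Assumes the 'iostat -x' layout above: device name plus 11 stats,
    with await as the 10th statistics column."""
    rows = []
    for line in iostat_output.strip().splitlines():
        parts = line.split()
        if len(parts) == 12 and parts[0] != "Device:":
            rows.append((parts[0], float(parts[9])))
    return sorted(rows, key=lambda r: r[1], reverse=True)

sample = """\
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
vg1-usr 0.00 0.00 0.63 1.41 5.85 11.28 8.38 0.07 32.48 3.11 0.63
vg1-opt 0.00 0.00 0.03 1.56 0.46 12.48 8.12 0.05 29.89 3.53 0.56
"""
print(worst_await(sample)[0])  # -> ('vg1-usr', 32.48)
```

Note that newer sysstat releases change the column set, so the column index would need adjusting there.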

(The iostat output above has been trimmed for simplicity.) Every file system listed shows a worryingly high IO wait; look at the "await" values in column 10. The /usr file system shows a somewhat higher await time, but let's start by analyzing /opt: use "fuser -vm /opt" to see which processes are accessing that file system. The process list follows.

root@server:/root > fuser -vm /opt
                     USER     PID  ACCESS COMMAND
/opt:                db2fenc1 1067 ....m  db2fmp
                     db2fenc1 1071 ....m  db2fmp
                     db2fenc1 2560 ....m  db2fmp
                     db2fenc1 5221 ....m  db2fmp

There are currently 112 DB2 processes accessing the /opt file system on this server; four are listed above for brevity. It appears the cause of the problem has been found: the database on this server is configured to use the faster SAN storage, leaving the local disks to the operating system, yet these DB2 processes are hammering the local /opt file system. You can call the DBA (database administrator) and ask why it is configured this way.

The last piece to note is LVM and the device mapper. The output of "iostat -xN" shows logical volume names directly, but the mapping can also be looked up with "ls -lrt /dev/mapper": the minor device number in the sixth column of that listing corresponds to the dm- device name shown by iostat.
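On modern distributions the /dev/mapper entries are symlinks of the form "vg1-usr -> ../dm-4" rather than device nodes, so the mapping can be read straight from the link targets. A minimal sketch assuming that symlink-style listing (the sample lines are hypothetical):

```python
def dm_to_lv(ls_output: str) -> dict:
    """Build a dm-N -> logical-volume-name map from symlink-style
    'ls -lrt /dev/mapper' output ('vg1-usr -> ../dm-4')."""
    mapping = {}
    for line in ls_output.splitlines():
        if " -> " in line:
            left, right = line.split(" -> ")
            lv = left.split()[-1]          # e.g. vg1-usr
            dm = right.rsplit("/", 1)[-1]  # e.g. dm-4
            mapping[dm] = lv
    return mapping

# Hypothetical 'ls -lrt /dev/mapper' lines:
sample = """\
lrwxrwxrwx 1 root root 7 Jan 17 10:00 vg1-root -> ../dm-0
lrwxrwxrwx 1 root root 7 Jan 17 10:00 vg1-usr -> ../dm-4
"""
print(dm_to_lv(sample)["dm-4"])  # -> vg1-usr
```

With this map in hand, a dm-N name from plain "iostat -x" output can be translated back to the logical volume it represents.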

Sometimes there is nothing more to be done at the OS or application level, and the only option left is faster disks. Fortunately, the price of fast storage such as SAN or SSD is steadily dropping.

That is the whole of this example analysis of tracking high IO wait on a CentOS system. Thank you for reading, and I hope it helped resolve your doubts.
