Using iostat to check hard disk I/O performance on Linux

2025-07-02 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains how to use iostat to view hard disk I/O performance on Linux. The material is straightforward, and it should be useful to anyone who needs to diagnose disk load.

Start with top: watch the percentage of CPU time spent waiting for I/O. When it is higher than 30%, I/O pressure is high, and the next step is to run iostat -x 1 10.
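The I/O wait percentage that top shows can be approximated from two samples of the aggregate cpu line in /proc/stat. The following is a minimal sketch of that calculation, not a replacement for top: the function names and the two sample lines are made up for illustration, and the 30% threshold is simply the rule of thumb quoted above.

```python
def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before, after):
    """Percentage of CPU time spent in iowait between two samples."""
    b = parse_cpu_line(before)
    a = parse_cpu_line(after)
    # Field order: user nice system idle iowait irq softirq steal.
    # guest and guest_nice are skipped: the kernel already counts them
    # inside user and nice.
    deltas = [y - x for x, y in zip(b[:8], a[:8])]
    total = sum(deltas)
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is iowait


# Hypothetical samples taken a short time apart (not real measurements).
SAMPLE_BEFORE = "cpu  1000 0 500 8000 400 0 100 0 0 0"
SAMPLE_AFTER = "cpu  1100 0 550 8300 900 0 150 0 0 0"

if __name__ == "__main__":
    wa = iowait_percent(SAMPLE_BEFORE, SAMPLE_AFTER)
    status = "high I/O pressure" if wa > 30 else "ok"
    print(f"iowait: {wa:.1f}% ({status})")
```

On a real system the two lines would be read from /proc/stat a second or so apart.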

[root@controller]# iostat -d -k 1 10
Device:    tps   kB_read/s   kB_wrtn/s   kB_read   kB_wrtn
sda      19.00        0.00      112.00         0       112
sda1      0.00        0.00        0.00         0         0
sda2      0.00        0.00        0.00         0         0
sda3      0.00        0.00        0.00         0         0
sda4      0.00        0.00        0.00         0         0
sda5      3.00        0.00       16.00         0        16
sda6      0.00        0.00        0.00         0         0
sda7     16.00        0.00       96.00         0        96

tps: the number of transfers per second issued to the device. One transfer means one I/O request.

kB_read/s: amount of data read from the device per second
kB_wrtn/s: amount of data written to the device per second
kB_read: total amount of data read
kB_wrtn: total amount of data written
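When this report has to be checked by a script rather than by eye, the rows can be turned into a dictionary. The sketch below assumes the five-column layout shown above (newer sysstat versions may print additional columns); the function name is hypothetical and the sample rows are copied from the output above.

```python
def parse_iostat_dk(text):
    """Parse one `iostat -d -k` report into {device: {column: value}}.

    Assumes the five-column layout: tps kB_read/s kB_wrtn/s kB_read kB_wrtn.
    """
    columns = ["tps", "kB_read/s", "kB_wrtn/s", "kB_read", "kB_wrtn"]
    report = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only data rows: a device name followed by five numbers.
        if len(parts) != 6 or parts[0].startswith("Device"):
            continue
        try:
            values = [float(p) for p in parts[1:]]
        except ValueError:
            continue  # not a data row (for example a banner line)
        report[parts[0]] = dict(zip(columns, values))
    return report


SAMPLE = """Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn
sda 19.00 0.00 112.00 0 112
sda5 3.00 0.00 16.00 0 16
sda7 16.00 0.00 96.00 0 96
"""

if __name__ == "__main__":
    for device, stats in parse_iostat_dk(SAMPLE).items():
        print(device, stats["tps"], stats["kB_wrtn/s"])
```

In the sample, the whole-disk row is the sum of its partitions: the write rates of sda5 and sda7 add up to the 112 kB/s reported for sda.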

Use -x for more information

View device utilization (%util) and response time (await):

[root@controller]# iostat -d -x -k 1 10
Device:  rrqm/s  wrqm/s    r/s    w/s  rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
sda        0.00   22.00   0.00  18.00   0.00  160.00     17.78      0.07   3.78   3.78   6.80
sda1       0.00    0.00   0.00   0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda2       0.00    0.00   0.00   0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda3       0.00   15.00   0.00   2.00   0.00   68.00     68.00      0.01   6.50   6.50   1.30
sda4       0.00    0.00   0.00   0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda5       0.00    0.00   0.00   0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda6       0.00    0.00   0.00   0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00
sda7       0.00    7.00   0.00  16.00   0.00   92.00     11.50      0.06   3.44   3.44   5.50

rrqm/s: the number of read requests merged per second. That is, delta(rmerge)/s
wrqm/s: the number of write requests merged per second. That is, delta(wmerge)/s
r/s: the number of reads completed by the I/O device per second. That is, delta(rio)/s
w/s: the number of writes completed by the I/O device per second. That is, delta(wio)/s
rsec/s: the number of sectors read per second. That is, delta(rsect)/s
wsec/s: the number of sectors written per second. That is, delta(wsect)/s
rkB/s: kilobytes read per second. This is half of rsec/s, because each sector is 512 bytes. (Computed value.)
wkB/s: kilobytes written per second. This is half of wsec/s. (Computed value.)
avgrq-sz: the average data size (in sectors) per device I/O operation. That is, delta(rsect+wsect)/delta(rio+wio)
avgqu-sz: the average length of the I/O queue. That is, delta(aveq)/s/1000 (because aveq is in milliseconds)
await: the average wait time (in milliseconds) for each device I/O operation. That is, delta(ruse+wuse)/delta(rio+wio)
svctm: the average service time (in milliseconds) for each device I/O operation. That is, delta(use)/delta(rio+wio)
%util: the percentage of each second spent on I/O operations, or in other words how much of each second the I/O queue is not empty. That is, delta(use)/s/1000 (because use is in milliseconds)
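The formulas above can be checked with a few lines of code. The sketch below applies them to two snapshots of cumulative counters; the counter names (rio, wio, rsect, wsect, ruse, wuse, use, aveq) follow the formulas, while the DiskCounters class, the extended_stats function and the sample numbers are hypothetical and chosen only so that the result resembles the sda7 row. The merge counters are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class DiskCounters:
    """Cumulative per-device counters, named as in the formulas above."""
    rio: int    # reads completed
    wio: int    # writes completed
    rsect: int  # sectors read
    wsect: int  # sectors written
    ruse: int   # milliseconds spent on reads
    wuse: int   # milliseconds spent on writes
    use: int    # milliseconds during which the device was busy
    aveq: int   # weighted milliseconds spent doing I/O


def extended_stats(prev, curr, interval_s):
    """Derive iostat -x style metrics from two counter snapshots."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")

    def delta(name):
        return getattr(curr, name) - getattr(prev, name)

    ios = delta("rio") + delta("wio")
    return {
        "r/s": delta("rio") / interval_s,
        "w/s": delta("wio") / interval_s,
        "rkB/s": delta("rsect") / 2 / interval_s,  # 512-byte sectors
        "wkB/s": delta("wsect") / 2 / interval_s,
        "avgrq-sz": (delta("rsect") + delta("wsect")) / ios if ios else 0.0,
        "avgqu-sz": delta("aveq") / interval_s / 1000,
        "await": (delta("ruse") + delta("wuse")) / ios if ios else 0.0,
        "svctm": delta("use") / ios if ios else 0.0,
        "%util": delta("use") / interval_s / 1000 * 100,
    }


if __name__ == "__main__":
    # Hypothetical counters: 16 writes of 184 sectors in total within 1 second.
    before = DiskCounters(0, 0, 0, 0, 0, 0, 0, 0)
    after = DiskCounters(rio=0, wio=16, rsect=0, wsect=184,
                         ruse=0, wuse=55, use=55, aveq=55)
    for name, value in extended_stats(before, after, 1.0).items():
        print(f"{name:9s} {value:8.2f}")
```

The zero-division guards matter for idle partitions, where no I/O completes during the interval and the per-request averages are undefined.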

If %util is close to 100%, too many I/O requests are being generated, the I/O system is fully loaded, and the disk may be a bottleneck.

When %idle is less than 70%, I/O pressure is relatively high, and reads generally spend a noticeable amount of time in wait.

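At the same time, you can use vmstat to check the b column (the number of processes in uninterruptible sleep, typically blocked on I/O) and the wa column (the percentage of CPU time spent waiting for I/O).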

In addition, you can also refer to the following.

svctm is generally less than await (because the waiting time of requests that wait at the same time is counted repeatedly). The value of svctm mostly depends on disk performance, although CPU and memory load also affect it, and too many requests will indirectly increase svctm. The value of await generally depends on the service time (svctm), the length of the I/O queue, and the pattern in which I/O requests are issued. If svctm is close to await, I/O requests spend almost no time waiting. If await is much larger than svctm, the I/O queue is too long and the application's response time gets slower. If the response time exceeds what users can tolerate, consider switching to faster disks, tuning the kernel elevator algorithm, optimizing the application, or upgrading the CPU.
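Since await covers both queueing and service, the comparison described above comes down to a subtraction. The following is a small sketch with a hypothetical helper name; it does not apply any threshold, because the text gives none.

```python
def queue_wait(await_ms, svctm_ms):
    """Split await into time spent queued and the share of await that is queueing."""
    queued_ms = max(await_ms - svctm_ms, 0.0)
    share = queued_ms / await_ms if await_ms > 0 else 0.0
    return queued_ms, share


if __name__ == "__main__":
    # Two illustrative cases: svctm equal to await, and await far above svctm.
    for await_ms, svctm_ms in [(3.78, 3.78), (78.21, 5.00)]:
        queued_ms, share = queue_wait(await_ms, svctm_ms)
        print(f"await={await_ms} svctm={svctm_ms} "
              f"queued={queued_ms:.2f} ms ({share:.0%} of await)")
```

A share near zero means requests are served almost as soon as they arrive; a share near one means most of the response time is spent in the queue.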

Queue length (avgqu-sz) can also be used as an indicator of the system's I/O load, but because avgqu-sz is an average over the sampling interval, it cannot reflect an instantaneous flood of I/O.

Here is a good example written by someone else (the I/O system vs. a supermarket queue).

For example, when we line up to pay at the supermarket, how do we decide which checkout to go to? The first thing to look at is the number of people in the queue. Five people are faster than 20, right? Besides counting heads, we often look at how much the people in front of us have bought. If there is an aunt who has bought a week's worth of groceries, we may consider changing queues. Then there is the speed of the cashier. If you get a novice who cannot even count the money properly, you will have to wait. Timing also matters. A checkout that was packed five minutes ago may be empty now, and paying at that moment is very pleasant, provided that what you did in the past five minutes was more meaningful than waiting in line (but I haven't found anything more boring than waiting in line).

The I/O system has many similarities with supermarket queues:

r/s + w/s is similar to the total number of people paying
average queue length (avgqu-sz) is similar to the average number of people queuing per unit time
average service time (svctm) is similar to the cashier's checkout speed
average waiting time (await) is similar to the average waiting time per person
average I/O data size (avgrq-sz) is similar to the average number of items bought per person
I/O utilization (%util) is similar to the percentage of time during which someone is queuing at the checkout

Based on these data, we can analyze the pattern of I/O requests, as well as the speed and response time of I/O.

%util: the time spent processing I/O during the sampling interval divided by the length of the interval. For example, if the interval is 1 second and the device spends 0.8 seconds processing I/O and 0.2 seconds idle, then %util = 0.8/1 = 80%, so this parameter indicates how busy the device is. Generally speaking, a value of 100% means the device is running at close to full capacity (although with multiple disks, even at 100% %util, disk usage may not be a bottleneck because the disks work concurrently).

When deploying a program (I was testing a program that uploads logs in real time), consider the system's CPU, memory, I/O and so on to make sure the system runs efficiently.

If the packets the program handles are very small but there are a great many events, arriving under high pressure with no interval between them, the program will take up a lot of CPU.

If you use a disk cache instead of a memory cache, you can support resumable transfers and guarantee reliable data upload: after a sudden power outage, for example, the data stored in the disk cache is still uploaded after recovery and is not lost. However, this also increases the number of disk reads and writes. The speed is tolerable if the amount of data is small.

The following is an analysis of iostat output written by someone else:

# iostat -x 1
avg-cpu:  %user   %nice    %sys   %idle
          16.24    0.00    4.31   79.44
Device:             rrqm/s  wrqm/s   r/s    w/s  rsec/s  wsec/s  rkB/s   wkB/s  avgrq-sz  avgqu-sz  await  svctm  %util
/dev/cciss/c0d0       0.00   44.90  1.02  27.55    8.16  579.59   4.08  289.80     20.57     22.35  78.21   5.00  14.29
/dev/cciss/c0d0p1     0.00   44.90  1.02  27.55    8.16  579.59   4.08  289.80     20.57     22.35  78.21   5.00  14.29
/dev/cciss/c0d0p2     0.00    0.00  0.00   0.00    0.00    0.00   0.00    0.00      0.00      0.00   0.00   0.00   0.00

The iostat output above shows 28.57 device I/O operations per second: total I/O (io)/s = r/s (reads) + w/s (writes) = 1.02 + 27.55 = 28.57 (per second), with writes making up the bulk (w:r = 27:1).

On average, each device I/O operation takes only 5 ms to complete, but each I/O request has to wait 78 ms. Why? Because too many I/O requests are issued (about 29 per second). Assuming these requests are issued at the same time, the average wait time can be calculated as follows:

Average waiting time = single I/O service time * (1 + 2 + ... + (total number of requests - 1)) / total number of requests

Applied to the example above: average waiting time = 5 ms * (1 + 2 + ... + 28) / 29 = 70 ms, which is very close to the 78 ms average wait time reported by iostat. This in turn suggests that the I/O requests were issued at the same time.
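The arithmetic is easy to reproduce. The sketch below implements the formula under the same assumption as the text, namely that all requests arrive at the same instant and are served one at a time; the function name is hypothetical.

```python
def average_wait_ms(service_ms, n_requests):
    """Average queueing delay when n requests arrive at once and are served serially."""
    if n_requests <= 0:
        raise ValueError("n_requests must be positive")
    # The k-th request (counting from 0) waits for the k requests ahead of it,
    # so the total wait is service_ms * (0 + 1 + ... + (n - 1)).
    total_wait = service_ms * sum(range(n_requests))
    return total_wait / n_requests


if __name__ == "__main__":
    print(average_wait_ms(5, 29))  # 70.0
```

With a single request the result is 0, since nothing is queued ahead of it.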

There are many I/O requests per second (about 29), but the average queue is not long (only about 2), which indicates that the 29 requests arrive unevenly and that the I/O system is idle most of the time.

For 14.29% of each second there are requests in the I/O queue. In other words, the I/O system has nothing to do for 85.71% of the time, and all 29 I/O requests are processed within 142 milliseconds.

delta(ruse+wuse)/delta(io) = await = 78.21 => delta(ruse+wuse)/s = 78.21 * delta(io)/s = 78.21 * 28.57 = 2234.5, which means the I/O requests issued in each second wait for a total of about 2234.5 ms. So the average queue length should be 2234.5 ms / 1000 ms = 2.23, while the average queue length (avgqu-sz) reported by iostat is 22.35. Why? Because of a bug in iostat: the avgqu-sz value should be 2.23, not 22.35.
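The same consistency check can be applied to any row of iostat -x output: the average queue length should be roughly await multiplied by the number of I/O operations per second, divided by 1000. The following is a minimal sketch with a hypothetical function name; the inputs are the values from the output being analyzed.

```python
def expected_queue_length(await_ms, reads_per_s, writes_per_s):
    """Average queue length implied by await and the I/O rate."""
    iops = reads_per_s + writes_per_s
    # Total milliseconds of waiting accumulated per second, divided by
    # the 1000 ms in that second.
    return await_ms * iops / 1000.0


if __name__ == "__main__":
    implied = expected_queue_length(78.21, 1.02, 27.55)
    reported = 22.35  # avgqu-sz column of the same row
    print(f"implied avgqu-sz:  {implied:.2f}")
    print(f"reported avgqu-sz: {reported:.2f}")
    print(f"ratio: {reported / implied:.1f}")
```

The implied value is about 2.23, so the reported figure is roughly ten times too large, which is what the analysis attributes to an iostat bug. The check also agrees with the sda7 row earlier in the article: 3.44 ms * 16 writes/s / 1000 is about 0.06, matching its avgqu-sz.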
