
Example Analysis of iostat in linux

2025-07-12 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article walks through an example analysis of iostat on Linux. It is fairly detailed and should be a useful reference for anyone who reads iostat output regularly.

iostat is mainly used to report central processing unit (CPU) statistics and input/output statistics for the whole system, adapters, tty devices, disks and CD-ROMs. The rest of this article explains the parts of iostat that are easy to misread on Linux.

iostat(1) is the most basic tool for checking I/O performance on Linux systems, but it is easy to misread for people who are familiar with other UNIX systems. For example, on HP-UX, avserv (the counterpart of svctm on Linux) is the most important I/O metric because it reflects the performance of the hard disk device. It is the time from when an I/O request is issued by the SCSI layer until the request completes and returns to the SCSI layer, excluding the time spent waiting in the SCSI queue. avserv therefore reflects how fast the hard disk device processes I/O, and is also known as disk service time; if avserv is very large, there must be a hardware problem. The meaning of svctm on Linux is very different. In fact, the man pages of both iostat(1) and sar(1) say not to trust svctm, and that the field will be removed:

"Warning! Do not trust this field any more. This field will be removed in a future sysstat version."

On Linux, the average elapsed time per I/O is reported as await, but await does not reflect the performance of the hard disk device, because it includes not only the time the device spends processing the I/O but also the time the request waits in the queue. While an I/O request is in the queue it has not yet been sent to the hard disk device, so the queue waiting time is not consumed by the device. await therefore cannot reflect the speed of the hard disk device, and kernel-side issues such as the I/O scheduler can also make await larger. So is there any indicator that measures the performance of the hard disk device itself? Unfortunately, neither iostat(1) nor sar(1) provides one, because /proc/diskstats, which they rely on, does not contain this data. To really understand the output of iostat, start by understanding /proc/diskstats.

# cat /proc/diskstats
   8    0 sda 239219 1806 37281259 2513275 904326 88832 50268824 26816609 0 4753060 29329105
   8    1 sda1 338 0 53241 6959 154 0 5496 3724 0 6337 10683
   8    2 sda2 238695 1797 37226458 2504489 620322 88832 50263328 25266599 0 3297988 27770221
   8   16 sdb 1009117 481 1011773 127319 0 0 0 0 0 126604 126604
   8   17 sdb1 1008792 480 1010929 127078 0 0 0 0 0 126363 126363
 253    0 dm-0 1005 0 8040 15137 30146 0 241168 2490230 0 30911 2505369
 253    1 dm-1 192791 0 35500457 2376087 359162 0 44095600 22949466 0 2312433 25325563
 253    2 dm-2 47132 0 1717329 183565 496207 0 5926560 7348763 0 2517753 7532688

/proc/diskstats has 11 fields after the major number, minor number and device name. The kernel documentation at https://www.kernel.org/doc/Documentation/iostats.txt explains what they mean; they are restated below. Note that, except for field #9, all of them are cumulative values counted since the system booted:

Field #1 (rd_ios): the number of read operations.
Field #2 (rd_merges): the number of merged read operations. If two read operations read adjacent blocks of data, they can be merged into one to improve efficiency. Merging is usually the responsibility of the I/O scheduler (also known as the elevator).
Field #3 (rd_sectors): the number of sectors read.
Field #4 (rd_ticks): the time in milliseconds consumed by read operations. Each read is timed from __make_request() to end_that_request_last(), including the time it spends waiting in the queue.
Field #5 (wr_ios): the number of write operations.
Field #6 (wr_merges): the number of merged write operations.
Field #7 (wr_sectors): the number of sectors written.
Field #8 (wr_ticks): the time in milliseconds consumed by write operations.
Field #9 (in_flight): the number of I/Os currently unfinished. This value is incremented by 1 when an I/O request enters the queue and decremented by 1 when the I/O request ends. Note: the increment happens when the I/O request is queued, not when it is submitted to the hard disk device.
Field #10 (io_ticks): the wall-clock time the device has spent processing I/O. Note the difference between io_ticks and rd_ticks (field #4) and wr_ticks (field #8). rd_ticks and wr_ticks add up the time consumed by each individual I/O; because hard disk devices can usually process multiple I/Os in parallel, rd_ticks and wr_ticks tend to be larger than wall-clock time. io_ticks, on the other hand, is the time during which the device had any I/O at all (that is, was not idle); it ignores how many I/Os there were and only considers whether there was at least one. In the actual calculation, io_ticks keeps counting while field #9 (in_flight) is non-zero and stops counting when field #9 (in_flight) is zero.
Field #11 (time_in_queue): a weighted version of field #10 (io_ticks). Field #10 (io_ticks) is wall-clock time regardless of how many I/Os are in flight, whereas time_in_queue is wall-clock time multiplied by the current number of I/Os (that is, field #9, in_flight). Although this field is named time_in_queue, it is not just the time spent in the queue; it also includes the time the hard disk spent processing the I/O. iostat uses this field when calculating avgqu-sz.
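As a rough illustration of the field layout described above, here is a minimal Python sketch that splits one /proc/diskstats line into named counters. The names DiskStats, parse_diskstats_line and read_diskstats are hypothetical helpers introduced only for this example; the field names are the labels used in this article. Newer kernels append further columns, which this sketch simply ignores.

```python
from collections import namedtuple

# The 11 counters described in this article, in file order. Each line of
# /proc/diskstats starts with major number, minor number and device name.
FIELDS = [
    "rd_ios", "rd_merges", "rd_sectors", "rd_ticks",
    "wr_ios", "wr_merges", "wr_sectors", "wr_ticks",
    "in_flight", "io_ticks", "time_in_queue",
]

DiskStats = namedtuple("DiskStats", ["major", "minor", "name"] + FIELDS)


def parse_diskstats_line(line):
    """Parse one /proc/diskstats line, keeping only the first 11 counters."""
    parts = line.split()
    major, minor, name = int(parts[0]), int(parts[1]), parts[2]
    counters = [int(x) for x in parts[3:3 + len(FIELDS)]]
    return DiskStats(major, minor, name, *counters)


def read_diskstats(path="/proc/diskstats"):
    """Return a {device_name: DiskStats} snapshot of the whole file."""
    with open(path) as f:
        stats = [parse_diskstats_line(line) for line in f if line.strip()]
    return {s.name: s for s in stats}
```

Calling read_diskstats() twice, a known interval apart, gives the two samples that the iostat formulas operate on.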

iostat(1) computes its output from /proc/diskstats. Because /proc/diskstats does not separate queue waiting time from hard disk processing time, no tool based on it can report disk service time and queue-related values separately.

Note: in the following formulas, "Δ" denotes the difference between two samples and "Δt" denotes the sampling period.

r/s: number of read operations per second = Δrd_ios / Δt
w/s: number of write operations per second = Δwr_ios / Δt
tps: number of I/O operations per second = (Δrd_ios + Δwr_ios) / Δt
rkB/s: kilobytes read per second = (Δrd_sectors / Δt) * (512 / 1024)
wkB/s: kilobytes written per second = (Δwr_sectors / Δt) * (512 / 1024)
rrqm/s: number of merged read operations per second = Δrd_merges / Δt
wrqm/s: number of merged write operations per second = Δwr_merges / Δt
avgrq-sz: average number of sectors per I/O = (Δrd_sectors + Δwr_sectors) / (Δrd_ios + Δwr_ios)
avgqu-sz: average number of I/O requests in the queue = Δtime_in_queue / Δt. It is more accurate to read this as the average number of outstanding I/O requests.
await: average time required for each I/O = (Δrd_ticks + Δwr_ticks) / (Δrd_ios + Δwr_ios). It includes not only the time the hard disk device spends processing the I/O, but also the time spent waiting in the kernel queue.
r_await: average time required for each read operation = Δrd_ticks / Δrd_ios. It includes not only the time the hard disk device spends on the read, but also the time spent waiting in the kernel queue.
w_await: average time required for each write operation = Δwr_ticks / Δwr_ios. It includes not only the time the hard disk device spends on the write, but also the time spent waiting in the kernel queue.
%util: busy ratio of the hard disk device = Δio_ticks / Δt. It is the proportion of time during which the device had I/O (that is, was not idle); it ignores how many I/Os there were and only considers whether there was at least one.
svctm: deprecated and meaningless; svctm = util / tput
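The formulas above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the sysstat implementation: iostat_metrics is a hypothetical helper, each sample is assumed to be a dict keyed by the field names used in this article, sectors are taken as 512 bytes as in the formulas, and because the tick counters are in milliseconds the time-ratio metrics divide by the interval expressed in milliseconds.

```python
def iostat_metrics(prev, curr, interval_s):
    """Derive iostat-style metrics from two /proc/diskstats samples.

    prev, curr: dicts keyed by the field names used in this article.
    interval_s: sampling period in seconds.
    """
    # in_flight is an instantaneous value, not a cumulative counter.
    d = {k: curr[k] - prev[k] for k in curr if k != "in_flight"}
    ios = d["rd_ios"] + d["wr_ios"]
    interval_ms = interval_s * 1000.0

    def ratio(num, den):
        # Avoid dividing by zero when no I/O completed in the interval.
        return num / den if den else 0.0

    return {
        "r/s": d["rd_ios"] / interval_s,
        "w/s": d["wr_ios"] / interval_s,
        "tps": ios / interval_s,
        "rkB/s": d["rd_sectors"] / interval_s * 512 / 1024,
        "wkB/s": d["wr_sectors"] / interval_s * 512 / 1024,
        "rrqm/s": d["rd_merges"] / interval_s,
        "wrqm/s": d["wr_merges"] / interval_s,
        "avgrq-sz": ratio(d["rd_sectors"] + d["wr_sectors"], ios),
        "avgqu-sz": d["time_in_queue"] / interval_ms,
        "await": ratio(d["rd_ticks"] + d["wr_ticks"], ios),
        "r_await": ratio(d["rd_ticks"], d["rd_ios"]),
        "w_await": ratio(d["wr_ticks"], d["wr_ios"]),
        "%util": 100.0 * d["io_ticks"] / interval_ms,
    }
```

Note that every value is a ratio of deltas, so a single sample of /proc/diskstats only yields averages since boot.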

Interpreting iostat(1) correctly helps in analyzing problems correctly. The following sections discuss this further using real cases.

About rrqm/s and wrqm/s

As mentioned earlier, if two I/O operations touch adjacent data blocks, they can be merged into one to improve efficiency, and merging is usually the responsibility of the I/O scheduler (also known as the elevator).

The following example runs the same stress test on many hard disk devices. Only sdb is faster than the others, even though all the drives are the same model. Why does sdb behave differently?

You can see that rrqm/s is 0 for the other hard drives but not for sdb. Because its I/Os are being merged, sdb is more efficient and its rMB/s is higher. We know that I/O merging is the responsibility of the kernel's I/O scheduler (elevator), so we checked /sys/block/sdb/queue/scheduler and found that sdb uses a different I/O scheduler from the other hard drives, which is why its performance differs.
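The scheduler check described above can be scripted. The sketch below assumes the usual format of the queue/scheduler file, where all available schedulers are listed on one line and the active one is wrapped in square brackets (for example "noop deadline [cfq]"). The helper names parse_scheduler and active_scheduler are hypothetical.

```python
def parse_scheduler(text):
    """Split the content of a queue/scheduler file.

    Returns (active, available); the active scheduler is the bracketed one.
    """
    active = None
    available = []
    for name in text.split():
        if name.startswith("[") and name.endswith("]"):
            name = name[1:-1]
            active = name
        available.append(name)
    return active, available


def active_scheduler(device):
    """Return the active I/O scheduler for a block device such as 'sdb'."""
    with open(f"/sys/block/{device}/queue/scheduler") as f:
        return parse_scheduler(f.read())[0]
```

Comparing active_scheduler("sdb") with the result for the other drives would show the kind of mismatch found in this case.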

%util and hard disk device saturation

%util is the proportion of time during which the device had I/O (that is, was not idle); it ignores how many I/Os there were and only considers whether there was at least one. Because modern hard disk devices can process multiple I/O requests in parallel, %util reaching 100% does not mean the device is saturated. Take a simplified example: a hard disk needs 0.1 seconds to process a single I/O and can process 10 I/O requests at the same time. If 10 I/O requests are submitted one after another, they take 1 second to complete, and %util reaches 100% over a 1-second sampling period. If the 10 I/O requests are submitted all at once, they all complete within 0.1 seconds, and %util is only 10% over the same 1-second sampling period. So even when %util is as high as 100%, the hard drive may still have capacity to handle more I/O requests; that is, it is not saturated. Is there any indicator in iostat(1) that measures the saturation of a hard disk device? Unfortunately, no.
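The simplified example above can be reproduced with a few lines of Python. This is only a toy model: each I/O is represented as a (start, end) interval in milliseconds, busy time is the wall-clock time covered by at least one interval (the io_ticks idea), and the per-I/O sum mimics how rd_ticks and wr_ticks accumulate. The helpers busy_time_ms and util_percent are hypothetical.

```python
def busy_time_ms(intervals):
    """Wall-clock time covered by at least one (start, end) interval."""
    busy = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # A gap: close the previous busy stretch and open a new one.
            if cur_end is not None:
                busy += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent I/O extends the current busy stretch.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        busy += cur_end - cur_start
    return busy


def util_percent(intervals, period_ms):
    return 100.0 * busy_time_ms(intervals) / period_ms


SERVICE_MS = 100  # one I/O takes 0.1 s
sequential = [(i * SERVICE_MS, (i + 1) * SERVICE_MS) for i in range(10)]
parallel = [(0, SERVICE_MS)] * 10

for label, ios in (("sequential", sequential), ("parallel", parallel)):
    per_io_sum = sum(end - start for start, end in ios)
    print(label, "util:", util_percent(ios, 1000), "% per-I/O sum:", per_io_sum, "ms")
```

Both workloads complete the same 10 I/Os and accumulate the same per-I/O time, yet the busy time differs by a factor of ten, which is why %util alone says nothing about remaining capacity.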

How large should await be?

await is the time consumed by a single I/O, including the time the hard disk device takes to process it and the time the request waits in the kernel queue. Normally the queue waiting time is negligible, so for now treat await as an indicator of hard disk speed. What value is normal?

For SSDs, from 0.0x milliseconds to 1.x milliseconds; see the product manual.

For mechanical hard drives, see the calculation method in the following document:

http://101.96.10.61/cseweb.ucsd.edu/classes/wi01/cse102/sol2.pdf

Roughly speaking, for a 10,000 RPM mechanical hard disk the figure is about 8.38 milliseconds, including seek time, rotational delay, and transfer time.
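To make the three components concrete, here is a small sketch of that kind of estimate. Only the rotational part follows directly from the spindle speed (on average the platter has to turn half a revolution). The seek and transfer values passed in the usage line are hypothetical placeholders, not figures from the referenced document, so the result is not meant to reproduce the 8.38 ms number.

```python
def avg_rotational_delay_ms(rpm):
    """Average rotational delay: half of one full revolution, in ms."""
    return 0.5 * 60_000.0 / rpm


def estimated_service_time_ms(rpm, avg_seek_ms, transfer_ms):
    """Seek time + average rotational delay + transfer time, in ms."""
    return avg_seek_ms + avg_rotational_delay_ms(rpm) + transfer_ms


# Hypothetical seek and transfer values, for illustration only.
print(avg_rotational_delay_ms(10_000))
print(estimated_service_time_ms(10_000, avg_seek_ms=5.0, transfer_ms=0.5))
```

For real numbers, take the average seek time and transfer rate from the drive's data sheet.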

In practice, whether await is normal has to be judged against the application scenario. If the I/O pattern is very random and the load is relatively high, the disk head moves back and forth and seek time becomes long, so the expected await should be larger. If the I/O pattern is sequential reads and writes and only a single process generates the load, seek time and rotational delay are negligible and transfer time dominates, so await should be very small, even less than 1 millisecond. In the following example, the await is 7.50 ms, which seems small, but this is a dd test, that is, a sequential read with only a single task on the hard disk, so the await here should be less than 1 millisecond:

Device:  rrqm/s  wrqm/s     r/s     w/s   rsec/s   wsec/s  avgrq-sz  avgqu-sz  await  svctm  %util
sdg        0.00    0.00  133.00    0.00  2128.00     0.00     16.00      1.00   7.50   7.49  99.60
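The numbers in this sample can be cross-checked against the formulas given earlier. The sketch below assumes that 133.00 is r/s and 2128.00 is rsec/s, which matches the read-only dd test described in the text. It also relies on time_in_queue growing at roughly the same rate as the summed per-I/O time, so that avgqu-sz is approximately the I/O rate multiplied by await; that is an approximation, and check_sample is a hypothetical helper.

```python
def check_sample(r_per_s, rsec_per_s, await_ms, svctm_ms):
    """Recompute derived iostat columns from the raw rates of one sample."""
    avgrq_sz = rsec_per_s / r_per_s              # sectors per request
    return {
        "avgrq-sz": avgrq_sz,
        "kB per request": avgrq_sz * 512 / 1024,
        # I/Os per second times ms per I/O, divided by 1000 ms per second.
        "avgqu-sz": r_per_s * await_ms / 1000.0,
        # svctm = util / tput, so util is tput times svctm.
        "%util": r_per_s * svctm_ms / 1000.0 * 100.0,
    }


print(check_sample(r_per_s=133.0, rsec_per_s=2128.0, await_ms=7.50, svctm_ms=7.49))
```

The recomputed values land close to the reported 16.00, 1.00 and 99.60. With about one request outstanding on average, hardly any of the 7.50 ms is queue waiting, which is why the text treats this await as a sign of a slow device for a sequential read.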

For disk arrays, the hardware cache lets a write complete without waiting for the data to reach the disks, so the service time of writes is greatly reduced. If write operations on a disk array do not complete within one or two milliseconds, the array should be considered slow. Reads are a different matter: data that is not in the cache still has to be read from the physical disks, so the read latency for a single small block is about the same as for a single disk.
