How to calculate CPU utilization

2025-03-29 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/03 Report--

This article looks at how CPU utilization is calculated and why the familiar figure can mislead. It is offered as a reference for interested readers.

The CPU utilization metric we all use is deeply misleading, and it gets worse every year. So what is CPU utilization? Is it simply how busy your CPU is, as ubiquitous metrics like "%CPU" suggest?

When top shows 90% CPU utilization, you might picture a CPU that is busy for 90% of the time and idle for the rest.

But what it may really mean is a CPU that is busy for only part of that 90% and stalled for the remainder.

Stalled means that the processor is not making forward progress with instructions, usually because it is waiting on memory I/O. A split like this between busy and stalled is what I encounter in real production environments, and your CPU is probably stalled much of the time too.

What does this mean for you? Knowing how much of your CPU time is stalled tells you whether tuning effort should go into reducing code or reducing memory I/O.

So what is CPU utilization, really?

The metric we call CPU utilization is really non-idle time: the time the CPU was not running the idle thread (the idle process in Windows, for example). The operating system kernel normally tracks it at context switch. If a non-idle thread starts running and stops 100 milliseconds later, the kernel counts the CPU as utilized for that entire time.
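As a rough illustration of "non-idle time", the sketch below derives %CPU from two samples of the aggregate `cpu` line in Linux's /proc/stat. The sample values and function names are made up for illustration, and counting iowait as idle is an assumption of this sketch; real tools differ in such details.

```python
def parse_cpu_line(line):
    """Return the first eight tick counters from a /proc/stat 'cpu' line.

    Field order on Linux: user nice system idle iowait irq softirq steal.
    The trailing guest fields are skipped because they are already
    included in user and nice.
    """
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(v) for v in fields[1:9]]


def cpu_utilization(before, after):
    """Percent of elapsed ticks that were not idle between two samples."""
    a, b = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [y - x for x, y in zip(a, b)]
    total = sum(deltas)
    idle = deltas[3] + deltas[4]  # idle + iowait (assumption: iowait counts as idle)
    if total == 0:
        return 0.0
    return 100.0 * (total - idle) / total


# Hypothetical samples taken some interval apart.
sample_1 = "cpu 1000 0 500 8000 500 0 0 0 0 0"
sample_2 = "cpu 1800 0 600 8090 510 0 0 0 0 0"
utilization = cpu_utilization(sample_1, sample_2)
print(f"{utilization:.1f}% CPU")  # prints "90.0% CPU"
```

Nothing in this calculation distinguishes cycles that completed instructions from cycles spent stalled on memory; both count as non-idle.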

On the old time-sharing systems this was a reasonable measure. The Apollo Lunar Module guidance computer called its idle thread the "DUMMY JOB", and engineers used it to measure how heavily the computer was utilized. I wrote about this in an earlier article (link address: http://www.brendangregg.com/usemethod.html#Apollo).

So what's wrong with it?

CPUs are now many times faster than main memory, yet the time spent waiting on memory is still counted as CPU time. When you see a high "%CPU" in top, you may conclude that the processor, the CPU package under the heat sink and fan, is the bottleneck, when the real culprit is the banks of DRAM.

How can you tell what the CPUs are really doing?

Use Performance Monitoring Counters (PMCs): hardware counters that can be read with perf and other tools. For example, measuring the entire system for 10 seconds:

    # perf stat -a -- sleep 10

     Performance counter stats for 'system wide':

         641398.723351      task-clock (msec)         #   64.116 CPUs utilized            (100.00%)
                379651      context-switches          #    0.592 K/sec                    (100.00%)
                 51546      cpu-migrations            #    0.080 K/sec                    (100.00%)
              13423039      page-faults               #    0.021 M/sec
         1433972173374      cycles                    #    2.236 GHz                      (75.02%)
       <not supported>      stalled-cycles-frontend
       <not supported>      stalled-cycles-backend
         1118336816068      instructions              #    0.78  insns per cycle          (75.01%)
          249644142804      branches                  #  389.218 M/sec                    (75.01%)
            7791449769      branch-misses             #    3.12% of all branches          (75.01%)

          10.003794539 seconds time elapsed

The key metric here is instructions per cycle (IPC), which shows on average how many instructions each CPU completed per CPU cycle. The higher the IPC, the more efficiently the CPU is being used. In the example above the value is 0.78, but that does not mean 78% utilization: a modern processor can reach a peak IPC of 4.0 (5.0 on the newest), which is what "4-wide" means.

Executing an instruction involves several steps, such as fetch, decode, execute, memory access, and register write-back. If a CPU performed only one of these steps per cycle with no overlap, each instruction would take five cycles and IPC would be 0.2. With an instruction pipeline the steps of successive instructions overlap, so in the ideal case one instruction completes every cycle and IPC is 1. A 3- to 5-wide processor can complete several instructions per cycle, which is how IPC can exceed 1.

(Translator's note: the author writes "CPU clock cycle" for what is called a CPU cycle here. The shorter term is used to avoid confusion with the crystal oscillator's clock cycle.)
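The IPC figure can be reproduced from the two raw counters in the perf stat output above. The sketch below is only arithmetic on those numbers; the 4.0 peak is the 4-wide assumption from the text, not something perf reports.

```python
# Counter values copied from the perf stat output above.
CYCLES = 1_433_972_173_374
INSTRUCTIONS = 1_118_336_816_068

# Assumption: a 4-wide processor, i.e. a peak of 4.0 instructions per cycle.
PEAK_IPC = 4.0


def ipc(instructions, cycles):
    """Instructions completed per CPU cycle."""
    if cycles <= 0:
        raise ValueError("cycles must be positive")
    return instructions / cycles


measured = ipc(INSTRUCTIONS, CYCLES)
fraction_of_peak = measured / PEAK_IPC

print(f"IPC: {measured:.2f}")                        # prints "IPC: 0.78"
print(f"Share of peak IPC: {fraction_of_peak:.1%}")  # prints "Share of peak IPC: 19.5%"
```

Seen this way, a system that reports well over 60 CPUs utilized is retiring instructions at roughly a fifth of its theoretical peak rate.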

Of course, there are hundreds of other performance counters that you can measure.

In a virtualized environment, guests generally cannot access PMCs directly; it depends on whether the hypervisor supports it. My recent article, The PMCs of EC2: Measuring IPC (link address: http://www.brendangregg.com/blog/2017-05-04/the-pmcs-of-ec2.html), shows how PMCs can be used from Xen-based virtual machines on AWS EC2.

Best practices

If your IPC is below 1.0, you are likely memory-bound. Software tuning strategies include reducing memory I/O and improving CPU caching and memory locality, especially on NUMA systems. Hardware tuning includes using processors with larger CPU caches, and faster memory, buses, and interconnects.

If your IPC is above 1.0, you are likely instruction-bound. Look for ways to reduce the number of instructions executed, such as eliminating unnecessary work and caching results. CPU flame graphs are a useful tool for this investigation (link address: http://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html). For hardware tuning, try CPUs with a higher clock rate, more cores, or hyperthreading.
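The rule of thumb above can be written as a small helper. This is a minimal sketch, not a diagnostic tool: the function name and the wording of the labels are invented here, and treating a value exactly at the threshold as "borderline" is an assumption of the sketch.

```python
def interpret_ipc(ipc, threshold=1.0):
    """Return a coarse label for an IPC value.

    The 1.0 threshold is the rule of thumb from the text; treating a value
    exactly at the threshold as "borderline" is an assumption of this sketch.
    """
    if ipc < 0:
        raise ValueError("IPC cannot be negative")
    if ipc < threshold:
        return "likely memory-bound: reduce memory I/O, improve locality"
    if ipc > threshold:
        return "likely instruction-bound: reduce the instructions executed"
    return "borderline: look at more counters"


# 0.78 is the system-wide IPC measured earlier; 2.14 is a second illustrative value.
for value in (0.78, 2.14):
    print(f"IPC {value:.2f} -> {interpret_ipc(value)}")
```

A single threshold is deliberately crude. It only suggests which direction to investigate first.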

What should performance monitoring products tell you?

Performance monitoring tools should show IPC for each process, or break CPU time down into instruction cycles and stalled cycles, for example as %INS and %STL. Here is the output of the tiptop command on Linux:

    tiptop -                  [root]
    Tasks:  96 total,   3 displayed                               screen  0: default

      PID [ %CPU] %SYS    P   Mcycle   Minstr   IPC  %MISS  %BMIS  %BUS COMMAND
     3897   35.3  28.5    4   274.06   178.23  0.65   0.06   0.00   0.0 java
     1319+   5.5   2.6    6    87.32   125.55  1.44   0.34   0.26   0.0 nm-applet
      900    0.9   0.0    6    25.91    55.55  2.14   0.12   0.21   0.0 dbus-daemo

Other reasons CPU utilization is misleading

Memory stall cycles are not the only thing that makes the %CPU metric misleading. Other factors include:

Temperature trips can also stall the processor.

Turbo Boost varies the clock rate.

SpeedStep varies the clock rate.

Averages hide bursts: 80% average utilization over one minute can conceal bursts of 100% (similar to network QoS).

Spin locks: the CPU is busy, but the application is making no real forward progress.
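The averaging problem in the list above can be shown with a few lines of arithmetic. The per-second samples below are hypothetical, chosen so that the one-minute average comes out at 80%.

```python
from statistics import mean

# Hypothetical per-second %CPU samples for one minute:
# 48 seconds fully saturated, then 12 seconds completely idle.
samples = [100.0] * 48 + [0.0] * 12

one_minute_average = mean(samples)
saturated_seconds = sum(1 for s in samples if s >= 100.0)

print(f"1-minute average: {one_minute_average:.0f}%")  # prints "1-minute average: 80%"
print(f"Seconds at 100%: {saturated_seconds}")         # prints "Seconds at 100%: 48"
```

The average suggests 20% headroom, yet for most of the minute there was none.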

Update: is CPU utilization really wrong?

Since this article was published there has been a lively discussion, with hundreds of comments. Thank you to everyone for your interest in the topic and for taking the time to read it. To summarize my replies: this article is not about iowait at all (that is disk I/O), and the tuning measures for memory-bound workloads are given above.

But is CPU utilization actually wrong, or just deeply misleading? Many people read high CPU utilization as meaning that the processing unit is the bottleneck, and that reading is wrong. Is the metric itself technically correct? If stalled cycles cannot be used by anything else, you could call them "utilized waiting" (which sounds like an oxymoron). In that case %CPU, as an operating-system-level metric, is technically correct but deeply misleading. With hyperthreading, however, the stalled cycles can be used by another thread, so %CPU may count as utilized cycles that were in fact available. That is wrong. In this article I wanted to focus on the interpretation problem and suggest solutions, but yes, the metric has technical problems as well.

Conclusion

CPU utilization has become a misleading metric: it includes cycles spent waiting on main memory, which can account for much of the load on a modern CPU. Additional metrics, including instructions per cycle (IPC), let you work out what %CPU really means. An IPC below 1.0 likely means your application is memory-bound, while an IPC above 1.0 likely means it is instruction-bound. As in my previous article, performance monitoring products that show %CPU should also show PMC metrics and explain them fully, so as not to mislead users. For example, they can show %CPU together with IPC, or instruction cycles alongside stalled cycles. With these metrics, developers and operators can choose the right tuning approach for their applications and operating systems.
