
Some reasons for high CPU sys% on an Oracle database host

2025-03-17 Update From: SLTechnology News&Howtos



This article briefly summarizes some of the reasons why CPU sys% can run high on an Oracle database host.

In day-to-day database operations, operating system CPU utilization is one of the main indicators we use to gauge system load. user% mostly reflects the CPU consumed by the database itself, so when it is high we can go on to look for the source of the CPU consumption inside the database. wa% is the percentage of CPU time spent waiting for I/O; a high wa value indicates that I/O waits are severe.
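As a rough illustration of where these percentages come from, the sketch below derives user%, sys% and wa% from two samples of the aggregate cpu line. It assumes a Linux host and the /proc/stat field order (user, nice, system, idle, iowait, irq, softirq, steal); the function name cpu_percentages and the sample numbers are made up for illustration, and other platforms expose these counters differently.

```python
def cpu_percentages(before_line, after_line):
    """Return user%, sys% and wa% between two aggregate 'cpu' lines.

    Assumes the Linux /proc/stat layout:
    cpu user nice system idle iowait irq softirq steal [guest guest_nice]
    """
    def counters(line):
        parts = line.split()
        if not parts or parts[0] != "cpu" or len(parts) < 6:
            raise ValueError("expected an aggregate 'cpu' line with iowait")
        return [int(value) for value in parts[1:]]

    before, after = counters(before_line), counters(after_line)
    delta = [b - a for a, b in zip(before, after)]
    # Guest time is already included in user/nice, so stop at steal.
    total = sum(delta[:8])
    if total <= 0:
        raise ValueError("samples must be taken at different times")

    def pct(ticks):
        return 100.0 * ticks / total

    return {
        "user": pct(delta[0] + delta[1]),  # user + nice
        "sys": pct(delta[2]),              # kernel time on behalf of processes
        "wa": pct(delta[4]),               # idle time spent waiting for I/O
    }


if __name__ == "__main__":
    # Hypothetical samples; on a real host read the first line of /proc/stat twice.
    first = "cpu 100 0 50 800 50 0 0 0"
    second = "cpu 160 0 110 860 70 0 0 0"
    print(cpu_percentages(first, second))
```

On a live Linux system the two lines would be the first line of /proc/stat read a few seconds apart. Tools such as vmstat or sar report comparable figures and are usually the more convenient choice.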

However, when CPU sys% is abnormally high, we know that the kernel is consuming a lot of CPU, but we are often at a loss as to why. In the operation and maintenance of the NGBOSS databases, when we referred a high CPU sys% problem to the host team, the host engineer replied that it was caused by the ORACLE user's processes, which often left the troubleshooting with nowhere to go. So I would like to summarize a few reasons for a sudden increase in sys CPU.

Start with the time command:

Sometimes we use the time command to measure how long a command or script takes to run, for example, time ps:

The result is divided into real, user and sys, where user and sys are CPU time. IBM's Performance and System Tuning document explains user and sys as follows:

CPU time is divided into user and sys. The user value is the time used by the program itself and any library subroutines it calls. The sys value is the time used by system calls invoked by the program (directly or indirectly).
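The same split can be observed programmatically. The following is a minimal sketch, assuming a Unix-like host where os.times() reports CPU time for finished child processes; the helper name time_command is hypothetical, and the figures will not match the shell's time output exactly.

```python
import os
import subprocess
import sys
import time


def time_command(argv):
    """Run argv and return (real, user, sys) in seconds, similar to `time`."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=False,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    real = time.perf_counter() - start
    after = os.times()
    # children_* accumulate CPU time of child processes that have been waited for.
    user = after.children_user - before.children_user
    sys_cpu = after.children_system - before.children_system
    return real, user, sys_cpu


if __name__ == "__main__":
    # The Python interpreter itself is used as a portable stand-in for `ps`.
    real, user, sys_cpu = time_command([sys.executable, "-c", "pass"])
    print(f"real {real:.3f}s  user {user:.3f}s  sys {sys_cpu:.3f}s")
```

real is wall-clock time, so it also includes time spent waiting; only user and sys are CPU time.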

So which operations or system calls account for the sys portion of CPU? The following lists some typical cases from daily operations, together with points summarized from other people's articles, for reference.

1 > A large number of login connections

For example, during a short-lived connection storm, every login needs a new process to be started, which makes CPU sys soar.

When the listener creates a new session it has to allocate a process, and that work is charged to CPU sys.

Experiment: a script simulating multiple logins was run on a test virtual machine, and cpu sys% fluctuated greatly as the number of processes (PROCESS) increased.

As the experiment shows, sys rises considerably as login sessions keep increasing. If sys CPU soars within a short period in daily operations, it is recommended to check first whether a large number of connections is pouring in.
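The cost of process creation can be seen without a database at all. The sketch below is not the script used in the experiment; it is a stand-in that assumes a Unix-like host, starts short-lived child processes one after another, and reports the CPU time charged to them. The name spawn_cost is hypothetical, and a real Oracle login does considerably more work than this.

```python
import os
import subprocess
import sys


def spawn_cost(n):
    """Start n short-lived child processes and return (user, sys) CPU seconds."""
    before = os.times()
    for _ in range(n):
        # Stand-in for one login: the OS has to create and tear down a process.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    after = os.times()
    return (after.children_user - before.children_user,
            after.children_system - before.children_system)


if __name__ == "__main__":
    for n in (10, 50, 100):
        user, sys_cpu = spawn_cost(n)
        print(f"{n:4d} processes: user {user:.2f}s  sys {sys_cpu:.2f}s")
```

The processes here are started sequentially, so this only shows that sys time accumulates with the number of processes created; a real connection storm is concurrent and the effect on sys% is sharper.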

2 > A large number of concurrent I/O operations.

Generally speaking, an I/O operation does not consume much CPU time, because most of the time is spent in the device being operated on. For example, when reading a file from disk, most of the time goes to the disk's internal operations, and the CPU time consumed is only a small part of the I/O response time. SYS CPU is therefore likely to rise only when there is a large number of concurrent I/Os.
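Each I/O request is a system call, so a large enough number of them adds up to measurable kernel time. The following sketch is an illustration only, with a hypothetical function name and arbitrary sizes: it reads a temporary file in small unbuffered chunks and reports how many read calls were made and how much user and sys CPU time the process used meanwhile.

```python
import os
import tempfile


def sys_time_for_reads(total_bytes=1 << 20, chunk=512):
    """Read a temp file in small chunks; return (read_calls, user_s, sys_s)."""
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"\0" * total_bytes)
        path = handle.name
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            before = os.times()
            calls = 0
            # os.read issues one read system call per iteration.
            while os.read(fd, chunk):
                calls += 1
            after = os.times()
        finally:
            os.close(fd)
    finally:
        os.unlink(path)
    return calls, after.user - before.user, after.system - before.system


if __name__ == "__main__":
    calls, user, sys_cpu = sys_time_for_reads()
    print(f"{calls} reads: user {user:.3f}s  sys {sys_cpu:.3f}s")
```

The file will normally be served from the page cache, so this shows only the per-call kernel overhead, not disk latency. Shrinking chunk increases the number of calls for the same amount of data and is an easy way to see the sys figure move.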

Case: sys cpu keeps growing to more than 50% on RMDB1

At around 16:00 on October 20, a cpu anomaly was found on the RMDB1 host, with sys growing to more than 50%.

Looking at the disks at that time, we found that four disks, DISK28, 27, 20 and 21, were extremely busy, with very high disk reads averaging about 350M per second.

Checking from the database side, these disks mainly belong to the MD database, a small business database that shares the host with crm.

Querying the wait events of the MD database showed mostly direct path read waits, which explains why the disks were busy.

It was later determined that the problem was mainly caused by a single SQL statement. After the session running the statement was killed and its wrong execution plan was fixed, the direct path read waits disappeared, the four busy disks returned to normal, and cpu sys dropped back to the normal range.

3 > High sys cpu caused by GC

GC (global cache) is the sharing of buffer caches between nodes in RAC, and it seems to be very demanding on CPU. In daily operations we repeatedly find that once one RAC node has performance problems, it easily causes a large number of GC waits on the other node. GC mainly involves in-memory operations. For example, gc cr multiblock request is usually caused by full table scans or full index scans; it makes the CPU schedule and manage memory, which consumes CPU time.

Case: high CPU sys on node 1 due to gc buffer busy acquire caused by an expensive SQL statement

CPU sys increased abnormally at about 11:00 on September 29. Analysis of the waits at that time showed very high log file sync, gc buffer busy acquire and latch free waits, but the log file sync and latch free waits were suspected to be a consequence of the CPU shortage.

gc buffer busy acquire, on the other hand, clearly pointed to one abnormal statement.

Each execution of the query took 60 to 120 seconds of elapsed time but only 10 to 20 seconds of CPU time, which means that about 80% of the execution time was spent waiting, even though the statement produced no physical reads. Combined with the session wait events, we can tell that this 80% was spent on GC-related waits (GC-related waits can lead to high CPU sys usage).
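The 80% figure follows directly from the elapsed and CPU times quoted above. A small sketch of the arithmetic, with a hypothetical helper name and assuming a serial execution (so CPU time cannot exceed elapsed time):

```python
def wait_fraction(elapsed_s, cpu_s):
    """Fraction of elapsed time not spent on CPU, i.e. spent waiting."""
    if elapsed_s <= 0 or not 0 <= cpu_s <= elapsed_s:
        raise ValueError("expected 0 <= cpu_s <= elapsed_s and elapsed_s > 0")
    return (elapsed_s - cpu_s) / elapsed_s


if __name__ == "__main__":
    # Both ends of the range quoted in the case: 60 s / 10 s and 120 s / 20 s.
    for elapsed, cpu in ((60, 10), (120, 20)):
        print(f"elapsed {elapsed}s, cpu {cpu}s -> waiting {wait_fraction(elapsed, cpu):.0%}")
```

Both ends of the range give about 83% waiting, which is consistent with the rough 80% stated in the analysis.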

The Segments by Global Cache Buffer Busy section of the AWR report shows that the table CS_REC_RECEPTION accounts for 86% of the Global Cache Buffer Busy total for the whole database. This again confirms that the GC wait events were indeed caused by a business statement on the table xxxxx.

It turned out that workload against this business table, both queries and DML statements, was running on both nodes, which caused the many gc buffer busy acquire waits. The analysis above was provided by Oracle's own engineer, who inferred that the soaring CPU sys was due to GC waits. The situation did improve after the developers worked around the SQL on September 29, but it is difficult to find further evidence that the abnormal increase in CPU sys was caused by GC waits. Moreover, cpu sys is not necessarily unusually high when GC waits are high, and abnormal CPU sys occurred again on the same node afterwards. Each occurrence was accompanied by a large number of I/O operations, so we suspect that the abnormally high CPU sys on the database host has more than one cause.

In addition to the above, there are also causes that come from the operating system's own management mechanisms:

4 > Process scheduling.

The CPU used here depends on the length of the operating system's run queue: the longer the run queue, the more processes need to be scheduled and the heavier the load on the kernel. In many cases, though, the long run queue we see is probably itself the result of high CPU utilization.
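One way to put a number on the run queue is to compare the count of runnable processes with the number of CPUs. The sketch below assumes a Linux host, where /proc/stat contains a procs_running line; the function name and sample text are hypothetical, and on other systems the r column of vmstat gives the equivalent figure.

```python
import os


def runnable_per_cpu(proc_stat_text, cpu_count):
    """Return runnable processes per CPU from /proc/stat-style text."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    for line in proc_stat_text.splitlines():
        if line.startswith("procs_running"):
            return int(line.split()[1]) / cpu_count
    raise ValueError("no procs_running line found")


if __name__ == "__main__":
    try:
        with open("/proc/stat") as stat_file:
            text = stat_file.read()
    except OSError:
        # Not on Linux: fall back to a made-up sample for demonstration.
        text = "cpu 1 2 3 4 5 6 7 8\nprocs_running 12\nprocs_blocked 0\n"
    print(f"runnable per CPU: {runnable_per_cpu(text, os.cpu_count() or 1):.2f}")
```

A value that stays well above 1 suggests more runnable processes than CPUs, but as noted above this can be a symptom of high CPU utilization rather than its cause.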

5 > Memory management.

For example, when applications request memory from the operating system, the operating system has to maintain the memory available to the system. As with ORACLE, the larger the memory and the more frequent the memory management operations, the higher the CPU consumption.

6 > Others, including inter-process communication, semaphore processing, some activities inside device drivers, and so on.

To sum up, the sys part of CPU utilization is the CPU used by the operating system kernel, that is, the CPU consumed by running kernel code. The most common case is CPU consumed during system calls, which are issued at the request of user processes.

The cases listed in this article mainly describe symptoms seen in daily operations; the underlying principles still need further study.
