How to troubleshoot the problem of CPU overload

2025-04-04 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 06/03

This article walks through how to troubleshoot a CPU overload problem on a server running a Java service. The method is simple, fast, and practical.

Probe the root cause of the problem

Running top shows that process 5511, a Java process, has excessive CPU and memory utilization (47.0% CPU and 39.2% memory):

top
 5511 root  20   0 16.841g 6.088g   5584 S  47.0 39.2   4011:41 java
 9550 root  20   0 2516200  67892   2436 S   0.7  0.4 204:20.40 java
21271 root  20   0 11.579g 0.987g   5056 S   0.7  6.4  46:29.88 java
   13 root  20   0       0      0      0 S   0.3  0.0   0:35.21 ksoftirqd/1
 2128 root  20   0  9.833g 1.273g   6656 S   0.3  8.2  33:48.35 java
29464 root  20   0 11.578g 1.030g   5448 S   0.3  6.6  31:42.67 java
31721 root  20   0  157744   2264   1544 R   0.3  0.0   0:00.32 top

Query the details of this Java process with jinfo 5511:

Attaching to process ID 5511, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.151-b12
Java System Properties:

java.runtime.name = Java(TM) SE Runtime Environment
java.vm.version = 25.151-b12
sun.boot.library.path = /usr/local/java/jre/lib/amd64
java.vendor.url = http://java.oracle.com/
java.vm.vendor = Oracle Corporation
path.separator = :
file.encoding.pkg = sun.io
java.vm.name = Java HotSpot(TM) 64-Bit Server VM
sun.os.patch.level = unknown
sun.java.launcher = SUN_STANDARD
user.country = US
user.dir = /mnt/app/bdcenter-base/bdcenter-service-base-1.0-SNAPSHOT
java.vm.specification.name = Java Virtual Machine Specification
PID = 5511
java.runtime.version = 1.8.0_151-b12
java.awt.graphicsenv = sun.awt.X11GraphicsEnvironment
os.arch = amd64
java.endorsed.dirs = /usr/local/java/jre/lib/endorsed
line.separator =
java.io.tmpdir = /tmp
java.vm.specification.vendor = Oracle Corporation
os.name = Linux
io.netty.noKeySetOptimization = true
sun.jnu.encoding = UTF-8
java.library.path = /usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
spring.beaninfo.ignore = true
sun.nio.ch.bugLevel =
java.specification.name = Java Platform API Specification
java.class.version = 52.0
sun.management.compiler = HotSpot 64-Bit Tiered Compilers
os.version = 3.10.0-693.2.2.el7.x86_64
user.home = /root
user.timezone = Asia/Shanghai
catalina.useNaming = false
java.awt.printerjob = sun.print.PSPrinterJob
file.encoding = UTF-8
@appId = bdcenter-service-base
java.specification.version = 1.8
io.netty.recycler.maxCapacityPerThread = 0
catalina.home = /tmp/tomcat.7050606280722271893.9025
user.name = root
java.class.path = bdcenter-service-base-1.0-SNAPSHOT.jar
java.vm.specification.version = 1.8
sun.arch.data.model = 64
sun.java.command = bdcenter-service-base-1.0-SNAPSHOT.jar --spring.profiles.active=prod 5
java.home = /usr/local/java/jre
user.language = en
java.specification.vendor = Oracle Corporation
io.netty.noUnsafe = true
awt.toolkit = sun.awt.X11.XToolkit
java.vm.info = mixed mode
java.version = 1.8.0_151
java.ext.dirs = /usr/local/java/jre/lib/ext:/usr/java/packages/lib/ext
sun.boot.class.path = /usr/local/java/jre/lib/resources.jar:/usr/local/java/jre/lib/rt.jar:/usr/local/java/jre/lib/sunrsasign.jar:/usr/local/java/jre/lib/jsse.jar:/usr/local/java/jre/lib/jce.jar:/usr/local/java/jre/lib/charsets.jar:/usr/local/java/jre/lib/jfr.jar:/usr/local/java/jre/classes
java.awt.headless = true
java.vendor = Oracle Corporation
catalina.base = /tmp/tomcat.7050606280722271893.9025
file.separator = /
java.vendor.url.bug = http://bugreport.sun.com/bugreport/
sun.io.unicode.encoding = UnicodeLittle
sun.cpu.endian = little
sun.cpu.isalist =

VM Flags:
Non-default VM flags: -XX:+AggressiveOpts -XX:CICompilerCount=4 -XX:CMSInitiatingOccupancyFraction=75 -XX:+CMSParallelRemarkEnabled -XX:+DisableExplicitGC -XX:InitialHeapSize=6442450944 -XX:MaxDirectMemorySize=1073741824 -XX:MaxHeapSize=6442450944 -XX:MaxNewSize=697892864 -XX:MaxTenuringThreshold=6 -XX:MinHeapDeltaBytes=196608 -XX:NewSize=697892864 -XX:OldPLABSize=16 -XX:OldSize=5744558080 -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:ThreadStackSize=256 -XX:+UseBiasedLocking -XX:+UseCMSCompactAtFullCollection -XX:+UseCMSInitiatingOccupancyOnly -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:+UseConcMarkSweepGC -XX:+UseFastAccessorMethods -XX:+UseParNewGC
Command line: -Xms6G -Xmx6G -XX:MaxPermSize=256M -XX:MaxDirectMemorySize=1G -Xss256k -XX:+AggressiveOpts -XX:+UseBiasedLocking -XX:+UseFastAccessorMethods -XX:+DisableExplicitGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 -XX:CMSInitiatingOccupancyFraction=75 -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCTimeStamps -XX:+PrintGCDetails

The Command line entry shows the JVM parameters supplied by the user at startup; the rest of the output describes the JVM as a whole.
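A long command line like this one is easy to misread, so it can help to check it mechanically. The following is a minimal Python sketch, not part of the original investigation, that splits the Command line string from the jinfo output and reports options that appear more than once. The helper names flag_name and duplicated_flags are hypothetical.

```python
from collections import Counter

# Command line copied from the jinfo output for process 5511.
COMMAND_LINE = (
    "-Xms6G -Xmx6G -XX:MaxPermSize=256M -XX:MaxDirectMemorySize=1G -Xss256k "
    "-XX:+AggressiveOpts -XX:+UseBiasedLocking -XX:+UseFastAccessorMethods "
    "-XX:+DisableExplicitGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC "
    "-XX:+CMSParallelRemarkEnabled -XX:+UseCMSCompactAtFullCollection "
    "-XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=75 "
    "-XX:CMSInitiatingOccupancyFraction=75 -XX:+PrintGCApplicationStoppedTime "
    "-XX:+PrintGCTimeStamps -XX:+PrintGCDetails"
)


def flag_name(flag: str) -> str:
    """Return the option name without its value or its +/- switch."""
    if flag.startswith("-XX:"):
        return flag[4:].lstrip("+-").split("=", 1)[0]
    return flag  # flags such as -Xms6G are kept whole


def duplicated_flags(command_line: str) -> list:
    """List option names that occur more than once on the command line."""
    counts = Counter(flag_name(flag) for flag in command_line.split())
    return sorted(name for name, seen in counts.items() if seen > 1)


print(duplicated_flags(COMMAND_LINE))
```

Here the only repeated option is CMSInitiatingOccupancyFraction, which is harmless because both occurrences carry the same value. Note also that no young-generation size (for example -Xmn) is set explicitly on this command line.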

Check the current GC activity with jstat -gccause 5511:

[root@bigdata-service-1:/root] # jstat -gccause 5511 1000
  S0     S1     E      O      M     CCS    YGC    YGCT   FGC   FGCT     GCT    LGCC                GCC
  0.00   8.86  61.50  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC
  0.00   8.86  64.31  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC
  0.00   8.86  65.04  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC
  0.00   8.86  65.17  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC
  0.00   8.86  65.26  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC
  0.00   8.86  65.26  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC
  0.00   8.86  67.80  72.16  95.05  91.62  4577  326.410    8   8.855  335.265 Allocation Failure  No GC

Young-generation collections are frequent [YGC: number of young GCs, YGCT: total young GC time in seconds], while the old generation has seen almost no full GCs [FGC: number of full GCs, FGCT: total full GC time in seconds]. This suggests that the young generation is sized too small, which causes frequent young GCs.
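To put numbers on "frequent", the counters in one jstat sample can be turned into averages. This is a small Python sketch under the assumption that the first eleven columns (S0 through GCT) are numeric and the remaining columns are free text; parse_gccause is a hypothetical helper, not a JDK tool.

```python
# One header/sample pair copied from the jstat -gccause output for process 5511.
HEADER = "S0 S1 E O M CCS YGC YGCT FGC FGCT GCT LGCC GCC"
SAMPLE = ("0.00 8.86 61.50 72.16 95.05 91.62 4577 326.410 8 8.855 335.265 "
          "Allocation Failure No GC")


def parse_gccause(header: str, sample: str) -> dict:
    """Map the eleven numeric columns (S0..GCT) to floats; LGCC and GCC are text."""
    names = header.split()[:11]
    values = sample.split()[:11]
    return {name: float(value) for name, value in zip(names, values)}


stats = parse_gccause(HEADER, SAMPLE)
avg_young_ms = stats["YGCT"] / stats["YGC"] * 1000  # average young GC duration
avg_full_s = stats["FGCT"] / stats["FGC"]           # average full GC duration
young_share = stats["YGCT"] / stats["GCT"]          # share of GC time spent in young GCs

print(f"average young GC: {avg_young_ms:.1f} ms")
print(f"average full GC:  {avg_full_s:.2f} s")
print(f"young GC share of total GC time: {young_share:.1%}")
```

With 4577 young collections against 8 full collections, nearly all of the accumulated GC time comes from the young generation, which is consistent with the reading above.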

View the current JVM memory allocation with jmap -heap 5511:

[root@bigdata-service-1:/root] # jmap -heap 5511
Attaching to process ID 5511, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.151-b12

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 6442450944 (6144.0MB)
   NewSize                  = 697892864 (665.5625MB)
   MaxNewSize               = 697892864 (665.5625MB)
   OldSize                  = 5744558080 (5478.4375MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 628162560 (599.0625MB)
   used     = 210724080 (200.96214294433594MB)
   free     = 417438480 (398.10035705566406MB)
   33.54610628178795% used
Eden Space:
   capacity = 558432256 (532.5625MB)
   used     = 203572032 (194.14141845703125MB)
   free     = 354860224 (338.42108154296875MB)
   36.45420367694519% used
From Space:
   capacity = 69730304 (66.5MB)
   used     = 7152048 (6.8207244873046875MB)
   free     = 62578256 (59.67927551269531MB)
   10.256728552337876% used
To Space:
   capacity = 69730304 (66.5MB)
   used     = 0 (0.0MB)
   free     = 69730304 (66.5MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 5744558080 (5478.4375MB)
   used     = 4145940376 (3953.876853942871MB)
   free     = 1598617704 (1524.560646057129MB)
   72.17161561016022% used

32720 interned Strings occupying 4008568 bytes.

The heap layout shows that the young generation has capacity = 628162560 (599.0625MB), of which Eden Space has capacity = 558432256 (532.5625MB), while the old generation has capacity = 5744558080 (5478.4375MB). The young generation is therefore very small relative to the 6 GB heap, which leads to frequent young GCs.

Second, because the young generation is so small, large objects that do not fit in it are allocated directly in the old generation. The old generation then grows until it takes up a large share of the machine's memory, which may cause other services to run out of memory.
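The proportions are easier to judge as plain arithmetic. The sketch below, in Python, uses only the byte values reported by jmap -heap; comparing against the size that NewRatio = 2 would imply is an illustration added here, not something the original analysis computed.

```python
MB = 1024 * 1024

# Values reported by jmap -heap for process 5511 (bytes).
max_heap = 6442450944   # MaxHeapSize
new_size = 697892864    # NewSize / MaxNewSize
old_size = 5744558080   # OldSize
new_ratio = 2           # NewRatio: old generation size / young generation size

# Share of the heap actually given to the young generation.
young_fraction = new_size / max_heap

# Young generation size that NewRatio alone would imply (heap / (ratio + 1)).
young_if_ratio = max_heap / (new_ratio + 1)

print(f"young generation: {new_size / MB:.1f} MB ({young_fraction:.1%} of the heap)")
print(f"old generation:   {old_size / MB:.1f} MB")
print(f"young generation implied by NewRatio={new_ratio}: {young_if_ratio / MB:.1f} MB")
```

The young generation gets roughly a tenth of the heap, far less than the third that NewRatio = 2 would suggest. One way to change this, offered here as an assumption rather than the article's stated fix, is to size the young generation explicitly with -Xmn or -XX:NewSize/-XX:MaxNewSize and then re-check the jstat counters.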

Trigger a GC manually on the live process and check the JVM state afterwards (jmap -histo:live forces a full GC before it builds the histogram):

[root@bigdata-service-1:/root] # jmap -histo:live 5511
(the output is too large to show)
[root@bigdata-service-1:/root] # jstat -gc 5511 1000
    S0C     S1C    S0U    S1U       EC       EU        OC        OU      MC      MU    CCSC    CCSU  YGC    YGCT FGC   FGCT     GCT
68096.0 68096.0 7588.1    0.0 545344.0 394801.0 5609920.0 4050654.1 87952.0 83637.7 11392.0 10437.9 4592 327.582   9  8.855 336.437
68096.0 68096.0    0.0    0.0 545344.0      0.0 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  46918.8 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  46922.8 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  46937.1 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  46937.1 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  46937.1 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  64190.0 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  64190.0 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  64190.0 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  64190.0 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  64190.0 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  82152.3 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  82152.3 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  82152.3 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  82156.4 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0  82156.4 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 101737.5 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 101737.5 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 103581.5 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 103581.5 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 103581.5 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 124812.4 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 124812.4 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578
68096.0 68096.0    0.0    0.0 545344.0 124814.5 5609920.0 4043257.0 87952.0 83637.7 11392.0 10437.9 4592 327.582   9 12.997 340.578

After manually forcing a full GC, old-generation usage barely changes (OU drops only from 4050654.1 KB to 4043257.0 KB), which indicates that the system may have a memory leak: objects are still referenced and cannot be collected [OC: old-generation capacity (KB), OU: old-generation used (KB)]. A leak of this kind usually means that object references are never released, for example because zombie threads keep them alive. This suggests that the system may contain a large number of zombie threads, so the investigation continues with the threads.
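The effect of the forced full GC can be quantified from two jstat -gc samples, one taken before the collection finished and one after. This Python sketch uses the OU and FGCT values from those samples; the variable names are illustrative only.

```python
# Old-generation usage (OU, in KB) and cumulative full GC time (FGCT, in seconds)
# from the jstat -gc samples taken before and after the forced full GC.
ou_before_kb, ou_after_kb = 4050654.1, 4043257.0
fgct_before_s, fgct_after_s = 8.855, 12.997

freed_kb = ou_before_kb - ou_after_kb            # memory the full GC reclaimed
freed_pct = freed_kb / ou_before_kb * 100        # as a share of what was in use
full_gc_pause_s = fgct_after_s - fgct_before_s   # time the forced full GC took

print(f"freed by full GC: {freed_kb:.1f} KB ({freed_pct:.2f}% of old generation usage)")
print(f"full GC duration: {full_gc_pause_s:.3f} s")
```

A full GC that runs for several seconds yet frees well under one percent of the old generation is what a leak typically looks like: almost everything in the old generation is still reachable.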

View the current threads with jstack, which prints the state and stack of each thread:

[root@bigdata-service-1:/root] # jstack 5511 | head -50
2018-07-20 11:59:27
Full thread dump Java HotSpot(TM) 64-Bit Server VM (25.151-b12 mixed mode):

"Keep-Alive-Timer" #19437 daemon prio=8 os_prio=0 tid=0x00007f8d749ce000 nid=0x4995 waiting on condition [0x00007f8bb7b2b000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)
        at sun.net.www.http.KeepAliveCache.run(KeepAliveCache.java:172)
        at java.lang.Thread.run(Thread.java:748)

"http-nio-9025-exec-201" #19378 daemon prio=5 os_prio=0 tid=0x00007f8c8e3d3000 nid=0x7ffc waiting on condition [0x00007f8b3a724000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103)
        at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)

"http-nio-9025-exec-197" #19374 daemon prio=5 os_prio=0 tid=0x00007f8c8e3cc000 nid=0x7ff7 waiting on condition [0x00007f8b3a828000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103)
        at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:748)

"http-nio-9025-exec-196" #19373 daemon prio=5 os_prio=0 tid=0x00007f8c8a06d000 nid=0x7ff3 waiting on condition [0x00007f8b3a8aa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:103)
        at org.apache.tomcat.util.threads.TaskQueue.take(TaskQueue.java:31)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)

View the total number of threads on the current system:

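Size of the JVM thread dump. Note that wc -l counts the lines of the dump, not the threads themselves, so this figure is a rough indicator that the JVM has opened a very large number of threads:

[root@bigdata-service-1:/root] # jstack 5511 | wc -l
254132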
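Because wc -l counts lines rather than threads, a more precise picture comes from counting the thread header lines in the dump. The following Python sketch does that, and also groups threads by state and by name prefix so that an oversized pool stands out. SAMPLE_DUMP is an abbreviated sample assembled from thread headers in the dump above, and summarize is a hypothetical helper.

```python
import re
from collections import Counter

# Abbreviated sample: real headers from the dump, stacks cut to one frame each.
SAMPLE_DUMP = '''\
"Keep-Alive-Timer" #19437 daemon prio=8 os_prio=0 tid=0x00007f8d749ce000 nid=0x4995 waiting on condition [0x00007f8bb7b2b000]
   java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(Native Method)

"http-nio-9025-exec-201" #19378 daemon prio=5 os_prio=0 tid=0x00007f8c8e3d3000 nid=0x7ffc waiting on condition [0x00007f8b3a724000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"http-nio-9025-exec-197" #19374 daemon prio=5 os_prio=0 tid=0x00007f8c8e3cc000 nid=0x7ff7 waiting on condition [0x00007f8b3a828000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)

"http-nio-9025-exec-196" #19373 daemon prio=5 os_prio=0 tid=0x00007f8c8a06d000 nid=0x7ff3 waiting on condition [0x00007f8b3a8aa000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
'''

THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')
STATE_LINE = re.compile(r"java\.lang\.Thread\.State: (?P<state>[A-Z_]+)")


def summarize(dump: str):
    """Return (thread count, threads per state, threads per name prefix)."""
    states, pools = Counter(), Counter()
    thread_count = 0
    for line in dump.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            thread_count += 1
            # Drop the trailing thread number so pool members group together.
            pools[re.sub(r"\d+$", "", header.group("name"))] += 1
            continue
        state = STATE_LINE.search(line)
        if state:
            states[state.group("state")] += 1
    return thread_count, states, pools


count, states, pools = summarize(SAMPLE_DUMP)
print(count, dict(states), pools.most_common())
```

On a real system the dump would be saved to a file first (for example with jstack 5511 > dump.txt) and read from there; the prefix with the largest count points at the pool that keeps growing.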

Number of established connections from the server to each remote IP:

[root@bigdata-service-1:/root] # netstat -nat | grep ESTABLISHED | awk '{print $5}' | awk -F: '{print $1}' | sort | uniq -c | sort -rn
  10244 10.27.70.185
     26 10.27.70.77
     26 10.24.237.150
        10.81.83.121
      7 10.29.148.127
      4 10.80.101.244
        10.27.4.125
      2 127.0.0.1
      2 10.80.112.35
      2 10.27.87.167
      2 10.25.0.162
      1 10.80.112.23
      1 106.11.248.20
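The same aggregation can be scripted, which is convenient when it has to be repeated or compared over time. This is a minimal Python sketch of what the shell pipeline does, assuming the usual netstat -nat column order (protocol, queues, local address, foreign address, state). The sample lines and port numbers are made up for illustration; only the IP addresses come from the output above.

```python
from collections import Counter

# Hypothetical netstat -nat excerpt; ports are invented for the example.
SAMPLE_NETSTAT = """\
tcp        0      0 0.0.0.0:9025            0.0.0.0:*               LISTEN
tcp        0      0 10.27.70.1:51234        10.27.70.185:9300       ESTABLISHED
tcp        0      0 10.27.70.1:51235        10.27.70.185:9300       ESTABLISHED
tcp        0      0 10.27.70.1:51236        10.27.70.185:9300       ESTABLISHED
tcp        0      0 10.27.70.1:40022        10.27.70.77:3306        ESTABLISHED
"""


def connections_by_remote_ip(netstat_output: str) -> list:
    """Count ESTABLISHED connections per remote IP, most connections first."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[5] == "ESTABLISHED":
            remote_ip = fields[4].rsplit(":", 1)[0]  # strip the port
            counts[remote_ip] += 1
    return counts.most_common()


for ip, connections in connections_by_remote_ip(SAMPLE_NETSTAT):
    print(f"{connections:7d} {ip}")
```

In the real output, a single peer with more than ten thousand established connections while every other peer has a few dozen at most is the clearest sign of where the connections are leaking to.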

Dump the current heap to a file:

jmap -dump:live,format=b,file=dump.hprof 5511

Analyze the heap with the Eclipse Memory Analyzer (MAT) plug-in

Import the heap dump into Eclipse

View the memory leak suspects in the heap

The leak analysis shows a large number of io.netty.util.internal.InternalThreadLocalMap objects in the heap. This class is Netty's internal thread-local storage, so many instances point to many Netty threads.

View the objects that occupy the most heap

Analysis of the heap objects shows a large number of connection pool objects in the system, which pinpoints the problem: the connection pool is leaking. The system keeps creating connection pools without releasing them, and this is the source of the memory leak.
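The article does not show the code that leaked the pools. A common cause of this pattern, assumed here purely for illustration, is constructing a new pooled client for every request and never closing it. The Python sketch below models that with a hypothetical PooledClient class that owns one worker thread, and contrasts it with reusing a single shared instance.

```python
import threading


class PooledClient:
    """Hypothetical stand-in for a client that owns a worker thread."""

    def __init__(self):
        self._stop = threading.Event()
        # The worker parks until close() is called, like an idle pool thread.
        self._worker = threading.Thread(target=self._stop.wait, daemon=True)
        self._worker.start()

    def request(self, payload: str) -> str:
        return payload.upper()

    def close(self):
        self._stop.set()
        self._worker.join()


def leaky_handler(payload: str) -> str:
    client = PooledClient()  # a new client per call, never closed
    return client.request(payload)


_shared_client = PooledClient()  # created once, reused by every call


def shared_handler(payload: str) -> str:
    return _shared_client.request(payload)


baseline = threading.active_count()
for _ in range(5):
    shared_handler("ping")
after_shared = threading.active_count()
for _ in range(5):
    leaky_handler("ping")
after_leaky = threading.active_count()

print(f"threads added by shared handler: {after_shared - baseline}")
print(f"threads added by leaky handler:  {after_leaky - after_shared}")
```

The leaky handler leaves one parked thread behind per call, which matches the symptoms seen here: a thread count that only grows, waiting threads in the dump, and objects that survive a full GC because live threads still reference them.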

In summary, the system has established a large number of connections that are never released, which is why the JVM has opened so many threads. Locate the problem node by looking at the IP with the most connections. The thread dump contains threads such as this one:

"elasticsearch[_client_][transport_client_boss][T#4]" #166 daemon prio=5 os_prio=0 tid=0x00007f8c9819e800 nid=0x20e3 runnable [0x00007f8c7403d000]
        - locked (a sun.nio.ch.Util$3)
        - locked (a java.util.Collections$UnmodifiableSet)
        - locked (a sun.nio.ch.EPollSelectorImpl)
        at io.netty.channel.nio.NioEventLoop.select(NioEventLoop.java:753)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:886)
