How to analyze an online fault by jstack and jmap

2025-01-19 Update From: SLTechnology News&Howtos

Shulou (Shulou.com) 05/31

This article explains how to analyze an online fault with jstack and jmap by walking through one incident step by step.

Monitoring of the online machines showed that, starting on April 8, CPU utilization climbed gradually over time. It eventually reached 100%, which made the online service unavailable. Restarting the machines restored service.

A brief analysis suggests five possible directions:

1. A code problem in the system itself

2. An avalanche effect caused by problems in an internal downstream system

3. A sudden increase in call volume from upstream systems

4. Problems with HTTP requests to third parties

5. A problem with the machines themselves

1. Checked the logs and found no concentrated error logs, which initially rules out errors in the code logic.

2. Contacted the internal downstream systems and reviewed their monitoring. Everything was normal, which rules out the impact of a downstream failure.

3. Checked the call volume of the provider API. There was no sudden increase compared with 7 days earlier, which rules out a surge in calls from the business side.

4. Checked TCP monitoring. The TCP states were normal, which rules out third-party timeouts on HTTP requests.

5. Checked machine monitoring. CPU was rising on all 6 machines, and the pattern was the same on each one, which rules out a fault in an individual machine.

In other words, none of the checks above located the problem directly.

1. Restart the 5 most severely affected of the 6 machines to restore service first, and keep one in its faulty state for analysis.

2. Find the pid of the current Tomcat process (384 in this case).

3. Check the CPU usage of the threads under that pid: top -Hp 384

4. Threads 4430, 4431, 4432, and 4433 are each found to be using about 40% of the CPU.

5. Convert these thread IDs to hexadecimal: 114e, 114f, 1150, and 1151 respectively.

6. Dump the current Java thread stacks: sudo -u tomcat jstack -l 384 > /1.txt

7. Look up the threads from step 5 in the dump (jstack prints each native thread ID as a hexadecimal nid). All of them turn out to be GC threads.

8. Dump the Java heap:

sudo -u tomcat jmap -dump:live,format=b,file=/dump201612271310.dat 384

9. Load the heap file into MAT. The javax.crypto.JceSecurity object takes up 95% of the memory, which initially locates the problem.

MAT download address: http://www.eclipse.org/mat/

10. The reference tree of the class shows that it holds far too many BouncyCastleProvider objects. In other words, our code handles this object incorrectly, and the problem is located.
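Steps 5 to 7 can also be scripted. The following is a minimal sketch in Java, assuming the jstack output has already been read into a string. The class name JstackNidLookup, its methods, and the sample dump lines are illustrative assumptions and are not taken from the incident.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class JstackNidLookup {

    /** Formats a decimal thread ID from top -Hp the way jstack prints it. */
    public static String toNid(int threadId) {
        return "nid=0x" + Integer.toHexString(threadId);
    }

    /** Returns the lines of a jstack dump whose nid matches one of the thread IDs. */
    public static List<String> findThreads(String jstackDump, int... threadIds) {
        List<String> matches = new ArrayList<>();
        for (String line : jstackDump.split("\\R")) {
            for (int id : threadIds) {
                // The word boundary keeps nid=0x114e from also matching nid=0x114ef.
                if (Pattern.compile(Pattern.quote(toNid(id)) + "\\b").matcher(line).find()) {
                    matches.add(line);
                    break;
                }
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Illustrative dump lines only; they are not the real dump from the article.
        String dump = String.join("\n",
            "\"http-nio-8080-exec-1\" #25 daemon prio=5 os_prio=0 tid=0x00007f3c0c2a1000 nid=0x1200 waiting on condition",
            "\"GC task thread#0 (ParallelGC)\" os_prio=0 tid=0x00007f3c0c01f800 nid=0x114e runnable",
            "\"GC task thread#1 (ParallelGC)\" os_prio=0 tid=0x00007f3c0c021000 nid=0x114f runnable");
        for (String line : findThreads(dump, 4430, 4431, 4432, 4433)) {
            System.out.println(line);
        }
    }
}
```

If every matching line names a GC thread, as it did here, the CPU is being spent on garbage collection rather than on application code, and the heap is the next thing to inspect.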

One piece of our code handles encryption and decryption. Each time it runs, it creates a new BouncyCastleProvider object and passes it to the Cipher.getInstance() method.

Looking at the implementation of Cipher.getInstance(), which is part of the JDK's underlying code, the call can be traced to the JceSecurity class.

There, verifyingProviders has each entry removed again after it is put, whereas verificationResults only ever has entries put and never removed.

verificationResults is a static map, that is, it belongs to the JceSecurity class itself.

So every encryption or decryption call puts another object into this map, and because the map is held at class level, it is never reclaimed by GC. As a result, a large number of newly created objects can never be collected.
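The retention pattern described above can be modeled in a few lines. This is a simplified sketch and not the JDK source: FakeProvider and getInstance are hypothetical stand-ins for BouncyCastleProvider and Cipher.getInstance(), and VERIFICATION_RESULTS imitates the static, put-only verificationResults map, assumed here to be keyed by provider instance.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class StaticMapLeakModel {

    /** Hypothetical stand-in for BouncyCastleProvider. */
    public static final class FakeProvider { }

    // Imitates JceSecurity.verificationResults: static, keyed by instance, never removed from.
    private static final Map<FakeProvider, Boolean> VERIFICATION_RESULTS = new IdentityHashMap<>();

    /** Hypothetical stand-in for Cipher.getInstance(transformation, provider). */
    public static synchronized void getInstance(FakeProvider provider) {
        VERIFICATION_RESULTS.put(provider, Boolean.TRUE);
    }

    /** Number of provider instances the static map is keeping alive. */
    public static synchronized int retainedProviders() {
        return VERIFICATION_RESULTS.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            // A new provider per call, as in the faulty code: each one becomes a new map key.
            getInstance(new FakeProvider());
        }
        System.out.println("providers retained: " + retainedProviders());

        FakeProvider shared = new FakeProvider();
        int before = retainedProviders();
        for (int i = 0; i < 1000; i++) {
            getInstance(shared); // the same key every time, so the map grows by one entry only
        }
        System.out.println("added by shared provider: " + (retainedProviders() - before));
    }
}
```

In this model the map gains one entry for every distinct provider instance, so the retained count tracks the number of calls when a provider is created per call and stays flat when one instance is reused.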

The fix is to make the object in question static, so that there is one per class and it is not created over and over.
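A minimal sketch of that fix might look like the following. It is an illustration under stated assumptions, not the original code: the class name SharedProviderCipher, the AES/CBC/PKCS5Padding transformation, and the all-zero demo key and IV are made up for the example, and the JDK's built-in SunJCE provider stands in for BouncyCastleProvider so that the sketch runs without a third-party jar.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.Provider;
import java.security.Security;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

public class SharedProviderCipher {

    // One provider for the whole class. SunJCE is an assumption that keeps the sketch
    // runnable on a stock JDK; the original code would hold its BouncyCastleProvider here.
    private static final Provider PROVIDER = Security.getProvider("SunJCE");

    // Assumed transformation for the example only.
    private static final String TRANSFORMATION = "AES/CBC/PKCS5Padding";

    private static byte[] run(int mode, byte[] key, byte[] iv, byte[] input) {
        try {
            // Every call passes the same provider instance instead of a new one.
            Cipher cipher = Cipher.getInstance(TRANSFORMATION, PROVIDER);
            cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("cipher operation failed", e);
        }
    }

    public static byte[] encrypt(byte[] key, byte[] iv, byte[] plain) {
        return run(Cipher.ENCRYPT_MODE, key, iv, plain);
    }

    public static byte[] decrypt(byte[] key, byte[] iv, byte[] encrypted) {
        return run(Cipher.DECRYPT_MODE, key, iv, encrypted);
    }

    public static void main(String[] args) {
        byte[] key = new byte[16]; // demo key and IV only; never use fixed values in production
        byte[] iv = new byte[16];
        byte[] encrypted = encrypt(key, iv, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(new String(decrypt(key, iv, encrypted), StandardCharsets.UTF_8));
    }
}
```

Because PROVIDER is initialized once when the class loads, repeated encryption and decryption calls hand the same instance to Cipher.getInstance(), so a put-only map keyed by provider would hold a single entry for it.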

Don't panic when you run into an online problem. First settle on how to troubleshoot it:

View the logs

Check the CPU usage

Check the TCP status

Check the Java threads with jstack

Check the Java heap with jmap

Analyze the heap file with MAT to find objects that cannot be reclaimed
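As a complement to this checklist, the JVM can also report from the inside how much time it has spent in garbage collection, which is one way to notice a GC-bound process before taking dumps. The sketch below uses the standard java.lang.management API; the class name GcTimeShare and the idea of comparing GC time with uptime are assumptions added for illustration and were not part of the original investigation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcTimeShare {

    /** Approximate milliseconds spent in GC since JVM start, summed over all collectors. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Approximate fraction of JVM uptime that was spent in GC. */
    public static double gcShare() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : (double) totalGcMillis() / uptimeMillis;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
        System.out.printf("approximate GC share of uptime: %.2f%%%n", gcShare() * 100);
    }
}
```

A share that keeps rising over days would be consistent with the pattern in this incident, where the GC threads ended up consuming most of the CPU.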

That is how to analyze an online fault with jstack and jmap. If you run into a similar problem, the analysis above may serve as a reference.
