
How to optimize JVM OOM


This article explains how to troubleshoot and optimize a JVM OOM. The explanation is kept simple and clear, so it should be easy to follow and learn from.

I had just taken over this service. It had been running normally and stably for a long time, but just as everyone packed up to go home for the Spring Festival, it suddenly threw a fit.

The interface failure rate stayed stubbornly high.

Check the logs!

java.lang.OutOfMemoryError: GC overhead limit exceeded

Take a look at the JVM heap:

sudo jmap -heap <pid>

# e.g.: sudo jmap -heap 9999

Obviously, there is not enough memory.

My first thought was that it had to be a memory leak. My plan was as follows:

1. Preconception: I had dealt with a JVM OOM caused by a memory leak before, so a memory leak was the prime suspect.

2. Export the JVM heap data, analyze it, and locate the problem (a jmap sketch follows this list).

3. Fix the bug, redeploy, done!
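For step 2, one way to export the heap, assuming JDK 1.8's bundled tools and the PID 9999 from the earlier example, is a binary heap dump that can then be opened in a tool such as Eclipse MAT or VisualVM:

# dump the live objects to a file for offline analysis (forces a full GC first)
sudo jmap -dump:live,format=b,file=/tmp/heap.hprof 9999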

Following this line of thinking, I analyzed the heap dump and found a large number of objects in the heap that had not yet been collected, while other threads kept requesting memory, which led to the OOM. The direct cause was an increase in business requests combined with insufficient machine resources. The indirect causes are described in detail below.

Looking back after the mystery was solved: if the problem had been analyzed carefully up front, it could probably have been resolved faster.

In hindsight: the service had been running steadily and normally for some time, and the code had not been modified or updated in over a month. If the code itself were at fault, the problem would most likely have appeared within a few days of the new code going live. On that basis, a code bug could essentially be ruled out.

When an online service has a problem, the first task is to restore availability as quickly as possible. If a similar problem occurs next time, I will choose process one rather than process two.

The direct cause of the outage was the increase in the number of requests; the root cause was that the downstream service was overloaded, so microservice calls timed out and set off a chain reaction.

The following figure shows the rough flow from the user initiating a request to the response being returned.

The business service and the algorithm service both register with Eureka as the service registry; overall, this is a fairly simple microservice architecture.

The business service uses Feign to call the algorithm service interface.

@FeignClient(value = "image-service")
public interface ImageService {

    @PostMapping(value = "/xxx")
    String xxx(@RequestParam("img_base64") String imgBase64);
}

When this client executes a request, it splices the parameter onto the URL:

image-service/xxx?img_base64=fjsfdgldfgrwdfdmgfdglwefsl

When Eureka resolves the actual address of the target service, it splices the URL again:

127.0.0.1:5000/xxx?img_base64=fjsfdgldfgrwdfdmgfdglwefsl

The processing time of the algorithm service is positively correlated with the size of the image: the larger the image, the longer it takes. Because image processing is relatively time-consuming, the interface times out, and when a request fails (times out), Feign retries it.

The length of an image's base64 string is also positively correlated with the image's size: the larger the image, the longer the base64 string (base64 turns every 3 bytes into 4 characters, so the data grows by roughly a third). A 306 KB image converted to base64 yields a string of length 429,196.
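A minimal sketch of that growth, assuming JDK 1.8 and a local file named sample.jpg (both purely illustrative, not the service's real input):

import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class Base64SizeDemo {
    public static void main(String[] args) throws Exception {
        // read the raw image bytes and encode them, as the business service does
        byte[] imageBytes = Files.readAllBytes(Paths.get("sample.jpg"));
        String imgBase64 = Base64.getEncoder().encodeToString(imageBytes);
        // for a ~306 KB image this prints a string length in the 400,000+ range
        System.out.println(imageBytes.length + " bytes -> " + imgBase64.length() + " base64 chars");
    }
}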

So even a normal request consumes a lot of memory. The larger the image, the longer the algorithm takes; after a timeout Feign retries, fails again, and a vicious circle forms (fortunately there is a limit on the number of retries, otherwise the consequences of it looping indefinitely would be unthinkable).
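One way to break that circle, assuming Spring Cloud OpenFeign is in use (a sketch, not necessarily the author's fix), is to stop Feign from automatically re-sending a request that has already timed out:

import feign.Retryer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class FeignRetryConfig {

    @Bean
    public Retryer feignRetryer() {
        // never retry automatically; let the caller decide how to handle a timeout
        return Retryer.NEVER_RETRY;
    }
}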

As shown in the figure, the heap data was captured after the JVM OOM; the largest image base64 string is about 5 MB.

The default value of the JDK 1.8 JVM parameter PretenureSizeThreshold is 2 MB.

When a base64 string exceeds 2 MB, it is allocated directly in the old generation, which adds to the old generation's memory pressure and leads to frequent Full GCs.
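To confirm the frequent Full GCs, assuming JDK 1.8 and the PID from the earlier example, the standard tools are enough; the sampling interval, log path, and flags below are illustrative:

# sample heap occupancy and GC counts every second
sudo jstat -gcutil 9999 1000

# or start the service with GC logging enabled (JDK 8 style flags)
java -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/tmp/gc.log -jar app.jar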

Why use base64 to transfer images in the first place?

1. Historical reasons.

2. It is relatively simple to develop.

How to optimize it?

1. Increase the Feign request timeout (see the sketch after this list).

2. Upgrade the machine's configuration.

3. Put the image base64 in the request body instead of a URL parameter, to cut the memory overhead of Feign's parameter splicing (also sketched below).
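A minimal sketch of points 1 and 3, assuming Spring Cloud OpenFeign; the timeout values, the ImageRequest DTO, and the endpoint path are illustrative, not the service's real ones:

import feign.Request;
import org.springframework.cloud.openfeign.FeignClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;

@Configuration
class FeignTimeoutConfig {

    @Bean
    Request.Options feignOptions() {
        // point 1: connect timeout 5 s, read timeout 60 s (illustrative values)
        return new Request.Options(5000, 60000);
    }
}

// point 3: send the base64 in the request body instead of as a URL parameter
@FeignClient(value = "image-service")
interface ImageService {

    @PostMapping("/xxx")
    String xxx(@RequestBody ImageRequest request);
}

class ImageRequest {
    private String imgBase64;

    public String getImgBase64() { return imgBase64; }
    public void setImgBase64(String imgBase64) { this.imgBase64 = imgBase64; }
}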

That is all for "how to optimize JVM OOM". I hope this deepens your understanding of the problem; the specific optimizations still need to be verified in practice.
