
How to solve an environment crash caused by "java.lang.OutOfMemoryError" in a Docker container


This article is about how to solve an environment crash caused by "java.lang.OutOfMemoryError" in a Docker container. The editor thinks it is very practical and shares it here; I hope you get something out of it after reading.

Problem description:

At about 2:40 p.m. on 2019-09-16, the environment was found to be malfunctioning and business functions were not working properly.

An investigation was started immediately.

1. The basic service ports were all up and responding normally.

2. Checking the three newly released microservices in the environment showed that all of them were printing this log, at different frequencies:

2019-09-16 14:42:15 INFO [DubboMonitor.java:80]: [DUBBO] Send statistics to monitor zookeeper://192.168.1.101:2181/com.alibaba.dubbo.monitor.MonitorService?anyhost=true&application=dubbo-monitor&check=false&delay=-1&dubbo=crud&generic=false&interface=com.alibaba.dubbo.monitor.MonitorService&methods=lookup,collect&pid=11&revision=monitors&side=provider&timestamp=1568598922300, dubbo version: crud, current host: 10.42.91.223

Because a microservice that has hit an OutOfMemoryError keeps printing logs like these, I set about exporting the three containers' logs to examine them. Two had been exported, but while exporting the third I found that the docker command could no longer be executed: Docker itself had failed.

The first step was to restart the Docker service and restore the business; then came looking into why Docker went down.

Reason analysis: 1. View the /var/log/messages log
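One simple way to do that filtering (the exact pattern and log path may vary by distribution; this is a sketch, not the exact command used in the original investigation):

$ grep docker /var/log/messages | less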

Filtering the docker-related content out of the messages file turned up this information (part of the log):

Sep 16 14:43:07 rancher-node dockerd-current: time="2019-09-16T14:43:07.982713104+08:00" level=error msg="collecting stats for 587cf4938bed5e3172868d85ae41db3af37e9c1a6cd8192f1cfa22a4e969d53b: rpc error: code = 2 desc = fork/exec /usr/libexec/docker/docker-runc-current: cannot allocate memory"
Sep 16 14:45:04 rancher-node journal: Suppressed 1116 messages from /system.slice/docker.service
Sep 16 14:45:05 rancher-node dockerd-current: time="2019-09-16T14:45:05.410928493+08:00" level=info msg="Processing signal 'terminated'"
Sep 16 14:45:05 rancher-node journal: time="2019-09-16T06:45:05Z" level=error msg="Error processing event &events.Message{Status:\"kill\", ID:\"af42628b1354b74d08b195c0064d8c5d760c826626a3ad36501a85c824d2204d\", From:\"prod.locmn.cn/prod/locmn-drols-query-chq:latest\", Type:\"container\", Action:\"kill\", ...} Error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"

In other words, by 14:45 Docker could no longer allocate memory.

Then the 'terminated' signal was processed and the Docker daemon process was killed, so the docker command could no longer be run and all the Docker containers went down together:

Sep 16 14:45:05 rancher-node dockerd-current: time="2019-09-16T14:45:05.410928493+08:00" level=info msg="Processing signal 'terminated'"
Sep 16 14:45:05 rancher-node journal: time="2019-09-16T06:45:05Z" level=error msg="Error processing event &events.Message{Status:\"kill\", ID:\"af42628b1354b74d08b195c0064d8c5d760c826626a3ad36501a85c824d2204d\", From:\"registry.locman.cn/sefon-online/locman-drools-query-chq:latest\", Type:\"container\", Action:\"kill\", ...} error: Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?"

But why can't memory be allocated?

Checking the host, there were still 10 GB of free memory.

2. View the business log

A close examination of the business logs of the three newly released microservices found that at 14:03, a service called crud had hit a "java.lang.OutOfMemoryError":

2019-09-16 14:03:15 ERROR [ExceptionFilter.java:87]: [DUBBO] Got unchecked and undeclared exception which called by 10.42.83.124. Service: com.run.locman.api.crud.service.AlarmInfoCrudService, method: add, exception: java.lang.OutOfMemoryError: unable to create new native thread, dubbo version: crud, current host: 10.42.91.223
java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    ... (omitting some log contents)
Exception in thread "pool-1-thread-3" java.lang.OutOfMemoryError: unable to create new native thread
    at java.lang.Thread.start0(Native Method)
    at java.lang.Thread.start(Thread.java:714)
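"unable to create new native thread" usually means the process has hit an OS-level thread or process limit, not that the Java heap is full, which is consistent with the host still showing free memory. Two quick checks worth running (standard Linux tools, not from the original troubleshooting notes; the <pid> placeholder stands for the Java process ID):

$ ps -o nlwp= -p <pid>   # number of native threads the JVM currently holds
$ ulimit -u              # per-user limit on processes/threads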

It turned out that the memory overflow in the crud service led to the Docker daemon being killed, and all the Docker containers went down in an instant!

Summary of the cause of failure

Analyzing the failure together with the developers turned up two causes:

1. This service has a thread pool. The core size set in the code is 8, the maximum is 2147483647 (Integer.MAX_VALUE), and an idle thread is only reclaimed after 1 minute. There are two problems here (see the sketch after this list):

First, if the business keeps sending requests, the service keeps creating threads; because the configured maximum is so large, this amounts to creating threads without limit, which steadily consumes resources.

Second, a finished thread is not reclaimed until a full minute has passed, which is too long.

Under the combined effect of these two points, after running for a while the program will have created a huge number of threads and consumed an excessive amount of memory.
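As a minimal Java sketch of the two configurations described above (the class name, queue choices, and the bounded numbers are illustrative assumptions, not the service's actual code):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolConfigDemo {
    // Roughly the problematic shape: core 8, max 2147483647
    // (Integer.MAX_VALUE), 60-second keep-alive. Under sustained load,
    // each task that finds no idle thread spawns a new native thread.
    static final ExecutorService risky = new ThreadPoolExecutor(
            8, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS,
            new SynchronousQueue<>());

    // A bounded alternative: a hard cap on threads, a finite queue, a
    // shorter keep-alive, and back-pressure (the caller runs the task)
    // once the queue fills, instead of unlimited thread creation.
    static final ExecutorService bounded = new ThreadPoolExecutor(
            8, 64, 10L, TimeUnit.SECONDS,
            new ArrayBlockingQueue<>(1000),
            new ThreadPoolExecutor.CallerRunsPolicy());
}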

2. The Docker containers were not given any memory limits at the outset, so by default the resources a container may use are unlimited.

That is, a container can use as much as the host's kernel scheduler is able to give it. So when the host finds itself short of memory, it too throws a memory overflow error and starts killing processes to free up memory. The scary part is that any process can be hunted down by the kernel, including the docker daemon and other important programs on the host. More dangerous still, if a process that the system itself depends on is killed, the whole system crashes.

This time, it was the Docker daemon process that was killed.
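A quick way to see whether a container has any memory limit at all is to inspect its HostConfig (the container name below is a placeholder); a value of 0 means unlimited:

$ docker inspect --format '{{.HostConfig.Memory}}' <container>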

Solution

1. The developers optimized the code, limiting the thread pool's maximum size and the thread reclaim time, and released a patch. Subsequent observation showed that this kind of problem no longer appeared.

2. Limit Docker memory. The containers were reconfigured with memory limits, reducing the risk of containers overusing host resources.

3. Strengthen monitoring and alerting for Docker containers.
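As one possible building block for such monitoring (a suggestion, not necessarily what was deployed here), the Docker CLI can stream out-of-memory events as they happen:

$ docker events --filter 'event=oom'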

Summary

1. Limiting memory for Docker containers is very important!

2. How to limit memory (the steps below are adapted from a write-up by someone else):

Method 1: set it statically with -m

The -m parameter limits the maximum memory a Docker container may use.

For example: $ docker run -it -m 300m --memory-swap -1 --name con1 u-stress /bin/bash

The docker run command above uses the -m option to cap the container's memory at 300 MB.

At the same time, --memory-swap is set to -1, which means the container's memory is limited but the swap space it may use is not (the container can use as much swap as the host can provide).
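To verify the limit took effect, one way (assuming a cgroup v1 host; cgroup v2 exposes memory.max instead) is to read the value from inside the container started above:

$ docker exec con1 cat /sys/fs/cgroup/memory/memory.limit_in_bytes
314572800

314572800 bytes is exactly 300 MB (300 × 1024 × 1024).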

Method 2: change it dynamically with docker update

docker update modifies a running container's memory limit on the fly.

For example, to limit the memory of a container running GitLab to 2048 MB:

$ docker update --memory 2048m --memory-swap -1 gitlab
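The new limit can then be confirmed with a one-shot stats call, whose MEM USAGE / LIMIT column should now show the 2 GB cap (the gitlab name follows the example above):

$ docker stats --no-stream gitlab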

The above is how to solve an environment crash caused by "java.lang.OutOfMemoryError" in a Docker container. There may be knowledge points here that you will meet or use in your daily work; I hope you got something more out of this article.
