This article walks through an example analysis of faults and exceptions in a Docker swarm cluster. It is fairly detailed and should be a useful reference for interested readers.
The details are as follows:
After the previous docker swarm cluster failure, we upgraded Docker from 17.10.0-ce to the then-latest stable version, 17.12.0-ce.
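For context, a minimal sketch of how the engine version on each node can be confirmed after such an upgrade (run from a manager node; the --format placeholders assume a reasonably recent Docker CLI):

# list every node's hostname and reported engine version
docker node ls --format "{{.Hostname}}: {{.EngineVersion}}"
# check the local engine version on a single node
docker version --format "{{.Server.Version}}"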
After 22:00 two nights ago, CPU usage on two nodes in the cluster suddenly began to fluctuate. Later, in the dead of night in the early hours of the morning, when traffic was very low, the whole cluster failed and every site hosted on the cluster returned 502 errors. After a while it recovered on its own.
ECS instance swarm1-node5: CPU usage alarm occurred at 00:52, value 96.14%, lasting 0 minutes.
. . .
Yesterday morning, we found that container applications on some nodes were responding a little slowly, so we force-restarted those nodes through the Alibaba Cloud console, and they returned to normal.
When we updated an application on the cluster this morning (deploying a new image), a strange problem appeared. The application is deployed from the manager node swarm1-node1, and after deployment its containers run on other nodes. Strangely, the site served by the container could only be accessed normally through the swarm1-node1 node; access through any other node returned 503. Deleting the application with the docker stack rm command and redeploying it did not help.
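For reference, the remove-and-redeploy steps look roughly like the following sketch; the stack name my_stack and the compose file docker-compose.yml are placeholders, not the actual names used in our cluster:

# remove the stack and its services (run on a manager node)
docker stack rm my_stack
# redeploy the stack from its compose file, pulling the new image
docker stack deploy -c docker-compose.yml my_stack
# see which nodes the swarm scheduled the tasks onto
docker stack ps my_stack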
At the time, the two docker-flow-proxy (routing application) containers were both running on the swarm1-node1 node. Judging from the symptoms, communication between the docker-flow-proxy containers on swarm1-node1 and the outside world was normal, and communication over the overlay network (network A) between the docker-flow-proxy containers and the application containers on other nodes was also normal. On the other nodes, external requests were forwarded to the docker-flow-proxy containers normally over the overlay network (network B), but could not then be routed, again over overlay network A, to the corresponding application containers on those nodes.

We really could not make sense of this strange behaviour, but the problem was there and had to be dealt with even if we could not explain it. Since the cause was unclear, we looked at it from another angle: all the other nodes were abnormal while swarm1-node1 was normal, so by the rough principle that the minority obeys the majority, we decided to treat swarm1-node1 as the abnormal one. So we took the swarm1-node1 node offline with the following command:
docker node update --availability drain swarm1-node1
After swarm1-node1 was taken offline, all the other nodes returned to normal; sure enough, swarm1-node1 was the abnormal one.
Once swarm1-node1 was drained, the docker-flow-proxy containers were rescheduled onto other nodes.
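A rough way to confirm the drain and see where the routing tasks ended up, assuming the swarm service is named docker-flow-proxy (adjust to the actual service name in your stack):

# confirm the node's availability is now Drain
docker node inspect swarm1-node1 --format "{{.Spec.Availability}}"
# list which nodes the routing service tasks are now running on
docker service ps docker-flow-proxy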
The problem was thus solved by conjecture.