This article explains in detail how to improve service availability when deploying services on Kubernetes. The content is shared for your reference, and I hope it gives you a better understanding of the topic after reading.
How can we improve the availability of the services we deploy? Kubernetes itself is designed with various failure modes in mind and provides self-healing mechanisms to improve fault tolerance, but some situations can still lead to prolonged unavailability and lower service-availability metrics. Below, drawing on production experience, are some best practices for maximizing service availability.
How to avoid a single point of failure?
K8S is designed on the assumption that nodes are unreliable. The more nodes there are, the higher the probability that some node becomes unavailable due to a hardware or software failure, so we usually deploy multiple replicas of a service and adjust the replicas value according to the actual situation. If the value is 1, there is necessarily a single point of failure; if it is greater than 1 but all replicas are scheduled to the same node, there is still a single point of failure. Sometimes we also have to plan for disasters, such as an entire data center becoming unavailable.
So we not only need a reasonable number of replicas, we also need to spread those replicas across different topology domains (nodes, availability zones) to avoid a single point of failure. This can be achieved with Pod anti-affinity, which comes in two forms: strong (required) anti-affinity and weak (preferred) anti-affinity. For more information, refer to the official documentation on affinity and anti-affinity.
Let's first look at an example of strong anti-affinity, in which the DNS service's replicas are forced to spread across different nodes:
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      topologyKey: kubernetes.io/hostname
labelSelector.matchExpressions specifies the key and values of the labels on the service's Pods, because Pod anti-affinity works by matching the labels of the replica Pods.
topologyKey specifies the topology domain for the anti-affinity rule, i.e. a node label key. kubernetes.io/hostname, used here, means Pods should not be scheduled onto the same node. If you have higher requirements, such as avoiding scheduling into the same availability zone to achieve a multi-zone, multi-active deployment, you can use failure-domain.beta.kubernetes.io/zone. It is usually unnecessary to avoid scheduling into the same region, because the nodes of one cluster are normally all in the same region, and crossing regions adds large latency even over dedicated interconnects, so topologyKey generally does not use failure-domain.beta.kubernetes.io/region. A zone-level variation of the example above is sketched right after this paragraph.
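Purely as a hedged variation of the example above (not shown in the original article), spreading the same replicas across availability zones instead of nodes would only change the topologyKey, assuming the cluster's nodes carry the failure-domain.beta.kubernetes.io/zone label:

affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
        - key: k8s-app
          operator: In
          values:
          - kube-dns
      # zone label instead of hostname: replicas land in different availability zones
      topologyKey: failure-domain.beta.kubernetes.io/zone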
With requiredDuringSchedulingIgnoredDuringExecution, the anti-affinity condition must be satisfied at scheduling time; if no node satisfies it, the Pod is not scheduled onto any node and stays Pending.
If you do not want such a hard condition, you can use preferredDuringSchedulingIgnoredDuringExecution to tell the scheduler to satisfy the anti-affinity condition as far as possible, i.e. weak anti-affinity. If the condition cannot be met, the Pod can still be scheduled onto a node with sufficient resources instead of staying Pending.
Let's look at another example of weak anti-affinity:
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      podAffinityTerm:
        labelSelector:
          matchExpressions:
          - key: k8s-app
            operator: In
            values:
            - kube-dns
        topologyKey: kubernetes.io/hostname
Notice that it differs a little from the strong anti-affinity example: there is an extra weight field, which indicates the priority of this rule, and the matching condition is nested under podAffinityTerm.
How to prevent services from being unavailable during node maintenance or upgrade?
Sometimes we need to maintain or upgrade a node, and before doing so we kubectl drain it. Draining evicts the Pods on the node by deleting them so that they are recreated on other nodes; once the eviction finishes and the Pods have drifted elsewhere, we can safely work on the node.
One problem is that draining a node is a disruptive operation. The eviction works roughly as follows:
Cordon the node (mark it unschedulable so that no new Pods are scheduled onto it).
Delete the Pods on the node.
The ReplicaSet controller detects that the number of Pods has decreased and creates a new Pod, which is scheduled onto another node.
This process is delete-then-recreate, not a rolling update, so if all replicas of a service happen to be on the node being drained, the service may become unavailable during the operation.
Let's look at the circumstances under which eviction can make a service unavailable:
The service has a single point of failure: all its replicas are on the same node, so when that node is drained the service may become unavailable.
The service has no single point of failure, but all of its Pods happen to be deployed on the batch of nodes being drained, so all Pods of the service are deleted at the same time and the service also becomes unavailable.
The service has no single point of failure and is not entirely deployed on the drained nodes, but some of its Pods are deleted during the drain; its processing capacity drops for a short period, the service becomes overloaded, some requests cannot be handled, and availability decreases.
For the first point, we can use the anti-affinity mentioned earlier to avoid a single point of failure.
For the second and third points, we can configure a PDB (PodDisruptionBudget) to prevent all replicas from being deleted at the same time. During eviction, Kubernetes "watches" the current available and desired numbers of replicas, throttles Pod deletion according to the defined PDB, and, once the budget is reached, waits for Pods to start and become Ready on other nodes before evicting more, thus avoiding deleting too many Pods at once and causing unavailability or reduced availability. Two examples are given below.
Example 1 (ensure that at least 90% of the zookeeper replicas are available during eviction):
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  minAvailable: 90%
  selector:
    matchLabels:
      app: zookeeper
Example 2 (ensure that at most one zookeeper replica is unavailable during eviction, which amounts to evicting Pods one by one and waiting for each to be rebuilt on another node):
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: zk-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: zookeeper

How to make the service update smoothly?
Having addressed the loss of availability caused by single points of failure and by node eviction, we still need to consider one more scenario that can reduce availability: rolling updates. Why can even a normal rolling update of a service affect its availability? Let me explain.
Suppose there are calls between services inside the cluster, and the server side undergoes a rolling update. Two awkward situations can occur:
The old replica is destroyed quickly, but kube-proxy on the client's node may still route new connections to the old replica before it has updated its forwarding rules, causing connection errors, typically "connection refused" (the process has stopped and no longer accepts new requests) or "no route to host" (the container has been completely destroyed, and its network interface and IP no longer exist).
When the new replica starts, kube-proxy on the client's node quickly watches the new endpoint, updates its forwarding rules, and routes new connections to the new replica. However, the process inside the container may start slowly (for example a Java process such as Tomcat); while it is still starting, the port is not yet listening and cannot handle connections, which also causes connection errors, usually "connection refused".
For the first case, you can add a preStop hook to the container that sleeps for a while, so that before the Pod is actually terminated there is time for kube-proxy on the clients' nodes to update their forwarding rules, and only then is the container actually destroyed. This way a Pod in Terminating state can keep running normally for a short period; if a new request is forwarded to it because a client node's forwarding rules have not been updated yet, the Pod can still handle it, avoiding connection errors. It may not sound elegant, but it works well in practice. There is no silver bullet in the distributed world, so we can only look for and practice the best achievable solution under the current design. A minimal sketch follows.
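A minimal sketch, assuming an nginx container and illustrative sleep and grace-period values (none of these specifics come from the original article):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      # must be longer than the preStop sleep so the kubelet does not kill the Pod early
      terminationGracePeriodSeconds: 60
      containers:
      - name: nginx
        image: nginx:1.21
        lifecycle:
          preStop:
            exec:
              # keep serving for a while so kube-proxy on client nodes can update forwarding rules
              command: ["/bin/sh", "-c", "sleep 30"]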
For the second case, you can add a readinessProbe to the container, so that the Pod is added to the Service's Endpoints only after the process inside the container has really started; only then does kube-proxy on the client's node update its forwarding rules and let traffic in. This ensures that traffic is forwarded only once the Pod is fully ready, avoiding connection errors. A sketch follows.
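Again only a sketch; the health path, port, and timing values below are illustrative assumptions and should be replaced with whatever your application actually exposes:

containers:
- name: app
  image: my-app:1.0            # hypothetical image
  readinessProbe:
    httpGet:
      path: /healthz           # hypothetical health endpoint served by the application
      port: 8080
    initialDelaySeconds: 5     # give the process time to start listening
    periodSeconds: 5
    failureThreshold: 3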
How to configure health checks properly?
As we all know, configuring health checks for Pods is another way to improve service availability. A readinessProbe (readiness check) prevents traffic from being forwarded to Pods that have not fully started or are in an abnormal state; a livenessProbe (liveness check) lets applications that have deadlocked or hung because of a bug be restarted and recover. However, a poor configuration can cause problems of its own. Based on lessons learned the hard way, here are some guidelines:
Do not use a livenessProbe casually unless you understand the consequences and why you need it; see Liveness Probes are Dangerous.
If you do use a livenessProbe, do not configure it identically to the readinessProbe; give it a looser failureThreshold (see the sketch after this list).
Do not include external dependencies (databases, other Pods, etc.) in the probe logic, to avoid cascading failures caused by jitter.
Business programs should expose an HTTP endpoint for health checks whenever possible and avoid TCP probes, because when the program hangs a TCP probe can still pass (the TCP SYN check of the port is completed in kernel space, so the application layer is unaware of it).
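A hedged sketch of these guidelines; the endpoint, port, and threshold values are illustrative assumptions, the point being only that the livenessProbe is deliberately looser than the readinessProbe and that both use an HTTP endpoint with no external dependencies:

containers:
- name: app
  image: my-app:1.0            # hypothetical image
  readinessProbe:
    httpGet:
      path: /healthz           # hypothetical HTTP endpoint that checks only the process itself
      port: 8080
    periodSeconds: 5
    failureThreshold: 3        # stop routing traffic fairly quickly
  livenessProbe:
    httpGet:
      path: /healthz
      port: 8080
    periodSeconds: 10
    failureThreshold: 10       # restart only after a long, sustained failure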
That is all for this share on how to improve service availability when deploying services on Kubernetes. I hope the content above is helpful. If you found the article useful, feel free to share it so more people can see it.