This article is about how to understand and use Kubernetes StatefulSet. It is quite practical, so I am sharing it here; I hope you get something out of it after reading.
The difference between StatefulSet and Deployment
Deployment is used to deploy stateless services, and StatefulSet is used to deploy stateful services.
So which scenarios actually need StatefulSet? The official recommendation is to use StatefulSet if your application has one or more of the following deployment requirements:
Stable, unique network identity.
Stable and persistent storage.
Ordered, graceful deployment and scaling.
Ordered, graceful deletion and termination.
Ordered, automated rolling updates.
The key word here is "stable": after a Pod is rescheduled (in effect, recreated), it keeps its previous network identity and persistent storage. The network identity consists of the Pod's hostname and its A record in the cluster DNS; it does not guarantee that the Pod's IP stays the same after rescheduling. To keep the Pod IP fixed, you can combine the stable Pod hostname with a custom IPAM that hands out a fixed IP per hostname. With StatefulSet's stable, unique network identity, fixed Pod IPs are easy to achieve; doing the same with a Deployment is much more complicated, because you have to think about the rolling-update parameters (maxSurge, maxUnavailable), the IP addresses wasted by reserving a per-application IP pool, and so on.
So I would add one more StatefulSet usage scenario:
If you need a fixed Pod IP scheme, consider building it on top of StatefulSet first.
Best practices
The storage for a StatefulSet's Pods is best created dynamically through a StorageClass: each Pod gets its own PVC, created from the volumeClaimTemplates defined in the StatefulSet, and each PVC is then automatically bound to a PV provisioned by the StorageClass and mounted into the Pod. This means the StorageClass has to be created in advance. Alternatively, an administrator can create the PVs manually beforehand, as long as the automatically created PVCs can match and bind to those PVs.
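As a minimal sketch of this pattern (the names web, nginx, www and fast-ssd are assumptions for illustration, not from the original article), a StatefulSet with volumeClaimTemplates might look like this:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx              # the headless Service created separately
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx                # must match spec.selector
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        volumeMounts:
        - name: www
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:           # one PVC named www-web-<ordinal> is created per Pod
  - metadata:
      name: www
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: fast-ssd  # assumed pre-created StorageClass
      resources:
        requests:
          storage: 1Gi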
For data safety, when Pods of a StatefulSet are deleted or the StatefulSet is scaled down, Kubernetes does not automatically delete the PVs backing the StatefulSet, and by default those PVs cannot be bound by other PVCs. When you later delete a PV manually, after confirming the data is no longer needed, whether the data itself is removed depends on the PV's reclaim policy. The reclaim policy supports three values:
Retain: the volume must be cleaned up manually.
Recycle: equivalent to rm -rf /thevolume/*. Currently only NFS and HostPath support Recycle.
Delete: the backend storage system deletes the volume itself. EBS, GCE PD, Azure Disk and OpenStack Cinder support Delete, and it is the default for dynamically provisioned volumes.
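As a hedged sketch (the provisioner and NFS server here are placeholder examples), the reclaim policy can be set on a StorageClass for dynamically provisioned PVs, or directly on a manually created PV:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd
provisioner: kubernetes.io/aws-ebs       # example in-tree provisioner
reclaimPolicy: Retain                    # PVs provisioned from this class keep their data
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-data-0
spec:
  capacity:
    storage: 1Gi
  accessModes:
  - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain  # clean up manually after the PVC is released
  nfs:
    server: nfs.example.com              # placeholder NFS server
    path: /exports/data0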
Note:
Be careful when deleting the PVCs of a StatefulSet: first make sure the Pods are fully terminated, then confirm the data in the volumes is no longer needed, and only then consider deleting the PVs. Deleting a PVC may trigger automatic deletion of the bound PV and, depending on the reclaimPolicy configured in the StorageClass, may cause the data in the volume to be lost.
Since we are deploying a stateful application, we need to create the corresponding headless Service ourselves; note that its label selector must match the labels of the StatefulSet's Pods. Kubernetes creates SRV records for the headless Service pointing at all the backend Pods, and KubeDNS picks one of them by round-robin.
In Kubernetes 1.8+, you must make sure the StatefulSet's spec.selector matches .spec.template.metadata.labels, otherwise creation of the StatefulSet fails. Before Kubernetes 1.8, spec.selector defaulted to .spec.template.metadata.labels when left unspecified.
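For reference, a minimal headless Service for the sketch above might look like this (clusterIP: None is what makes it headless; the app: nginx label is assumed to match the Pod template labels):

apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  clusterIP: None        # headless: no virtual IP, DNS returns the Pod records directly
  selector:
    app: nginx           # must match the labels of the StatefulSet's Pods
  ports:
  - name: web
    port: 80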
Before scaling a StatefulSet down, make sure the corresponding Pods are Ready; otherwise, even though the scale-down is triggered, Kubernetes will not actually perform it.
How to understand stable network identification
The "stable network identity" repeatedly emphasized in StatefulSet mainly refers to the hostname of Pods and the corresponding DNS Records.
Hostname: the hostname of a StatefulSet's Pods follows the pattern $(statefulset name)-$(ordinal), where the ordinal runs from 0 to N-1 (N is the desired number of replicas). For example, a StatefulSet named web with 3 replicas gets Pods web-0, web-1 and web-2.
When the StatefulSet controller creates a Pod, it adds the pod name label statefulset.kubernetes.io/pod-name to the Pod, set to the Pod's name, which is also the Pod's hostname.
What is this pod name label good for? It lets us create a separate Service that matches exactly one specified Pod, which is convenient for debugging that Pod individually, among other things.
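For example (a sketch that assumes the StatefulSet from the earlier example is named web), a Service that targets only Pod web-0 can select on that label:

apiVersion: v1
kind: Service
metadata:
  name: web-0-debug
spec:
  selector:
    statefulset.kubernetes.io/pod-name: web-0   # label set by the StatefulSet controller
  ports:
  - port: 80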
DNS Records:
DNS resolution of the headless Service: $(service name).$(namespace).svc.cluster.local resolves, via DNS round-robin, to one of the backend Pods. The SRV records only contain Pods that are Running and Ready; Pods that are not Ready are left out of the SRV records.
DNS resolution of a Pod: $(hostname).$(service name).$(namespace).svc.cluster.local resolves to the Pod with that hostname.
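To illustrate (assuming the StatefulSet web with serviceName nginx in the default namespace), the records resolve roughly like this:

nginx.default.svc.cluster.local        -> one of the Running and Ready Pods (round-robin)
web-0.nginx.default.svc.cluster.local  -> Pod web-0
web-1.nginx.default.svc.cluster.local  -> Pod web-1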
How to understand stable persistent storage
Each Pod gets a corresponding PVC whose name is composed as $(volumeClaimTemplates.name)-$(pod name), so PVCs and Pods map one to one. For example, with a volume claim template named www and a StatefulSet named web, Pod web-0 uses PVC www-web-0.
When a Pod is rescheduled (actually recreated), the PV bound to its PVC is automatically mounted into the new Pod again.
Kubernetes creates N PVCs from the volumeClaimTemplates (N is the desired number of replicas), and PVs are then automatically provisioned for those PVCs by the specified StorageClass.
When the StatefulSet is deleted, even with cascading deletion, the corresponding PVCs are not deleted automatically; they must be deleted manually.
When the StatefulSet is deleted with cascading deletion, or its Pods are deleted directly, the corresponding PVs are not deleted automatically either; they also have to be deleted manually.
Differences from Deployment in deployment and scaling
When a StatefulSet application with N replicas is deployed, the Pods are created strictly in increasing order of ordinal, from 0 to N-1, and each Pod is created only after the previous Pod is Ready.
When a StatefulSet application with N replicas is deleted, the Pods are deleted strictly in decreasing order of ordinal, from N-1 to 0, and each Pod is deleted only after the previous Pod has shut down and been removed completely.
When scaling a StatefulSet application up, each new Pod is created only after the previous Pod is Ready.
When scaling a StatefulSet application down, a Pod is deleted only after the previous Pod has shut down and been deleted successfully.
Note: do not set pod.Spec.TerminationGracePeriodSeconds to 0 for a StatefulSet's Pods.
How to deal with Node network anomalies and other situations
Normally, StatefulSet Controller ensures that there will not be multiple StatefulSet Pods with the same network identity under the same namespace in the cluster.
If the above situation occurs in the cluster, it may lead to fatal problems such as failure of the stateful application to work properly, or even data loss.
So under what circumstances can there be multiple StatefulSet Pods with the same network identity in the same namespace? Consider the case where a Node becomes network-unreachable:
If you use a Kubernetes version before 1.5, when a Node's condition is NetworkUnavailable, the node controller force-deletes the Pod objects on that Node from the apiserver, and the StatefulSet controller then automatically recreates Pods with the same identity on other Ready Nodes. This is actually quite risky: for a period of time there may be multiple StatefulSet Pods with the same network identity, which can keep the stateful application from working properly. So try not to use StatefulSet on versions before Kubernetes 1.5, or only do so if you clearly understand and accept this risk.
If you use Kubernetes 1.5+, when a Node's condition is NetworkUnavailable, the node controller does not force-delete the Pod objects on that Node from the apiserver; their state is marked as Terminating or Unknown in the apiserver, so the StatefulSet controller will not recreate Pods with the same identity on other Nodes. Once you have confirmed that the StatefulSet Pods on that Node are shut down, or at least cut off from the rest of the StatefulSet's Pods on the network, you need to force-delete those unreachable Pod objects in the apiserver; the StatefulSet controller can then recreate Pods with the same identity on other Ready Nodes, and the StatefulSet continues to work healthily.
So, on Kubernetes 1.5+, how do you force the StatefulSet Pods to be removed from the apiserver? There are three situations:
If the Node is permanently disconnected from the network or shut down, the Pods on it can no longer communicate with other Pods, and removing them will not affect the availability of the StatefulSet application. It is recommended to manually delete that NetworkUnavailable Node from the apiserver; Kubernetes then automatically deletes the Pod objects on it from the apiserver.
If the Node is unreachable because of a split-brain in the cluster network, it is recommended to troubleshoot the network problem and let it recover. Because the Pods' state is already Terminating or Unknown, the kubelet will automatically delete these Pods once it gets that information from the apiserver.
In other cases you may delete these Pods directly from the apiserver, but since you cannot be sure whether the corresponding Pods have actually shut down or whether they no longer affect the StatefulSet application, forced deletion may result in multiple StatefulSet Pods with the same network identity in the same namespace, so try not to use this method.
kubectl delete pods <pod> --grace-period=0 --force
Tips: currently, there are 6 kinds of Node Condition:
Node Condition / Description:
OutOfDisk: True if there is insufficient free space on the node for adding new pods, otherwise False.
Ready: True if the node is healthy and ready to accept pods, False if the node is not healthy and is not accepting pods, and Unknown if the node controller has not heard from the node in the last 40 seconds.
MemoryPressure: True if pressure exists on the node memory, that is, if the node memory is low; otherwise False.
DiskPressure: True if pressure exists on the disk size, that is, if the disk capacity is low; otherwise False.
NetworkUnavailable: True if the network for the node is not correctly configured, otherwise False.
ConfigOK: True if the kubelet is correctly configured, otherwise False.
Pod management policy of StatefulSet
Starting with Kubernetes 1.7, StatefulSet supports the following two Pod management policies (set via .spec.podManagementPolicy):
OrderedReady, the default policy: Pods are deployed, deleted and scaled one by one, in order.
Parallel: all Pods of the StatefulSet are created or deleted in parallel, without waiting for the operation on the previous Pod to succeed before handling the next one. In practice there are very few scenarios where this policy is needed; a configuration sketch follows below.
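As a configuration sketch (the StatefulSet name, labels and image are assumptions), the policy is set in the StatefulSet spec:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web-parallel
spec:
  podManagementPolicy: Parallel   # OrderedReady is the default
  serviceName: nginx
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.21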
Update strategy of StatefulSet
A StatefulSet's update strategy (specified by .spec.updateStrategy.type) supports the following two types:
OnDelete: the controller does not update Pods automatically; a Pod only picks up the new template after you delete it manually, so there is not much more to introduce.
RollingUpdate, the rolling update process is roughly the same as Deployment, with the following differences:
All Pods whose ordinal is greater than or equal to the value specified by partition are rolling-updated.
All Pods whose ordinal is less than partition stay as they are; even if these Pods are recreated, they are created from the original Pod template and are not updated to the latest version.
In particular, if the value of partition is greater than the StatefulSet's desired replica count N, no Pod is rolling-updated at all.
This is equivalent to maxSurge=0, maxUnavailable=1 on a Deployment (although these two settings do not actually exist on StatefulSet).
The rolling update proceeds in order (in reverse), one Pod at a time from ordinal N-1 down to 0; a Pod is recreated only after the previous Pod is Ready, and a Pod is deleted only after the previous Pod has shut down and been removed completely.
Staged updates are supported: some Pods are updated while others are not, with the dividing ordinal specified through .spec.updateStrategy.rollingUpdate.partition (see the sketch below).
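A hedged sketch of the partition setting (assuming the web StatefulSet from the earlier examples, with 3 replicas):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  serviceName: nginx
  replicas: 3
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2          # only web-2 is updated; web-0 and web-1 keep the old template
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.22   # bumping the image triggers the rolling update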
That is an overview of how to understand Kubernetes StatefulSet. Some of these points are likely to come up in daily work; hopefully you can take something away from this article.