What is the core implementation principle of the Kubernetes StatefulSet controller?

2025-04-22 Update From: SLTechnology News&Howtos shulou

Shulou(Shulou.com)06/02 Report--

This article explains the core implementation principle of the Kubernetes StatefulSet controller: first the concerns that come with stateful applications, then how the controller handles them.

1. Basic concepts

First, we introduce some basic things to consider in stateful applications. The next chapter then looks at the key parts of the StatefulSet implementation.

1.1 Stateful and stateless

Applications in day-to-day development fall into two categories: stateful and stateless. Web services are usually stateless: their data comes mainly from back-end storage, caches, and other middleware, and the service itself does not store it. For stateful applications such as Redis and Elasticsearch (es), the data is part of the application itself. A stateful application therefore consists of two parts: the application and its data.

1.2 Consistency and data

Consistency is a common problem in distributed systems. Stateful applications contain data, so do data and strong consistency always go together? Not necessarily. In ZooKeeper, a write is replicated to a majority of the nodes in the cluster through the ZAB protocol, whereas in Kafka the consistency requirements of the design are relatively low. How consistent a stateful application's data must be is therefore determined by the system design for its scenario.

1.3 Identity

In some applications, identity is part of the system itself. In ZooKeeper, the server id influences the outcome of the ZAB election, and in Kafka, partitions are likewise assigned according to the corresponding id.

1.4 Monotonic and ordered updates

A distributed system should at least tolerate partial failures, so that the loss of some nodes does not make the whole system unavailable. For this reason, the default Pod management policy of a StatefulSet in K8s updates Pods one at a time, as safely as possible, rather than starting or stopping all Pods in parallel.

1.5 Scaling and failover

Scaling out and in is simple in K8s: Pods are added or deleted. K8s knows nothing about the application's data, however, such as how to rebalance data after scaling or how to fail over when a node is lost. These are things the stateful application itself has to handle.

2. Core implementation

The overall flow of the StatefulSet implementation is relatively simple. The sections below walk through it in turn: Pod management, state calculation, state management, and update strategy.

2.1 Releasing and adopting Pods

Pod names in a StatefulSet follow a fixed rule, and the name itself carries meaning. When K8s syncs a StatefulSet, it first filters the Pods that belong to that StatefulSet and corrects their ownership as follows.

A controller is associated with its Pods in K8s through two things: controllerRef and labels. During filtering, if a Pod's controllerRef points to the current StatefulSet but its labels or name do not match, the StatefulSet tries to release the Pod.

Conversely, if a Pod's labels and name match but it has no controllerRef (it is an orphan), its controllerRef is set to the current StatefulSet. This is called adopt. A Pod whose controllerRef points to a different controller is left alone.

Through this process, every Pod that matches the StatefulSet ends up either owned by it or released from it. Ownership stays consistent, and even if someone modifies a Pod by hand, it is eventually reconciled back to a consistent state.
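The release/adopt decision can be sketched as a small pure function. This is a minimal illustration, not the controller's actual code: the `Pod` struct, the `Action` values, and the helpers `isMember`, `selects`, and `claim` are hypothetical names, and ownership is reduced to a single UID string, with the empty string standing for an orphan.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Pod is a simplified, hypothetical stand-in for a Kubernetes Pod.
type Pod struct {
	Name     string
	Labels   map[string]string
	OwnerUID string // UID from controllerRef; "" means the Pod is an orphan
}

// Action is the outcome of the claim decision for one Pod.
type Action string

const (
	Keep    Action = "keep"
	Release Action = "release"
	Adopt   Action = "adopt"
	Ignore  Action = "ignore"
)

// isMember reports whether podName has the form "<setName>-<ordinal>".
func isMember(setName, podName string) bool {
	i := strings.LastIndex(podName, "-")
	if i < 0 || podName[:i] != setName {
		return false
	}
	_, err := strconv.ParseUint(podName[i+1:], 10, 31)
	return err == nil
}

// selects reports whether every selector label is present on the Pod.
func selects(selector, labels map[string]string) bool {
	for k, want := range selector {
		if got, ok := labels[k]; !ok || got != want {
			return false
		}
	}
	return true
}

// claim decides what the set identified by setUID does with Pod p.
func claim(setUID, setName string, selector map[string]string, p Pod) Action {
	match := selects(selector, p.Labels) && isMember(setName, p.Name)
	switch {
	case p.OwnerUID == setUID && match:
		return Keep // already ours and still matching
	case p.OwnerUID == setUID:
		return Release // ours, but labels or name no longer match
	case p.OwnerUID == "" && match:
		return Adopt // matching orphan
	default:
		return Ignore // owned by someone else, or a non-matching orphan
	}
}

func main() {
	sel := map[string]string{"app": "web"}
	pods := []Pod{
		{Name: "web-0", Labels: map[string]string{"app": "web"}, OwnerUID: "uid-1"},
		{Name: "web-1", Labels: map[string]string{"app": "db"}, OwnerUID: "uid-1"},
		{Name: "web-2", Labels: map[string]string{"app": "web"}},
	}
	for _, p := range pods {
		fmt.Printf("%-6s -> %s\n", p.Name, claim("uid-1", "web", sel, p))
	}
}
```

Keeping the decision free of side effects makes the four outcomes easy to check one by one; in a real controller, release and adopt would each be a patch to the Pod's owner references.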

2.2 Classification of replicas

After ownership has been corrected in the first step, the StatefulSet traverses all of its Pods and divides them into two categories: valid replicas and invalid replicas (condemned). As mentioned earlier, Pod names are ordered: a StatefulSet with N replicas has Pods with ordinals {0...N-1}, each named <StatefulSet name>-<ordinal>. Valid and invalid are distinguished by this ordinal: a Pod whose ordinal is greater than or equal to the current replica count is invalid.
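As a rough sketch of this classification, assuming Pods are represented only by their names: `ordinalOf` and `classify` are hypothetical helpers written for illustration, and a missing valid replica is represented by an empty string in its slot.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// ordinalOf extracts the trailing ordinal from a Pod name such as "web-2".
// It returns -1 when the name does not end in "-<non-negative number>".
func ordinalOf(name string) int {
	i := strings.LastIndex(name, "-")
	if i < 0 {
		return -1
	}
	n, err := strconv.ParseUint(name[i+1:], 10, 31)
	if err != nil {
		return -1
	}
	return int(n)
}

// classify splits Pod names into valid replicas, indexed by ordinal
// ("" marks a missing Pod), and condemned Pods sorted by ordinal.
func classify(names []string, replicas int) (valid, condemned []string) {
	valid = make([]string, replicas)
	for _, name := range names {
		ord := ordinalOf(name)
		switch {
		case ord < 0:
			// Not a "<set>-<ordinal>" name; skipped in this sketch.
		case ord < replicas:
			valid[ord] = name
		default:
			condemned = append(condemned, name)
		}
	}
	sort.Slice(condemned, func(a, b int) bool {
		return ordinalOf(condemned[a]) < ordinalOf(condemned[b])
	})
	return valid, condemned
}

func main() {
	valid, condemned := classify([]string{"web-4", "web-0", "web-2", "web-3"}, 3)
	fmt.Printf("valid=%q condemned=%q\n", valid, condemned)
}
```

Indexing the valid slice by ordinal makes gaps visible directly: an empty slot is a replica that still has to be created.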

2.3 Monotonic updates

Monotonic update means that, when the Pod management policy is not parallel, the controller stops and waits as soon as any Pod among the current replicas (valid replicas) is being created, is terminating, or is not ready. In other words, before the StatefulSet moves on to the next Pod, the Pods before it must already be RunningAndReady.

    // allowsBurst reports whether the set's Pod management policy permits
    // creating and deleting Pods in parallel.
    func allowsBurst(set *apps.StatefulSet) bool {
        return set.Spec.PodManagementPolicy == apps.ParallelPodManagement
    }
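To show how such a burst check gates the loop over replicas, here is a simplified sketch. The `Pod` struct, the `Policy` type, and the `plan` function are assumptions made for illustration; only the policy names "OrderedReady" and "Parallel" mirror the real API values.

```go
package main

import "fmt"

// Pod is a hypothetical, reduced view of a replica; a nil *Pod means missing.
type Pod struct {
	Ready       bool
	Terminating bool
}

// Policy mirrors the two Pod management policy names.
type Policy string

const (
	OrderedReady Policy = "OrderedReady"
	Parallel     Policy = "Parallel"
)

// allowsBurst reports whether Pods may be handled in parallel.
func allowsBurst(p Policy) bool { return p == Parallel }

// plan walks the replicas in ordinal order and lists what one sync pass
// would do. Under OrderedReady it stops at the first unhealthy replica.
func plan(policy Policy, replicas []*Pod) []string {
	var actions []string
	for ord, p := range replicas {
		switch {
		case p == nil:
			actions = append(actions, fmt.Sprintf("create %d", ord))
		case p.Terminating:
			actions = append(actions, fmt.Sprintf("wait for %d to terminate", ord))
		case !p.Ready:
			actions = append(actions, fmt.Sprintf("wait for %d to become ready", ord))
		default:
			continue // healthy Pod, nothing to do
		}
		if !allowsBurst(policy) {
			return actions // monotonic: handle one replica, then wait
		}
	}
	return actions
}

func main() {
	replicas := []*Pod{{Ready: true}, nil, {Ready: false}}
	fmt.Println(plan(OrderedReady, replicas))
	fmt.Println(plan(Parallel, replicas))
}
```

With the ordered policy the pass yields at most one action, which is what makes the rollout monotonic; with the parallel policy every unhealthy replica is handled in the same pass.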

2.4 Rolling updates based on counters

The implementation of rolling update is rather subtle: it works by controlling the replica counts. The controller checks Pods in reverse ordinal order to see whether each is at the latest revision. If one is not, it deletes that Pod and decrements the currentReplicas count by one. On the next sync, the controller finds that the Pod no longer exists and generates a new Pod for that ordinal. Because the ordinal is no longer below the currentReplicas count, the newer revision is used.
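The counter mechanism described above can be simulated in a few lines. This sketch makes strong simplifying assumptions: Pods become ready instantly, there is no partition, and the `Set` struct and `syncOnce` function are hypothetical names that only model the revision of each ordinal and the currentReplicas counter.

```go
package main

import "fmt"

// Set is a toy model of a StatefulSet during a rolling update.
type Set struct {
	Revisions       []string // revision per ordinal; "" means the Pod is missing
	CurrentReplicas int      // Pods still counted at the current (old) revision
}

// syncOnce performs one simplified sync pass and returns the single action
// it took, or "" once every Pod runs the update revision.
func syncOnce(s *Set, current, update string) string {
	// Recreate a missing Pod first, in ascending ordinal order.
	for ord, rev := range s.Revisions {
		if rev == "" {
			chosen := update
			if ord < s.CurrentReplicas {
				chosen = current // ordinal still counted as "current"
			}
			s.Revisions[ord] = chosen
			return fmt.Sprintf("create %d at %s", ord, chosen)
		}
	}
	// Then delete the highest-ordinal Pod that is not at the update revision.
	for ord := len(s.Revisions) - 1; ord >= 0; ord-- {
		if s.Revisions[ord] != update {
			s.Revisions[ord] = ""
			s.CurrentReplicas--
			return fmt.Sprintf("delete %d", ord)
		}
	}
	return ""
}

func main() {
	s := &Set{Revisions: []string{"v1", "v1", "v1"}, CurrentReplicas: 3}
	for {
		action := syncOnce(s, "v1", "v2")
		if action == "" {
			break
		}
		fmt.Println(action)
	}
	fmt.Println("final:", s.Revisions)
}
```

Each deletion lowers the counter, so the ordinal that was just deleted falls outside the "current" range and is recreated from the update revision on the next pass.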

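The Pod for a given ordinal is generated by the following function, which chooses between the current and the updated revision:

    func newVersionedStatefulSetPod(currentSet, updateSet *apps.StatefulSet, currentRevision, updateRevision string, ordinal int) *v1.Pod {
        // If the ordinal is below the current replica count (or below the
        // partition), the Pod has not been updated yet. Its template may still
        // need to be regenerated for other reasons, in which case the old
        // (current) configuration is used.
        if currentSet.Spec.UpdateStrategy.Type == apps.RollingUpdateStatefulSetStrategyType &&
            (currentSet.Spec.UpdateStrategy.RollingUpdate == nil && ordinal < int(currentSet.Status.CurrentReplicas)) ||
            (currentSet.Spec.UpdateStrategy.RollingUpdate != nil && ordinal < int(*currentSet.Spec.UpdateStrategy.RollingUpdate.Partition)) {
            pod := newStatefulSetPod(currentSet, ordinal)
            setPodRevision(pod, currentRevision)
            return pod
        }
        // Otherwise generate the Pod from the new (update) configuration.
        pod := newStatefulSetPod(updateSet, ordinal)
        setPodRevision(pod, updateRevision)
        return pod
    }

2.5 Cleanup of invalid replicas

Invalid replicas are cleaned up mainly when the StatefulSet is scaled down. If a replica is found to be condemned, it is deleted directly. By default this also follows the monotonicity principle: only one replica is handled at a time.

2.6 Monotonic updates based on deletion

    if getPodRevision(replicas[target]) != updateRevision.Name && !isTerminating(replicas[target]) {
        klog.V(2).Infof("StatefulSet %s/%s terminating Pod %s for update",
            set.Namespace, set.Name, replicas[target].Name)
        err := ssc.podControl.DeleteStatefulPod(set, replicas[target])
        status.CurrentReplicas--
        return &status, err
    }

The Pod revision check sits at the very end of the consistency sync. When execution reaches this point, the StatefulSet has satisfied monotonicity and every Pod among the valid replicas is RunningAndReady. The controller then checks revisions in reverse ordinal order. If a Pod's revision does not match, it is deleted, with the current partition value bounding which ordinals may be updated. The deletion triggers the corresponding event, which in turn triggers the next sync and the next round of consistency checks.

2.7 OnDelete strategy

    if set.Spec.UpdateStrategy.Type == apps.OnDeleteStatefulSetStrategyType {
        return &status, nil
    }

Besides RollingUpdate, StatefulSet has one more update strategy, OnDelete, under which a Pod must be deleted manually to trigger the consistency check. If you want to update only specific ordinals of a StatefulSet, you can try this strategy: delete only the Pods at those ordinals, and only they will be updated to the latest revision.

2.8 State storage

State storage is what we usually call the PVC. When a Pod is created or updated, if the corresponding PVC does not exist, it is created from the configuration in the StatefulSet, and the Pod's configuration is updated accordingly.

3. Summary of stateful applications

The analysis of the core implementation shows that support for stateful applications is essentially a combination of consistent state, monotonic updates, and persistent storage. Consistent state and monotonic updates ensure that the desired number of Pods are RunningAndReady and kept in order, while persistent storage preserves the data.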

Order matters. Two common designs in distributed systems are partitions and replicas: replicas mainly ensure availability, while partitions mainly spread data evenly, and both are usually allocated according to the nodes in the current cluster. If a node goes offline temporarily, for example for an upgrade, its data remains in the corresponding PVC. Once it recovers, it can quickly restore its state and rejoin the cluster. So if you develop a distributed application of this kind, you can hand the underlying recovery and management to K8s and keep the data in PVCs. The application then only needs to focus on cluster management and data distribution. This is one of the changes that the cloud brings.
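A small sketch shows why a recreated Pod finds its old data again: the claim name is derived from the stable ordinal, following the `<template>-<set>-<ordinal>` convention that StatefulSet PVCs use (for example `www-web-0`). The `Claims` map and the `ensureClaim` helper are hypothetical stand-ins for the API server and the controller's create-if-missing step.

```go
package main

import "fmt"

// podName returns the stable Pod name for an ordinal: "<set>-<ordinal>".
func podName(set string, ordinal int) string {
	return fmt.Sprintf("%s-%d", set, ordinal)
}

// claimName returns the PVC name for a volume claim template and ordinal:
// "<template>-<set>-<ordinal>".
func claimName(template, set string, ordinal int) string {
	return fmt.Sprintf("%s-%s-%d", template, set, ordinal)
}

// Claims is a toy registry of existing PVC names.
type Claims map[string]bool

// ensureClaim creates the claim if it is missing and reports whether it did.
func ensureClaim(claims Claims, name string) bool {
	if claims[name] {
		return false // claim already exists and is reused as-is
	}
	claims[name] = true
	return true
}

func main() {
	claims := Claims{}
	for round := 1; round <= 2; round++ {
		// Round 2 models the Pod being recreated after an upgrade or failure.
		name := claimName("data", "zk", 1)
		created := ensureClaim(claims, name)
		fmt.Printf("round %d: pod=%s claim=%s created=%v\n",
			round, podName("zk", 1), name, created)
	}
}
```

Because the name depends only on the template, the set, and the ordinal, the second round reuses the existing claim instead of creating a new one, which is what lets the node recover its data and rejoin the cluster.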

