2025-03-28 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains StatefulSet, Kubernetes' orchestration tool for stateful applications, and how it compares with Deployment. The content is fairly detailed; interested readers can use it as a reference, and I hope it is helpful to you.
I. The "stateful" requirement
We previously discussed Deployment as an application orchestration and management tool. What capabilities does it provide?
As shown in the following figure:
First, it supports defining an expected number of Pods, and the Controller maintains that number of Pods at the expected version for us.
Second, it supports configuring a rollout strategy for Pods. Once configured, the Controller updates the Pods according to the given policy, and during the update it keeps the number of unavailable Pods within the range we defined.
Third, if we run into problems during a release, Deployment also supports one-click rollback.
In a nutshell, Deployment treats all Pods of the same version it manages as identical replicas. In other words, from the Deployment Controller's point of view, all Pods of the same version are exactly the same, regardless of the application or behavior deployed in them.
This capability is sufficient for stateless applications. But what if we encounter stateful applications?
Demand analysis
For example, some of the requirements shown in the following figure:
None of these needs can be met by Deployment, so the Kubernetes community provides a resource called StatefulSet to manage stateful applications.
StatefulSet: a controller for stateful application management
In fact, many stateless applications in the community are also managed through StatefulSet. Through this article, we will understand why some stateless applications are managed with StatefulSet as well.
As shown on the right side of the figure above, the Pods in a StatefulSet are numbered, starting from 0 up to the defined number of replicas minus one. Each Pod has its own network identity (a hostname) plus a separate PVC and PV storage. In this way, different Pods under the same StatefulSet have different network identities and their own storage disks, which satisfies the needs of most stateful applications.
As shown on the right side of the above figure:
First, each Pod has an ordinal number, and Pods are created, deleted, and updated according to that ordinal.
Second, by configuring a headless Service, each Pod gets a unique network identity (hostname).
Third, by configuring a PVC template (volumeClaimTemplates), each Pod gets one or more PV storage disks.
Finally, grayscale release of a certain number of Pods is supported. For example, if a StatefulSet currently has three replicas, we can specify that only one, two, or even all three of them are upgraded to the new version, thereby achieving a grayscale upgrade.
II. Use case interpretation of StatefulSet example creation
On the left side of the figure above is a Service configuration. By configuring a headless Service, we want the Pods in the StatefulSet to have independent network identities. The Service here is named nginx.
On the right side of the figure above is a StatefulSet configuration, whose spec contains serviceName: nginx. This serviceName specifies which Service the StatefulSet corresponds to.
The spec also contains several familiar fields, such as selector and template. selector is a label selector, and the selection logic it defines must match the labels in the template's metadata, which contain app: nginx. template defines an nginx container whose image is the alpine version, exposing port 80 as a web service.
Finally, template.spec defines a volumeMounts entry, which comes not from volumes in the spec but from volumeClaimTemplates, i.e., the PVC template. In the PVC template we define a PVC named www-storage. This PVC name is also written into volumeMounts as a volume name and mounted to the /usr/share/nginx/html directory. In this way, each Pod gets a separate PVC mounted into the corresponding directory in the container.
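Since the original figure is not reproduced here, a minimal manifest putting these pieces together might look like the following. The names nginx, nginx-web, and www-storage come from the text; the exact image tag and storage size are assumptions:

```yaml
# Headless Service: clusterIP None gives each Pod a stable DNS identity.
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  clusterIP: None
  selector:
    app: nginx
  ports:
  - name: web
    port: 80
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nginx-web
spec:
  serviceName: nginx          # ties the StatefulSet to the headless Service above
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx            # must match the selector
    spec:
      containers:
      - name: nginx
        image: nginx:alpine   # "the alpine version" per the text; exact tag is an assumption
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: www-storage   # comes from the PVC template below, not spec.volumes
          mountPath: /usr/share/nginx/html
  volumeClaimTemplates:
  - metadata:
      name: www-storage
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 1Gi        # assumed size
```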
Service and StatefulSet status
After creating the two objects above, we can see that the Service nginx resource has been created successfully through the get command.
At the same time, by looking at the endpoints, you can see that three IPs and ports have been registered for the backend. The three IPs correspond to the Pods' IPs, and the port corresponds to port 80 configured in the spec above.
Finally, run get sts (the abbreviation of StatefulSet) nginx-web. The result has a column called READY with the value 3/3. The denominator 3 is the desired number of replicas in the StatefulSet, while the numerator 3 is the number of Pods that have reached the READY state.
Pod and PVC status
From get pod in the figure below, you can see that all three Pods are in the Running state and READY, and their IPs are the endpoint addresses seen earlier.
Through get pvc, you can see that each NAME has the prefix www-storage, nginx-web in the middle, and an ordinal number as the suffix. www-storage is the name defined in volumeClaimTemplates, the middle part is the StatefulSet's name, and the trailing ordinal corresponds to the Pod's ordinal; that is, the three PVCs are bound by the three Pods respectively. Each Pod thus has its own PVC, and each PVC binds a corresponding PV, so different Pods end up bound to different PVs.
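The naming convention described above is deterministic, so with the names from this example the mapping works out as (a sketch; the pattern is volumeClaimTemplate name, then StatefulSet name, then ordinal):

```text
www-storage-nginx-web-0   ->  mounted by Pod nginx-web-0
www-storage-nginx-web-1   ->  mounted by Pod nginx-web-1
www-storage-nginx-web-2   ->  mounted by Pod nginx-web-2
```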
The version of Pod
We learned that Deployment uses ReplicaSet to manage Pod versions and the expected number of Pods. In StatefulSet, however, the StatefulSet Controller manages its Pods directly, so StatefulSet uses a label on the Pod, called controller-revision-hash, to identify the version a Pod belongs to. This label is similar to the pod-template-hash that Deployment injects into its Pods.
As shown in the figure above, controller-revision-hash can be seen through get pod. The hash here is the template version of the Pods when they were first created; you can see the suffix is 677759c9b8. Let's note it down, then do a Pod upgrade and see whether the controller-revision-hash changes.
Updating the image
By executing the command in the figure above, the image in the StatefulSet configuration below has been updated to the new version, nginx:mainline.
View the status of the new version
Using the get pod command to check the revision hash, you can see that after the upgrade the controller-revision-hash of all three Pods has changed to a new value, 7c55499668. From the creation times of the three Pods, you can see that the Pod with ordinal 2 is the earliest, followed by ordinals 1 and 0. This means the real upgrade order during the rollout was 2-1-0: the Pods were upgraded to the new version one by one in reverse ordinal order, and each upgraded Pod reused the PVC its predecessor used, so the data previously in the PV storage disk is still mounted in the new Pod.
At the top right of the image above is the data seen in StatefulSet's status. Here are several important fields:
currentReplicas: the number of Pods at the current revision
currentRevision: the name of the current revision
updatedReplicas: the number of Pods at the updated revision
updateRevision: the name of the revision being rolled out
You can also see that currentReplicas equals updatedReplicas, and currentRevision equals updateRevision, which means all Pods have been upgraded to the desired version.
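A sketch of what the relevant part of the status might look like in the output of kubectl get sts nginx-web -o yaml, using the revision hash quoted in the text (the exact revision name is an assumption):

```yaml
status:
  replicas: 3
  readyReplicas: 3
  currentReplicas: 3
  updatedReplicas: 3
  currentRevision: nginx-web-7c55499668
  updateRevision: nginx-web-7c55499668
```

When a rollout is in progress, currentRevision and updateRevision differ, and currentReplicas plus updatedReplicas tells you how far the rollout has progressed.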
III. Operation demonstration: the StatefulSet orchestration file
First, I have connected to an Alibaba Cloud (Aliyun) cluster, which has three nodes.
Now to create a StatefulSet and the corresponding Service, first take a look at the corresponding orchestration file.
As shown in the example in the figure above, the Service named nginx exposes port 80 externally. In the StatefulSet configuration, metadata defines the name nginx-web; template defines the image information for the container; finally, volumeClaimTemplates defines the PVC template.
Start creating
After executing the above commands, the Service and StatefulSet have been created successfully. Through get pod, you can see that the first Pod created has ordinal 0; through get pvc, you can see that the PVC with ordinal 0 has been bound to a PV.
At this point, the Pod with sequence number 0 has been created with a status of ContainerCreating.
After the Pod with ordinal 0 is created, the Pod with ordinal 1 starts to be created; you can then see that its new PVC has also been created successfully, followed by the Pod with ordinal 2.
You can see that each PVC is created before its Pod. After the PVC is created and bound to a PV, the Pod moves from Pending to ContainerCreating, and finally to Running.
Viewing the status
Then check the status of the StatefulSet through kubectl get sts nginx-web -o yaml.
As shown in the figure above, the expected number of replicas is 3, the number currently available is 3, and the latest version is reached.
Next, let's take a look at Service and endpoints. We can see that there are three corresponding IP addresses for the Port of Service.
In the case of get pod, you can see that the three Pod correspond to the IP address of the above endpoints.
The result of the above operations is that three PVCs and three Pods have reached the desired state, and in the status reported by the StatefulSet, replicas and currentReplicas are both 3.
Upgrade operation
To recap: kubectl set image is the standard way to declare a new image; statefulset is the resource type; nginx-web is the resource name; in nginx=nginx:mainline, the nginx before the equals sign is the container name we defined in the template, and nginx:mainline after it is the image version we expect to update to.
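The imperative kubectl set image statefulset nginx-web nginx=nginx:mainline command has a declarative equivalent: edit the container image in the StatefulSet spec and re-apply the manifest. A sketch of the changed fragment:

```yaml
# Fragment of the StatefulSet manifest after the image change;
# re-applying it triggers the same rolling update as `kubectl set image`.
spec:
  template:
    spec:
      containers:
      - name: nginx             # the container name referenced before the "=" above
        image: nginx:mainline   # the new image version
```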
With the above command, the image in StatefulSet has been successfully updated to the new version.
Looking at the status through get pod, nginx-web-1 and nginx-web-2 have entered the Running state, and their controller-revision-hash is already the new version. For nginx-web-0, the old Pod has been deleted, and the new Pod is still in the ContainerCreating state.
Check the status again and see that all Pod are already Running.
Check the StatefulSet information: the currentRevision in its status has been updated to the new version, indicating that all three Pods managed by the StatefulSet have entered the new version.
How to check whether the three Pod also reuse the previous network identity and storage disk?
In fact, the hostname configured through the headless Service is tied only to the Pod name, so as long as the upgraded Pod's name is the same as the old Pod's name, it keeps the network identity the old Pod used.
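Concretely, each Pod behind a headless Service gets a stable DNS name built from its name, so a recreated Pod with the same name resolves to the same identity. A sketch of the pattern, assuming the default namespace and the default cluster domain:

```text
<pod-name>.<service-name>.<namespace>.svc.cluster.local
nginx-web-0.nginx.default.svc.cluster.local
```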
As for the storage disks, you can see the PVC status in the figure above: their creation times have not changed and are still the times from when the Pods were first created, so the upgraded Pods still use the PVCs used by the old Pods.
For example, look at one of the Pods. It also declares volumes; the name www-storage-nginx-web-0 in its persistentVolumeClaim corresponds to the PVC with ordinal 0 seen in the PVC list, which was previously used by the old Pod. During the upgrade, the Controller deletes the old Pod and creates a new Pod with the same name, which reuses the old Pod's PVC.
In this way, the network identity and storage can be reused before and after the upgrade.
IV. Architecture design: the management model
StatefulSet may create three types of resources.
The first resource: ControllerRevision
With this resource, StatefulSet can easily manage different versions of template templates.
For example, the nginx StatefulSet mentioned above creates a ControllerRevision for the first version of its template when it is first created. When the image version is modified, the StatefulSet Controller creates a new ControllerRevision. You can think of each ControllerRevision as corresponding to one version of the template, and likewise to one revision hash. In fact, the controller-revision-hash defined in the Pod's labels is the name of a ControllerRevision. With this resource, the StatefulSet Controller manages the different versions of the template.
The second resource: PVC
If volumeClaimTemplates is defined in the StatefulSet's spec, the StatefulSet Controller creates PVCs from this template before creating each Pod and adds them to the Pod's volumes. If no PVC template is defined in the spec, the created Pods will not mount separate PVs.
The third resource: Pod
StatefulSet creates, deletes, and updates Pod sequentially, with each Pod having a unique serial number.
As shown in the figure above, the StatefulSet Controller owns three resources: ControllerRevision, Pod, and PVC.
The difference here is that the current version of StatefulSet only adds an OwnerReference to ControllerRevision and Pod, not to PVC. As mentioned in earlier articles in this series, resources carrying an OwnerReference are cascade-deleted by default when their owner is deleted. Therefore, after a StatefulSet is deleted, the ControllerRevisions and Pods it created are deleted as well, but the PVCs are not cascade-deleted, because no OwnerReference is written into them.
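A sketch of the metadata of a Pod created by the StatefulSet illustrates this: the ownerReference below is what makes the Pod cascade-delete with its StatefulSet, while the PVCs created from volumeClaimTemplates carry no such reference (the uid here is a placeholder; the revision hash is the one quoted earlier in the text):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx-web-0
  labels:
    app: nginx
    controller-revision-hash: nginx-web-677759c9b8
  ownerReferences:
  - apiVersion: apps/v1
    kind: StatefulSet
    name: nginx-web
    uid: 00000000-0000-0000-0000-000000000000   # placeholder
    controller: true
    blockOwnerDeletion: true
```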
StatefulSet controller
The above picture shows the workflow of the StatefulSet controller. Here is a brief introduction to the entire workflow.
First, changes to StatefulSet and Pod are handled by registering Informer Event Handlers. Every time the Controller receives a change to a StatefulSet or a Pod, it finds the corresponding StatefulSet and puts it into a queue. After taking it out of the queue for processing, the first thing to do is Update Revision: check the template in the current StatefulSet to see whether a corresponding ControllerRevision already exists. If not, the template has been updated, and the Controller creates a new revision with a new controller-revision-hash version number.
Then the Controller gathers all the Pods and sorts them by ordinal. During the sorting, if a Pod with a missing ordinal is found, it is created by ordinal; if an extra Pod is found, it is deleted by ordinal. After ensuring that the number of Pods and their ordinals match the replica count, the Controller checks whether any Pods need to be updated. The difference between these two steps is that "Manage pods in order" checks whether all the Pod ordinals are satisfied, while "Update in order" checks whether the Pods are at the desired version and updates them by ordinal.
The Update-in-order process is shown in the figure above. The process itself is simple: delete the Pod. Deleting the Pod triggers the next reconcile; the Controller then finds that a Pod is missing and creates the new one in the "Manage pods in order" step described above. After that, the Controller does an Update status, producing the status information you saw from the command line.
Through such a process, StatefulSet achieves the ability to manage stateful applications.
Capacity expansion simulation
Assume the StatefulSet's initial replicas is 1 and there is a Pod0. After changing replicas from 1 to 3, Pod1 is created first; by default, the Controller waits for Pod1 to be Ready before creating Pod2.
From the figure above, you can see that the Pods under each StatefulSet are created starting from ordinal 0. So for a StatefulSet whose replicas is N, the Pod ordinals it creates form the interval [0, N): 0 is inclusive and N is exclusive; that is, when N > 0, the ordinals run from 0 to N-1.
Expansion and reduction management strategy
Some students may wonder: what if I don't want to create and delete by ordinal? StatefulSet also supports other creation and deletion logic, which is why some people in the community manage stateless applications with StatefulSet: they get unique network identities and network storage, while still being able to scale up and down concurrently.
There is a field in StatefulSet.spec called podManagementPolicy. Its optional values are OrderedReady and Parallel, with the former as the default.
In the example we just created, no podManagementPolicy is defined in the spec, so the Controller defaults to the OrderedReady policy. Under OrderedReady, scaling is carried out in strict order: you must wait until the previous Pod is Ready before the next Pod is created. When scaling down, Pods are deleted in reverse order, from the largest ordinal to the smallest.
For example, on the right side of the figure above, when scaling from Pod0 up to Pod0, Pod1, and Pod2, Pod1 must be created first, and Pod2 is created only after Pod1 is Ready. There is another possibility: while Pod1 is being created, Pod0 becomes NotReady for some reason, perhaps due to the host or the application itself. In that case, the Controller will not create Pod2; not just the immediately preceding Pod, but all preceding Pods must be Ready before the next Pod is created. In the example in the figure above, to create Pod2, both Pod0 and Pod1 must be Ready.
The other strategy is called Parallel, meaning parallel scaling: there is no need to wait for the previous Pod to be Ready or deleted before processing the next one.
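Opting out of ordered scaling is a one-line change in the spec. A minimal sketch:

```yaml
# With Parallel, Pods are created and deleted concurrently
# rather than one at a time by ordinal. Rolling updates are
# unaffected; the policy only changes scale-up/scale-down behavior.
spec:
  podManagementPolicy: Parallel   # default is OrderedReady
```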
Release simulation
Assume the StatefulSet's template1 corresponds to Revision1, and the three Pods under the StatefulSet are all at version Revision1. After we modify the template, for example the image, the Controller upgrades the Pods one by one in reverse order. In the figure above, the Controller first creates Revision2, which corresponds to creating a ControllerRevision2 resource, and the name of ControllerRevision2 is used as the new revision hash. After Pod2 is upgraded to the new version, Pod1 and then Pod0 are deleted and recreated one by one.
Its logic is simple: during the upgrade, the Controller deletes the eligible Pod with the largest ordinal; then, in the next reconcile, it finds that the Pod with that ordinal is missing and creates it at the new version.
Spec field parsing
First, let's look at the first few fields in the spec. replicas and selector are familiar fields.
Replicas: the expected number of Pods
Selector: a label selector, which must match the labels defined in spec.template.metadata.labels
Template: the Pod template, which defines the basic information for the Pods to be created
VolumeClaimTemplates: a list of PVC templates. If defined in the spec, PVCs are created before the Pod template is instantiated; after creation, each PVC name is injected as a volume into the Pod created from the template
ServiceName: the name of the corresponding Headless Service. If you do not need this feature, you can set it to a non-existent Service name; the Controller does not validate it, so a fake serviceName is allowed. However, it is recommended that every StatefulSet be configured with a Headless Service, regardless of whether its Pods need a network identity
PodManagementPolicy: the Pod management policy. As mentioned earlier, the optional values are OrderedReady and Parallel, with the former as the default
UpdateStrategy: the Pod upgrade strategy. This is a structure, described in more detail below
RevisionHistoryLimit: the limit on the number of historical ControllerRevisions retained (default 10). Note that a version can only be cleaned up if no Pod still belongs to it; if a Pod is on a version, that ControllerRevision cannot be deleted
Analysis of upgrade Policy Field
On the right side of the figure above, you can see that StatefulSetUpdateStrategy has a type field that defines two types: one is RollingUpdate, and the other is OnDelete.
In fact, RollingUpdate is somewhat similar to the upgrade in Deployment, that is, it is upgraded according to the way of rolling upgrade.
OnDelete means upgrading only on deletion; active upgrading is disabled, and the Controller will not proactively upgrade surviving Pods. For example, with three old-version Pods and the OnDelete strategy, updating the image in the spec does not cause the Controller to upgrade the three Pods one by one. Instead, old Pods are removed only when deleted, for example when we scale replicas down; when we scale up again, the Controller creates the new Pods at the new version.
In RollingUpdateStatefulSetStrategy, there is a field called partition. Partition is the number of old-version Pods retained during a rolling upgrade. Students who have just learned StatefulSet often think it is the number of Pods updated to the new version for grayscale, which is wrong.
For example, suppose a StatefulSet has replicas of 10. When we update the version with partition set to 8, it does not mean updating 8 Pods to the new version; it means keeping 8 Pods at the old version and updating only 2 Pods as the grayscale set. With replicas of 10, the Pod ordinals run from 0 to 9, so with partition configured as 8, Pods 0 through 7 stay at the old version, and only Pods 8 and 9 enter the new version.
To sum up, assuming replicas = N and partition = M (M < N), the old-version Pods end up as ordinals [0, M) and the new-version Pods as ordinals [M, N). This partition mechanism achieves grayscale upgrades, which Deployment currently does not support.
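The grayscale scenario above can be expressed directly in the spec. A sketch using the numbers from the example:

```yaml
# With replicas 10 and partition 8, only Pods with ordinal >= 8
# (nginx-web-8 and nginx-web-9) are moved to the new version;
# lowering partition step by step widens the rollout.
spec:
  replicas: 10
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 8
```

Once the grayscale Pods look healthy, setting partition back to 0 lets the rolling update proceed across all replicas.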
Here is a brief summary for you:
StatefulSet is a common Workload in Kubernetes. Its initial goal is to deploy stateful applications, but it also supports deployment of stateless applications.
Unlike Deployment, StatefulSet operates on Pods directly for scaling and releases, rather than through an intermediate workload such as ReplicaSet.
StatefulSet's distinguishing features are an exclusive PVC for each Pod, a unique network identity, and the ability to reuse both the PVC and the network identity after an upgrade release.
That's all on how to understand StatefulSet, the stateful application orchestration tool, and how it differs from Deployment. I hope the above content is helpful and that you can learn more from it. If you think the article is good, you can share it for more people to see.