How to repair Kubernetes cluster certificates after they are all deleted

2025-01-19 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)06/01 Report--

This article introduces what to do when all of a Kubernetes cluster's certificates have been deleted. Many people run into this kind of trouble in practice, so let's work through how to handle it. I hope you read carefully and learn something!

Kubernetes is a very powerful platform, and its architecture lets you cope with all kinds of failures. Today we are going to break our own cluster by deleting its certificates, and then find a way to restore it, all without causing downtime for the services already running on it.

If you really want to follow along, please do not try this in a production environment. In theory it should not cause any service downtime, but if something does go wrong, don't blame me ~~~
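Before experimenting, it is worth copying aside everything we are about to delete. A minimal backup sketch, not part of the original procedure; the helper name and the backup location are my own, and /etc/kubernetes is the kubeadm default path:

```shell
# Hypothetical helper: copy a directory aside before destructive operations.
backup_dir() {
  src=$1
  # timestamped sibling directory, e.g. /etc/kubernetes.bak-20250119120000
  dest="$src.bak-$(date +%Y%m%d%H%M%S)"
  cp -a "$src" "$dest" && echo "$dest"
}

# e.g. backup_dir /etc/kubernetes
```

`cp -a` preserves ownership and permissions, which matters for key files.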

We know that the Kubernetes control plane consists of several components:

etcd: the database for the entire cluster

kube-apiserver: the API server for the cluster

kube-controller-manager: runs the controllers that manage the cluster's resources

kube-scheduler: the core scheduler

kubelet: the agent that runs on every node and actually manages the containers

These components are protected by a set of client and server TLS certificates, used for authentication and authorization between components. In most cases the certificates are not stored in the Kubernetes database (etcd) but as ordinary files:

# tree /etc/kubernetes/pki
/etc/kubernetes/pki/
├── apiserver.crt
├── apiserver-etcd-client.crt
├── apiserver-etcd-client.key
├── apiserver.key
├── apiserver-kubelet-client.crt
├── apiserver-kubelet-client.key
├── ca.crt
├── ca.key
├── CTNCA.pem
├── etcd
│   ├── ca.crt
│   ├── ca.key
│   ├── healthcheck-client.crt
│   ├── healthcheck-client.key
│   ├── peer.crt
│   ├── peer.key
│   ├── server.crt
│   └── server.key
├── front-proxy-ca.crt
├── front-proxy-ca.key
├── front-proxy-client.crt
├── front-proxy-client.key
├── sa.key
└── sa.pub
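Since they are plain files, any of them can be inspected with openssl. A small sketch, assuming openssl is installed; the helper name is mine:

```shell
# Hypothetical helper: print the subject and expiry date of each certificate
# passed as an argument.
cert_expiry() {
  for crt in "$@"; do
    printf '%s\n' "== $crt"
    openssl x509 -in "$crt" -noout -subject -enddate
  done
}

# e.g. cert_expiry /etc/kubernetes/pki/*.crt /etc/kubernetes/pki/etcd/*.crt
```

This is handy both before and after the repair, to confirm which certificates exist and when they expire.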

The control plane components run as static Pods on the master node (the cluster here was deployed with kubeadm), with the default resource manifests located in the /etc/kubernetes/manifests directory. These components constantly communicate with each other.

In order for components to communicate, they need to use TLS certificates. Assuming we have a deployed cluster, let's start our destructive behavior.

rm -rf /etc/kubernetes/

On the master node, this directory contains:

A set of certificates and the CA for etcd (in the /etc/kubernetes/pki/etcd directory)

A set of Kubernetes certificates and the CA (in the /etc/kubernetes/pki directory)

The kubeconfig files used by kube-controller-manager, kube-scheduler, cluster-admin, and the kubelet

The static Pod resource manifests for etcd, kube-apiserver, kube-scheduler, and kube-controller-manager (in the /etc/kubernetes/manifests directory)

Now we have deleted all of the above. If you just did this in a production environment, you may be shivering now ~

Repair Control Plane

First of all, make sure that all of our control plane containers have stopped:

# if you use docker instead, the equivalent docker command works too
crictl rm `crictl ps -aq`

Note: kubeadm does not overwrite existing certificates and kubeconfig files by default; to reissue certificates, you must first manually delete the old ones.

Next, let's restore etcd. Execute the following command to generate a new CA for the etcd cluster:

kubeadm init phase certs etcd-ca

The above command generates a new CA for our etcd cluster. Since all the other etcd certificates must be signed by it, copy it, together with its private key, to the other master nodes as well (if you run multiple masters):

/etc/kubernetes/pki/etcd/ca.{key,crt}

Next, let's regenerate the rest of the etcd certificates and the static Pod manifest for etcd on all master nodes:

kubeadm init phase certs etcd-healthcheck-client
kubeadm init phase certs etcd-peer
kubeadm init phase certs etcd-server
kubeadm init phase etcd local

After executing the above command, you should have a working etcd cluster.

# crictl ps
CONTAINER ID        IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID
ac82b4ed5d83a       0369cf4303ffd       2 seconds ago       Running             etcd                0                   bc8b4d568751b

Next we do the same for the Kubernetes services, executing the following commands on one of the master nodes:

kubeadm init phase certs all
kubeadm init phase kubeconfig all
kubeadm init phase control-plane all
cp -f /etc/kubernetes/admin.conf ~/.kube/config

The above commands generate all of the TLS certificates for Kubernetes, as well as the static Pod manifests and kubeconfig files for the Kubernetes services.

If you used kubeadm to join your kubelets, you also need to update the cluster-info ConfigMap in the kube-public namespace, because it still contains the hash of your old CA:

kubeadm init phase bootstrap-token

Since all certificates on the other master nodes must also be signed by the same CA, copy the following files to the other control plane nodes and repeat the above commands on each of them:

/etc/kubernetes/pki/{ca,front-proxy-ca}.{key,crt}
/etc/kubernetes/pki/sa.{key,pub}

By the way, as an alternative to copying certificates manually, you can also use the Kubernetes API with commands like this:

kubeadm init phase upload-certs --upload-certs

This command encrypts the certificates and uploads them to Kubernetes, where they remain valid for 2 hours, during which you can register master nodes as follows:

kubeadm join phase control-plane-prepare all kubernetes-apiserver:6443 --control-plane --token cs0etm.ua7fbmwuf1jz946l --discovery-token-ca-cert-hash sha256:555f6ececd4721fed0269d27a5c7f1c6d7ef4614157a18e56ed9a1fd031a3ab8 --certificate-key 385655ee0ab98d2441ba8038b4e8d03184df1806733eac131511891d1096be73
kubeadm join phase control-plane-join all
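If you no longer have the join command at hand, the --discovery-token-ca-cert-hash value can be recomputed from the new CA certificate. A sketch, assuming openssl is installed, the path is the kubeadm default, and the CA key is RSA (kubeadm's default):

```shell
# Hypothetical helper: compute the sha256 hash of the CA's public key,
# as expected by --discovery-token-ca-cert-hash.
ca_cert_hash() {
  openssl x509 -pubkey -in "$1" \
    | openssl rsa -pubin -outform der 2>/dev/null \
    | openssl dgst -sha256 -hex \
    | sed 's/^.* //'
}

# e.g. ca_cert_hash /etc/kubernetes/pki/ca.crt
```

Prefix the result with sha256: when passing it to kubeadm join.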

Note that the Kubernetes API also has a ConfigMap that holds the CA certificate for front-proxy clients, which is used to authenticate requests from the apiserver to webhooks and aggregation layer services. kube-apiserver updates it automatically. At this stage, we have a complete control plane again.

Repair Worker Nodes

Now we can list all nodes of the cluster using the following command:

kubectl get nodes

Normally all of the nodes will now be NotReady, because they are still using the old certificates. To fix this, we will use kubeadm to rejoin the nodes to the cluster, starting with the master nodes:

systemctl stop kubelet
rm -rf /var/lib/kubelet/pki/ /etc/kubernetes/kubelet.conf
kubeadm init phase kubeconfig kubelet
kubeadm init phase kubelet-start

But to join the worker nodes, we must generate a new token:

kubeadm token create --print-join-command

Then execute the following commands on the worker node:

systemctl stop kubelet
rm -rf /var/lib/kubelet/pki/ /etc/kubernetes/pki/ /etc/kubernetes/kubelet.conf
kubeadm join phase kubelet-start kubernetes-apiserver:6443 --token cs0etm.ua7fbmwuf1jz946l --discovery-token-ca-cert-hash sha256:555f6ececd4721fed0269d27a5c7f1c6d7ef4614157a18e56ed9a1fd031a3ab8

Note: you do not need to delete the /etc/kubernetes/pki directory on the master nodes, because it already contains all the required certificates.

The above operations rejoin all of your kubelets to the cluster without affecting any containers already running on them. However, if the cluster has multiple nodes and you do not do this simultaneously, you may run into a situation where kube-controller-manager starts recreating containers from the NotReady nodes and tries to reschedule them on the healthy nodes.

To prevent this, we can temporarily disable the controller-manager on the master node.

rm /etc/kubernetes/manifests/kube-controller-manager.yaml
crictl rmp `crictl ps --name kube-controller-manager -q`

Once all nodes have joined the cluster, you can regenerate the static Pod manifest for the controller-manager by running the following command on all master nodes:

kubeadm init phase control-plane controller-manager

If the kubelet is configured to request certificates signed by your CA (the serverTLSBootstrap: true option), you also need to approve the CSRs from your kubelets:

kubectl get csr
kubectl certificate approve <csr-name>

Repair ServiceAccounts

Because we lost /etc/kubernetes/pki/sa.key, which is used to sign the JWT tokens of all ServiceAccounts in the cluster, we have to recreate the token for each ServiceAccount.

This can be done by removing the token field from every Secret of type kubernetes.io/service-account-token:

kubectl get secret --all-namespaces | awk '/kubernetes.io\/service-account-token/ { print "kubectl patch secret -n " $1 " " $2 " -p {\\\"data\\\":{\\\"token\\\":null}}"}' | sh -x

After the tokens are deleted, kube-controller-manager automatically generates new ones signed with the new key. However, not all microservices pick up a new token on the fly, so the containers that use the old tokens will most likely need to be restarted manually.

kubectl get pod --field-selector 'spec.serviceAccountName!=default' --no-headers --all-namespaces | awk '{print "kubectl delete pod -n " $1 " " $2 " --wait=false --grace-period=0"}'

This command generates a list of kubectl delete commands for all pods that use a non-default ServiceAccount. I recommend starting with the kube-system namespace, because kube-proxy and the CNI plugin live there and are critical for the communication between your microservices.
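As an aside, you can check which ServiceAccount a given token belongs to by decoding its JWT payload. A small sketch, assuming a POSIX shell with base64 available; the helper name is mine:

```shell
# Hypothetical helper: decode the payload (second dot-separated part) of a JWT.
jwt_payload() {
  # extract the payload and convert base64url to standard base64
  p=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # restore the base64 padding stripped by the JWT encoding
  case $((${#p} % 4)) in
    2) p="$p==" ;;
    3) p="$p=" ;;
  esac
  printf '%s' "$p" | base64 -d
}

# e.g. jwt_payload "$(kubectl -n kube-system get secret <token-secret> -o jsonpath='{.data.token}' | base64 -d)"
```

The decoded payload contains a "sub" claim like system:serviceaccount:<namespace>:<name>, which makes it easy to confirm that a freshly issued token belongs to the ServiceAccount you expect.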

At this point our cluster recovery is complete.

This concludes the introduction to how to repair Kubernetes cluster certificates after they have all been deleted. Thank you for reading!
