Troubleshooting methods for common anomalies in pod and flannel

2025-02-25 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

This article explains common troubleshooting methods for pod and flannel anomalies. The methods described are simple, fast, and practical.

1) pod troubleshooting

In general, the problem lies in the pod itself. We can locate it by following these steps:

kubectl get pod                        # check for abnormal pods
journalctl -u kubelet -f               # check kubelet for error logs
kubectl logs pod/xxxxx -n kube-system  # check the pod's own logs

2) Example: troubleshooting CrashLoopBackOff and OOMKilled exceptions
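The triage sequence above can be sketched as a few commands run in order; the pod name and namespace below are placeholders taken from the example later in this article.

```shell
# Triage a misbehaving pod step by step (POD and NS are placeholders).
POD=kube-flannel-ds-arm64-7cr2b
NS=kube-system

kubectl get pod -n "$NS"                 # 1. any pod not Running/Ready?
journalctl -u kubelet --no-pager -n 50   # 2. recent kubelet errors on this node
kubectl logs "pod/$POD" -n "$NS"         # 3. the container's own logs
```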

1 View node health

[root@k8s-m1 src]# kubectl get node
NAME     STATUS   ROLES    AGE   VERSION
k8s-c1   Ready    <none>   16h   v1.14.2
k8s-m1   Ready    master   17h   v1.14.2

2 Check whether the pod status is normal

[root@k8s-m1 docker]# kubectl get pod -n kube-system
NAME                                 READY   STATUS             RESTARTS   AGE
coredns-fb8b8dccf-5g2cx              1/1     Running            0          2d14h
coredns-fb8b8dccf-c5skq              1/1     Running            0          2d14h
etcd-k8s-master                      1/1     Running            0          2d14h
kube-apiserver-k8s-master            1/1     Running            0          2d14h
kube-controller-manager-k8s-master   1/1     Running            0          2d14h
kube-flannel-ds-arm64-7cr2b          0/1     CrashLoopBackOff   629        2d12h
kube-flannel-ds-arm64-hnsrv          0/1     CrashLoopBackOff   4          2d12h
kube-proxy-ldw8m                     1/1     Running            0          2d14h
kube-proxy-xkfdw                     1/1     Running            0          2d14h
kube-scheduler-k8s-master            1/1     Running            0          2d14h

The network plug-in kube-flannel keeps trying to restart: it sometimes works, sometimes reports CrashLoopBackOff, and sometimes OOMKilled.
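When a pod is stuck in CrashLoopBackOff, two follow-up commands usually narrow things down; the pod name here is the crashing one from the listing above.

```shell
# Events for the crashing pod, including OOMKilled/back-off reasons
kubectl -n kube-system describe pod kube-flannel-ds-arm64-7cr2b

# Logs of the previous, crashed container instance (not the restarting one)
kubectl -n kube-system logs kube-flannel-ds-arm64-7cr2b --previous
```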

3 View kubelet logs

[root@k8s-m1 src]# journalctl -u kubelet -f
December 09 09:12:45 k8s-m1 kubelet[35667]: E1209 09:12:45.895575 35667 pod_workers.go:190] Error syncing pod 2eaa8ef9-1822-11ea-a1d9-70fd45ac3f1f ("kube-flannel-ds-arm64-7cr2b_kube-system(2eaa8ef9-1822-11ea-a1d9-70fd45ac3f1f)"), skipping: failed to "StartContainer" for "kube-flannel" with CrashLoopBackOff: "Back-off 5m0s restarting failed container=kube-flannel pod=kube-flannel-ds-arm64-7cr2b_kube-system(2eaa8ef9-1822-11ea-a1d9-70fd45ac3f1f)"

4 View logs for kube-flannel

[root@k8s-m1 src]# kubectl logs kube-flannel-ds-arm64-88rjz -n kube-system
E1209 01:20:42.527856 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t nat -C POSTROUTING ! -s 10.244.0.0/16 -d 10.244.0.0/16 -j MASQUERADE --random-fully --wait]: exit status -1:
E1209 01:20:46.928502 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t filter -C FORWARD -s 10.244.0.0/16 -j ACCEPT --wait]: exit status -1:
E1209 01:20:52.128049 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: running [/sbin/iptables -t filter -C FORWARD -s 10.244.0.0/16 -j ACCEPT --wait]: exit status -1:
E1209 01:20:52.932263 1 iptables.go:115] Failed to ensure iptables rules: Error checking rule existence: failed to check rule existence: fork/exec /sbin/iptables: cannot allocate memory

At first I suspected an iptables problem, but when I copied the command from iptables.go to the command line, it executed normally. I could not explain this until I noticed that the pod would sometimes report:
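Rerunning the exact check from the log by hand is a quick sanity test (the command and CIDR are copied from the log above). An exit code of 0 or 1 means iptables itself works, so the failure inside the pod points elsewhere.

```shell
# -C only checks whether the rule exists: exit 0 = present, exit 1 = absent.
/sbin/iptables -t filter -C FORWARD -s 10.244.0.0/16 -j ACCEPT --wait
echo "exit code: $?"
```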

kube-flannel-ds-arm64-hnsrv          0/1     OOMKilled          4          2d12h

This suggested that the kube-flannel memory limit was too small. Raising the memory from 50Mi to 200Mi solved the problem directly. The original configuration was:

containers:
- name: kube-flannel
  image: quay.io/coreos/flannel:v0.11.0-amd64
  command:
  - /opt/bin/flanneld
  args:
  - --ip-masq
  - --kube-subnet-mgr
  resources:
    requests:
      cpu: "100m"
      memory: "50Mi"
    limits:
      cpu: "100m"
      memory: "50Mi"
  securityContext:
    privileged: false
    capabilities:
      add: ["NET_ADMIN"]

3) ImagePullBackOff exception resolution

This exception usually occurs for one of two reasons:

The image name is invalid, for example the name is misspelled, or the image does not exist
You specified a tag for the image that does not exist

4) The network plug-in kube-flannel does not start

Generally this is a flannel image download problem. The default image address is quay.io/coreos/flannel, but this address is not directly reachable from networks inside China. In that case, download the image from the quay-mirror.qiniu.com/coreos/flannel address, retag it back to the quay.io name, and then execute:
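The download-and-retag step might look like the following sketch; the v0.11.0-amd64 tag is taken from the manifest shown earlier, so adjust it for your flannel version and architecture.

```shell
# Pull from the domestic mirror, retag to the name the manifest expects,
# then drop the mirror tag so only the quay.io name remains.
docker pull quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
docker tag quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64 quay.io/coreos/flannel:v0.11.0-amd64
docker rmi quay-mirror.qiniu.com/coreos/flannel:v0.11.0-amd64
```

With the image already present locally under its quay.io name, kubelet will not need to pull it.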

kubectl create -f kube-flannel.yml

5) Child node cannot join problem

The master node installed successfully and printed the join command for child nodes, but when the command is run on a child node, the node cannot join, or the command hangs at the shell prompt without joining.

First: check the firewall with systemctl status firewalld.service. The cluster needs network communication between nodes, so if the firewall is running, it is recommended to stop it or add iptables rules that allow the cluster traffic.

Second: check whether the hosts file is configured

Execute cat /etc/hosts to inspect the hosts file, and add the IP and hostname of every node in the cluster. Set each node's hostname in turn with hostnamectl --static set-hostname centos-1.
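The two checks above can be scripted roughly as follows; the IP addresses are placeholders, while the hostnames match the nodes seen earlier in this article.

```shell
# Stop the firewall on every node (or add explicit allow rules instead)
systemctl stop firewalld.service
systemctl disable firewalld.service

# Give this node a stable hostname
hostnamectl --static set-hostname k8s-c1

# Make every node resolvable on every node (placeholder IPs)
cat >> /etc/hosts <<'EOF'
192.168.1.10 k8s-m1
192.168.1.11 k8s-c1
EOF
```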

If the problem is still not solved, analyze the node logs.

6) OCI runtime create failed

December 09 08:56:41 k8s-client1 kubelet[39382]: E1209 08:56:41.691178 39382 kuberuntime_sandbox.go:68] CreatePodSandbox for pod "kube-flannel-ds-arm64-hnsrv_kube-system(2eaafd62-1822-11ea-a1d9-70fd45ac3f1f)" failed: rpc error: code = Unknown desc = failed to start sandbox container for pod "kube-flannel-ds-arm64-hnsrv": Error response from daemon: OCI runtime create failed: systemd cgroup flag passed, but systemd support for managing cgroups is not available: unknown

View the daemon.json file

docker failed to run the image because the systemd cgroup driver was specified in this file:

cat /etc/docker/daemon.json
{
  "registry-mirrors": ["https://registry.docker-cn.co"],
  "exec-opts": ["native.cgroupdriver=systemd"]
}

Remove the line:

"exec-opts": ["native.cgroupdriver=systemd"]

Restart the docker service.
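The restart-and-verify step can be sketched as follows; docker info reports the cgroup driver actually in effect after the change.

```shell
# After editing /etc/docker/daemon.json, reload and restart docker
systemctl daemon-reload
systemctl restart docker

# Verify which cgroup driver docker now uses
docker info | grep -i cgroup
```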

7) kubectl get node does not work on child nodes

The connection to the server localhost:8080 was refused - did you specify the right host or port?

The reason for this problem is that the kubectl command needs to run with the kubernetes-admin configuration.

The solution is as follows: copy the file /etc/kubernetes/admin.conf from the master node to the same directory on the slave node, then configure the environment as prompted at init time:

Your Kubernetes control-plane has been initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config
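The copy step itself is not shown in the prompt; with scp it might look like this, run on the child node, where k8s-m1 is the master's hostname from earlier.

```shell
# Fetch the admin kubeconfig from the master node (requires SSH access as root)
scp root@k8s-m1:/etc/kubernetes/admin.conf /etc/kubernetes/admin.conf
```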

Another solution:

echo "export KUBECONFIG=/etc/kubernetes/admin.conf" >> ~/.bash_profile
source ~/.bash_profile
