How to troubleshoot Kubernetes

2025-07-11 Update. From: SLTechnology News&Howtos

Shulou (Shulou.com), 06/02 report

This article explains how to troubleshoot a failing Kubernetes deployment. It walks through a short, practical sequence of checks that covers Pods, Services, and the Ingress.

Key points

Troubleshooting a Kubernetes deployment takes three steps:

Make sure the Pods are running normally.

Make sure the Service can route traffic to the Pods.

Check that the Ingress is configured correctly.

Visual illustration

First, check that the Pods have been created and are healthy.

Second, if the Pods are healthy, check whether the Service can route traffic to them.

Finally, check the connection between the Service and the Ingress.

Pod troubleshooting

In most cases, the problem lies with the Pod itself. Make sure that your Pods are Running and Ready (the READY column shows every container as ready, for example 1/1).

To check:

kubectl get pods

In the session this article refers to, the last Pod is Running and Ready, while the first two Pods are neither Running nor Ready.
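The check above can also be automated. The following is a minimal sketch, assuming the JSON layout returned by kubectl get pods -o json; the function name unhealthy_pods and the sample data are hypothetical. It uses the coarse status.phase field, which is not identical to the STATUS column that kubectl prints.

```python
def unhealthy_pods(pod_list):
    """Return (name, phase, "ready/total") for Pods that are not Running or not fully Ready."""
    problems = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        phase = status.get("phase", "Unknown")
        total = len(pod.get("spec", {}).get("containers", []))
        ready = sum(1 for c in status.get("containerStatuses", []) if c.get("ready"))
        # Note: Pods of finished Jobs (phase Succeeded) are reported as well.
        if phase != "Running" or ready < total:
            problems.append((pod["metadata"]["name"], phase, f"{ready}/{total}"))
    return problems


# Hypothetical sample shaped like the output of `kubectl get pods -o json`.
sample = {
    "items": [
        {"metadata": {"name": "app-1"},
         "spec": {"containers": [{"name": "app"}]},
         "status": {"phase": "Pending"}},
        {"metadata": {"name": "app-2"},
         "spec": {"containers": [{"name": "app"}]},
         "status": {"phase": "Running",
                    "containerStatuses": [{"name": "app", "ready": False}]}},
        {"metadata": {"name": "app-3"},
         "spec": {"containers": [{"name": "app"}]},
         "status": {"phase": "Running",
                    "containerStatuses": [{"name": "app", "ready": True}]}},
    ]
}

for name, phase, ready in unhealthy_pods(sample):
    print(f"{name}: phase={phase}, ready={ready}")
```

On a real cluster, the sample dictionary would be replaced by json.load(sys.stdin), with the output of kubectl get pods -o json piped into the script.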

Key points

You can use the following commands to troubleshoot a Pod (<pod-name> is a placeholder for the name of your Pod):

kubectl logs <pod-name>

Shows the logs of the Pod's containers.

kubectl describe pod <pod-name>

Shows a list of events related to the Pod.

kubectl get pod <pod-name> -o yaml

Prints the YAML definition of the Pod as stored in the cluster.

kubectl exec -ti <pod-name> -- bash

Opens an interactive terminal inside a container of the Pod.

List of common Pod errors

Pods can fail with a variety of startup and runtime errors.

Startup errors:

ImagePullBackOff, ImageInspectError, ErrImagePull, ErrImageNeverPull, RegistryUnavailable, InvalidImageName

Runtime errors:

CrashLoopBackOff, RunContainerError, KillContainerError, VerifyNonRootError, RunInitContainerError, CreatePodSandboxError, ConfigPodSandboxError, KillPodSandboxError, SetupNetworkError, TeardownNetworkError
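As a rough illustration of how the two groups can be used, the sketch below sorts the waiting reason of each container into "startup" or "runtime". It assumes the Pod layout returned by kubectl get pod -o json; the names classify and waiting_reasons and the sample Pod are hypothetical, and the two sets simply mirror the lists above.

```python
STARTUP_ERRORS = {
    "ImagePullBackOff", "ImageInspectError", "ErrImagePull",
    "ErrImageNeverPull", "RegistryUnavailable", "InvalidImageName",
}
RUNTIME_ERRORS = {
    "CrashLoopBackOff", "RunContainerError", "KillContainerError",
    "VerifyNonRootError", "RunInitContainerError", "CreatePodSandboxError",
    "ConfigPodSandboxError", "KillPodSandboxError", "SetupNetworkError",
    "TeardownNetworkError",
}


def classify(reason):
    """Map an error reason to 'startup', 'runtime', or 'unknown'."""
    if reason in STARTUP_ERRORS:
        return "startup"
    if reason in RUNTIME_ERRORS:
        return "runtime"
    return "unknown"


def waiting_reasons(pod):
    """Collect the waiting reason of every container that is not running."""
    reasons = {}
    for c in pod.get("status", {}).get("containerStatuses", []):
        waiting = c.get("state", {}).get("waiting")
        if waiting:
            reasons[c["name"]] = waiting.get("reason", "")
    return reasons


# Hypothetical Pod with one container stuck pulling its image.
pod = {"status": {"containerStatuses": [
    {"name": "app", "state": {"waiting": {"reason": "ImagePullBackOff"}}},
    {"name": "sidecar", "state": {"running": {}}},
]}}

for container, reason in waiting_reasons(pod).items():
    print(container, reason, classify(reason))
```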

Common errors and how to fix them

ImagePullBackOff

This error appears when Kubernetes cannot pull the image for one of the Pod's containers.

There are three main causes:

The image name is invalid. For example, the name is misspelled, or the image does not exist.

The tag specified for the image does not exist.

The image belongs to a private registry, and Kubernetes has no credentials to access it.

Solution:

The first two cases are solved by correcting the image name and tag.

For the third, add the registry credentials to a Secret and reference that Secret in the Pod.

The official documentation includes an example of how to do this.
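For the private-registry case, a Pod references its credentials through the imagePullSecrets field. The sketch below builds such a manifest as JSON, which kubectl apply accepts as well as YAML. The helper pod_with_pull_secret, the Pod name, the image, and the Secret name regcred are all hypothetical, and the Secret is assumed to exist already (for example, created with kubectl create secret docker-registry).

```python
import json


def pod_with_pull_secret(name, image, secret_name):
    """Build a minimal Pod manifest that pulls its image using a registry Secret."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{"name": name, "image": image}],
            # Name of an existing Secret of type kubernetes.io/dockerconfigjson.
            "imagePullSecrets": [{"name": secret_name}],
        },
    }


manifest = pod_with_pull_secret(
    "private-app", "registry.example.com/team/app:1.0", "regcred"
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f.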

CrashLoopBackOff

If a container cannot start, Kubernetes reports the CrashLoopBackOff status.

Typically, a container cannot start in the following situations:

An error in the application prevents it from starting.

The container is misconfigured.

The liveness probe failed too many times.

Solution:

Check the container logs for the detailed reason for the failure. If the container restarts too quickly to read them, print the logs of the previous container instance (<pod-name> is a placeholder):

kubectl logs <pod-name> --previous
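Alongside the logs, the Pod status records how the previous container instance ended. A minimal sketch, assuming the Pod layout returned by kubectl get pod -o json; the function crash_summary and the sample values are hypothetical.

```python
def crash_summary(pod):
    """Summarize restart count and last exit status for containers that have restarted."""
    summary = []
    for c in pod.get("status", {}).get("containerStatuses", []):
        terminated = c.get("lastState", {}).get("terminated")
        if c.get("restartCount", 0) > 0 and terminated:
            summary.append({
                "container": c["name"],
                "restarts": c["restartCount"],
                "exit_code": terminated.get("exitCode"),
                "reason": terminated.get("reason"),
            })
    return summary


# Hypothetical Pod whose only container has crashed five times.
pod = {"status": {"containerStatuses": [{
    "name": "app",
    "restartCount": 5,
    "state": {"waiting": {"reason": "CrashLoopBackOff"}},
    "lastState": {"terminated": {"exitCode": 1, "reason": "Error"}},
}]}}

for entry in crash_summary(pod):
    print(entry)
```

A non-zero exit code usually points at the application or its configuration, so the container logs are the next place to look.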

RunContainerError

This error appears when the container cannot start, before the application inside the container has even begun to run.

It is usually caused by a misconfiguration, such as:

Mounting a volume that does not exist, such as a ConfigMap or Secret.

Mounting a read-only volume as read-write.

Solution:

Use kubectl describe pod <pod-name> to collect and analyze the details of this error (<pod-name> is a placeholder).
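The missing-volume case can be cross-checked offline. The sketch below compares the ConfigMap and Secret volumes of a Pod spec against the names that exist in the namespace. The function missing_volume_sources and all object names are hypothetical; on a real cluster, the name sets would come from kubectl get configmaps and kubectl get secrets.

```python
def missing_volume_sources(pod, configmaps, secrets):
    """Return (kind, name) for ConfigMap/Secret volumes that reference missing objects."""
    missing = []
    for vol in pod.get("spec", {}).get("volumes", []):
        cm = vol.get("configMap")
        if cm and not cm.get("optional") and cm.get("name") not in configmaps:
            missing.append(("ConfigMap", cm.get("name")))
        sec = vol.get("secret")
        if sec and not sec.get("optional") and sec.get("secretName") not in secrets:
            missing.append(("Secret", sec.get("secretName")))
    return missing


# Hypothetical Pod spec with one valid and one dangling reference.
pod = {"spec": {"volumes": [
    {"name": "config", "configMap": {"name": "app-config"}},
    {"name": "tls", "secret": {"secretName": "app-tls"}},
]}}

print(missing_volume_sources(pod, configmaps={"app-config"}, secrets=set()))
```

Volumes marked optional are skipped, because a missing optional ConfigMap or Secret does not block the container from starting.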

Pod is in the Pending state

After you create a Pod, it stays in the Pending state. The most likely causes are:

The cluster does not have enough resources (such as CPU and memory) to run the Pod.

The current namespace has a ResourceQuota object, and creating the Pod would push the namespace over the quota.

The Pod is bound to a pending PersistentVolumeClaim.

Solution:

Inspect the Events section in the output of kubectl describe (<pod-name> is a placeholder):

kubectl describe pod <pod-name>

For errors caused by ResourceQuotas, inspect the cluster events sorted by creation time:

kubectl get events --sort-by=.metadata.creationTimestamp
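The event list can be long, so it may help to filter it. The sketch below keeps only events whose reason suggests a scheduling or quota problem and orders them by creation time. It assumes the layout of kubectl get events -o json; the function recent_failures, the default reasons, and the sample messages are illustrative assumptions rather than an exhaustive list.

```python
def recent_failures(event_list, reasons=("FailedScheduling", "FailedCreate")):
    """Return (timestamp, reason, message) for matching events, oldest first."""
    events = [e for e in event_list.get("items", []) if e.get("reason") in reasons]
    # RFC 3339 timestamps in UTC sort correctly as plain strings.
    events.sort(key=lambda e: e["metadata"]["creationTimestamp"])
    return [
        (e["metadata"]["creationTimestamp"], e["reason"], e.get("message", ""))
        for e in events
    ]


# Hypothetical events shaped like the output of `kubectl get events -o json`.
events = {"items": [
    {"metadata": {"creationTimestamp": "2025-07-11T10:05:00Z"},
     "reason": "FailedCreate", "message": "exceeded quota: compute-quota"},
    {"metadata": {"creationTimestamp": "2025-07-11T10:01:00Z"},
     "reason": "FailedScheduling", "message": "0/3 nodes are available: 3 Insufficient cpu."},
    {"metadata": {"creationTimestamp": "2025-07-11T10:03:00Z"},
     "reason": "Pulled", "message": "Successfully pulled image"},
]}

for stamp, reason, message in recent_failures(events):
    print(stamp, reason, message)
```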

Pod is not ready

If a Pod is Running but not Ready, its readiness probe is failing.

While the readiness probe fails, the Pod is not attached to the Service, and no traffic is forwarded to that instance.

Solution:

A failing readiness probe is an application-specific error, so inspect the Events section in the output of kubectl describe to identify the cause.
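Besides the events, the Pod status carries readiness conditions that name the containers that are not ready. A minimal sketch, assuming the Pod layout returned by kubectl get pod -o json; the function readiness_report and the sample Pod are hypothetical.

```python
def readiness_report(pod):
    """Return the message of each readiness-related condition that is not True."""
    report = {}
    for cond in pod.get("status", {}).get("conditions", []):
        if cond.get("type") in ("Ready", "ContainersReady") and cond.get("status") != "True":
            report[cond["type"]] = cond.get("message") or cond.get("reason", "")
    return report


# Hypothetical Pod that is scheduled and running but not ready.
pod = {"status": {"conditions": [
    {"type": "PodScheduled", "status": "True"},
    {"type": "Ready", "status": "False",
     "reason": "ContainersNotReady",
     "message": "containers with unready status: [app]"},
]}}

print(readiness_report(pod))
```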

Service troubleshooting

If your Pods are Running and Ready but you still cannot get a response from the application, check that the Service is configured correctly.

Key points

A Service routes traffic to Pods based on their labels. So first check how many Pods the Service targets by examining its Endpoints (<service-name> is a placeholder):

kubectl describe service <service-name> | grep Endpoints

An endpoint is an <IP address:port> pair, and there should be at least one whenever the Service targets at least one Pod.

If the Endpoints section is empty, there are two possible explanations:

No Pod with the correct label is running. Also check that the Pods are in the same namespace as the Service.

There is a typo in the selector labels of the Service.
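Both explanations come down to how a selector is matched against Pod labels: every key/value pair in the selector must appear in the Pod's labels, and the Pod must be in the Service's namespace. The sketch below reproduces that rule on plain dictionaries; the function names and sample objects are hypothetical.

```python
def selector_matches(selector, labels):
    """True when every key/value pair of a non-empty selector is present in labels."""
    return bool(selector) and all(labels.get(k) == v for k, v in selector.items())


def selected_pods(service, pods):
    """Return the names of Pods in the Service's namespace that match its selector."""
    selector = service.get("spec", {}).get("selector") or {}
    namespace = service["metadata"].get("namespace", "default")
    return [
        p["metadata"]["name"]
        for p in pods
        if p["metadata"].get("namespace", "default") == namespace
        and selector_matches(selector, p["metadata"].get("labels", {}))
    ]


# Hypothetical objects: the second Service has a typo in its selector.
pods = [
    {"metadata": {"name": "web-1", "namespace": "default", "labels": {"app": "web"}}},
    {"metadata": {"name": "web-2", "namespace": "staging", "labels": {"app": "web"}}},
]
good = {"metadata": {"name": "web", "namespace": "default"}, "spec": {"selector": {"app": "web"}}}
typo = {"metadata": {"name": "web", "namespace": "default"}, "spec": {"selector": {"app": "wbe"}}}

print(selected_pods(good, pods))
print(selected_pods(typo, pods))
```

An empty result for a Service corresponds to an empty Endpoints section.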

If you see a list of endpoints but still cannot reach the application, the most likely cause is a misconfigured targetPort in the Service.

You can test the Service by connecting to it with kubectl port-forward (<service-name> is a placeholder):

kubectl port-forward service/<service-name> 3000:80
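A targetPort mismatch can sometimes be spotted by comparing the Service port with the ports the Pod declares. The sketch below resolves a numeric or named targetPort against the containerPort entries of a Pod. The function resolve_target_port and the sample objects are hypothetical, and the check is only a hint: a container may listen on a port it does not declare.

```python
def resolve_target_port(service_port, pod):
    """Return the declared containerPort that a Service port targets, or None."""
    # When targetPort is omitted, Kubernetes uses the value of port.
    target = service_port.get("targetPort", service_port.get("port"))
    declared = [
        p for c in pod.get("spec", {}).get("containers", [])
        for p in c.get("ports", [])
    ]
    if isinstance(target, str):
        # A named targetPort refers to the name of a container port.
        for p in declared:
            if p.get("name") == target:
                return p.get("containerPort")
        return None
    return target if any(p.get("containerPort") == target for p in declared) else None


# Hypothetical Pod that declares a single port named "http".
pod = {"spec": {"containers": [
    {"name": "app", "ports": [{"name": "http", "containerPort": 8080}]},
]}}

print(resolve_target_port({"port": 80, "targetPort": "http"}, pod))
print(resolve_target_port({"port": 80, "targetPort": 3000}, pod))
```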

Ingress troubleshooting

If the Pods are running normally and the Service can route traffic to them, but the application is still unreachable, the Ingress is probably misconfigured.

Depending on the Ingress controller in use, you need to follow the debugging procedure specific to that controller.

Key points

Check that the Ingress fields serviceName and servicePort (service.name and service.port in the networking.k8s.io/v1 API) are set correctly. You can inspect them with the following command (<ingress-name> is a placeholder):

kubectl describe ingress <ingress-name>

If the Backend column is empty, there must be an error in the configuration.
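An empty backend usually means the Ingress points at a Service name or port that does not exist. The sketch below checks each path of an Ingress against a map of known Services. It assumes the networking.k8s.io/v1 layout (backend.service.name and backend.service.port); the function broken_backends and the sample objects are hypothetical.

```python
def broken_backends(ingress, services):
    """Return (path, problem) for backends that reference a missing Service or port.

    services maps a Service name to its list of port entries (spec.ports).
    """
    problems = []
    for rule in ingress.get("spec", {}).get("rules", []):
        for path in rule.get("http", {}).get("paths", []):
            backend = path.get("backend", {}).get("service", {})
            name = backend.get("name")
            port = backend.get("port", {})
            wanted = port.get("number", port.get("name"))
            if name not in services:
                problems.append((path.get("path"), f"Service {name!r} not found"))
                continue
            known = set()
            for p in services[name]:
                known.update(v for v in (p.get("port"), p.get("name")) if v is not None)
            if wanted not in known:
                problems.append((path.get("path"), f"Service {name!r} has no port {wanted!r}"))
    return problems


# Hypothetical Ingress with two paths; only the "app" Service exists.
ingress = {"spec": {"rules": [{"http": {"paths": [
    {"path": "/", "backend": {"service": {"name": "app", "port": {"number": 80}}}},
    {"path": "/api", "backend": {"service": {"name": "api", "port": {"number": 8080}}}},
]}}]}}

print(broken_backends(ingress, {"app": [{"port": 80}]}))
```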

If the Backend column is populated but you still cannot reach the application, the problem is likely one of the following:

How the Ingress is exposed to the public internet.

How the cluster is exposed to the public internet.

You can separate infrastructure problems from Ingress problems by connecting directly to the Ingress controller Pod.

First, list the Pods to find the Ingress controller Pod:

kubectl get pods --all-namespaces

Second, use kubectl describe to find the port it listens on:

kubectl describe pod nginx-ingress-controller-6fc5bcc --namespace kube-system

Finally, connect to the Pod:

kubectl port-forward nginx-ingress-controller-6fc5bcc 3000:80 --namespace kube-system

Now every request to port 3000 on your computer is forwarded to port 80 on the Pod. Does the application work now?

If it works, the problem is in the infrastructure. Investigate how traffic is routed to the cluster.

If it does not work, the problem is in the Ingress controller, and you should debug it. Common Ingress controllers include Nginx, HAProxy, and Traefik; consult the documentation of your controller for troubleshooting guidance. Here we take Nginx as an example:

Troubleshoot the Nginx controller

The ingress-nginx project provides an official kubectl plugin. You can use kubectl ingress-nginx to:

View logs, backends, certificates, and so on.

Connect to the Ingress.

Inspect the current configuration.

The corresponding commands are:

kubectl ingress-nginx lint: checks the Ingress configuration for possible issues.

kubectl ingress-nginx backends: inspects the backends (similar to kubectl describe ingress <ingress-name>).

kubectl ingress-nginx logs: shows the controller logs.
