2025-01-27 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article explains how to troubleshoot Kubernetes faults, a question that comes up often in day-to-day operations. It collects simple, practical methods for working through the most common problems.
Key points
Troubleshooting a Kubernetes deployment takes three steps:
Make sure the Pods are running normally.
Make sure the Service can route traffic to the Pods.
Check that the Ingress is configured correctly.
Visual illustration
First, check that the Pods have been created and are healthy.
Second, if the Pods are healthy, check whether the Service can route traffic to them.
Finally, check the connection between the Service and the Ingress.
Pod troubleshooting
In most cases, the problem lies in the Pod itself. Make sure your Pods are Running and Ready (the READY column shows 1/1 for a single-container Pod).
To check:
kubectl get pods
In the example session, the last Pod is Running and Ready, while the first two Pods are neither Running nor Ready.
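The check above can be automated with a small filter. This is a minimal sketch, assuming the default column layout of kubectl get pods (NAME, READY, STATUS, ...); the function name not_ready_pods is hypothetical, and the layout shifts by one column if you add --all-namespaces.

```shell
# not_ready_pods: read `kubectl get pods` output on stdin and print the
# name of every Pod that is not Running or whose READY count is short.
# Assumes the default columns: NAME READY STATUS RESTARTS AGE.
not_ready_pods() {
  awk 'NR > 1 {
    split($2, ready, "/")   # READY column looks like "1/1"
    if ($3 != "Running" || ready[1] != ready[2]) print $1
  }'
}

# Usage against a live cluster:
#   kubectl get pods | not_ready_pods
```

An empty result means every listed Pod is Running and fully Ready, so you can move on to the Service.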
Key points
You can use the following commands to troubleshoot a Pod:
kubectl logs <pod-name>
Shows the logs of the Pod's containers.
kubectl describe pod <pod-name>
Shows the list of events associated with the Pod.
kubectl get pod <pod-name> -o yaml
Prints the YAML definition of the Pod as stored in Kubernetes.
kubectl exec -ti <pod-name> -- bash
Opens an interactive terminal inside one of the Pod's containers.
List of common Pod errors
Pods can fail with a variety of startup and runtime errors.
Startup errors:
ImagePullBackOff, ImageInspectError, ErrImagePull, ErrImageNeverPull, RegistryUnavailable, InvalidImageName
Runtime errors:
CrashLoopBackOff, RunContainerError, KillContainerError, VerifyNonRootError, RunInitContainerError, CreatePodSandboxError, ConfigPodSandboxError, KillPodSandboxError, SetupNetworkError, TeardownNetworkError
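The two lists can be encoded as a lookup so a script can tell at a glance which phase a failing Pod is stuck in. This is only a sketch: the function name classify_pod_error is hypothetical, and it recognizes exactly the names listed above. Anything else, including prefixed statuses such as those for init containers, falls through to "unknown".

```shell
# classify_pod_error: map a Pod status reason to "startup", "runtime",
# or "unknown", using the error names listed in this article.
classify_pod_error() {
  case "$1" in
    ImagePullBackOff|ImageInspectError|ErrImagePull|ErrImageNeverPull|RegistryUnavailable|InvalidImageName)
      echo startup ;;
    CrashLoopBackOff|RunContainerError|KillContainerError|VerifyNonRootError|RunInitContainerError|CreatePodSandboxError|ConfigPodSandboxError|KillPodSandboxError|SetupNetworkError|TeardownNetworkError)
      echo runtime ;;
    *)
      echo unknown ;;
  esac
}

# Usage (the STATUS column is the third field of `kubectl get pod`):
#   classify_pod_error "$(kubectl get pod <pod-name> --no-headers | awk '{print $3}')"
```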
Key error codes and their repair methods
ImagePullBackOff
This error occurs when Kubernetes cannot pull the image for one of the Pod's containers.
There are three common causes:
The image name is invalid. For example, the name is misspelled or the image does not exist.
The image tag does not exist.
The image belongs to a private registry, and Kubernetes does not have credentials to access it.
Solution:
The first two cases are fixed by correcting the image name and tag.
For the third, add the registry credentials to a Secret and reference that Secret in the Pod.
The official documentation includes an example of how to do this.
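For the private-registry case, the sketch below shows roughly what "reference the credentials in the Pod" looks like. It assumes a Secret of type docker-registry already exists. The names regcred, private-app and registry.example.com are placeholders, and the helper pod_with_pull_secret is hypothetical, not part of kubectl.

```shell
# pod_with_pull_secret: print a minimal Pod manifest whose
# imagePullSecrets entry points at an existing registry Secret.
# Arguments: <pod-name> <image> <secret-name> (all placeholders).
pod_with_pull_secret() {
  pod_name=$1
  image=$2
  secret_name=$3
  cat <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: ${pod_name}
spec:
  containers:
    - name: app
      image: ${image}
  imagePullSecrets:
    - name: ${secret_name}
EOF
}

# Usage (credentials are placeholders you must supply):
#   kubectl create secret docker-registry regcred \
#     --docker-server=<registry> --docker-username=<user> --docker-password=<password>
#   pod_with_pull_secret private-app registry.example.com/app:1.0 regcred | kubectl apply -f -
```

The Secret must live in the same namespace as the Pod that references it.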
CrashLoopBackOff
If a container repeatedly fails to start, Kubernetes reports CrashLoopBackOff as the Pod status.
Typically, a container cannot start when:
There is an error in the application that prevents it from starting
The container is misconfigured
The liveness probe failed too many times
Solution:
Check the container logs for the detailed reason for the failure. If the container restarts too quickly to catch them, print the logs of the previous container instance:
kubectl logs <pod-name> --previous
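Choosing between the current and the previous logs can be scripted. The following is a sketch under the assumption that the first container in the Pod is the one of interest. The function name crash_logs is hypothetical; the jsonpath expression reads the standard restartCount field from the Pod status.

```shell
# crash_logs: print logs for a Pod, switching to the previous container
# instance when the first container has restarted at least once.
crash_logs() {
  pod=$1
  restarts=$(kubectl get pod "$pod" \
    -o jsonpath='{.status.containerStatuses[0].restartCount}')
  if [ "${restarts:-0}" -gt 0 ]; then
    kubectl logs "$pod" --previous   # logs from the instance that crashed
  else
    kubectl logs "$pod"              # no restart yet: current logs
  fi
}

# Usage:
#   crash_logs <pod-name>
```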
RunContainerError
This error occurs when the container cannot start at all, before the application inside the container even begins to run.
It is usually caused by misconfiguration, such as:
Mounting a volume that does not exist, such as a ConfigMap or Secret
Mounting a read-only volume as read-write
Solution:
Use kubectl describe pod <pod-name> to collect and analyze the errors.
Pod is in the Pending state
After a Pod is created, it may stay in the Pending state. The most likely causes are:
The cluster does not have enough resources (such as CPU and memory) to run the Pod
The current namespace has a ResourceQuota object, and creating the Pod would push the namespace over the quota
The Pod is bound to a pending PersistentVolumeClaim
Solution:
Check the Events section in the output of kubectl describe:
kubectl describe pod <pod-name>
For errors caused by a ResourceQuota, inspect the cluster events:
kubectl get events --sort-by=.metadata.creationTimestamp
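On a busy cluster the event list is long, so it can help to keep only warnings. A minimal sketch, assuming the events of interest are of type Warning: the wrapper name warning_events is hypothetical, and the namespace defaults to "default" when none is given.

```shell
# warning_events: list Warning events in one namespace, oldest first.
# Argument: <namespace> (optional, defaults to "default").
warning_events() {
  kubectl get events \
    --namespace "${1:-default}" \
    --field-selector type=Warning \
    --sort-by=.metadata.creationTimestamp
}

# Usage:
#   warning_events <namespace>
```

Quota failures are usually reported against the controller that tried to create the Pod, so filtering by type rather than by Pod name avoids missing them.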
Pod is not Ready
If a Pod is Running but not Ready, its readiness probe is failing.
While the readiness probe fails, the Pod is not attached to the Service and no traffic is forwarded to that instance.
Solution:
A failing readiness probe is an application-specific error, so check the Events section of kubectl describe pod to identify the cause.
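Readiness can also be read directly from the Pod's status conditions. This sketch assumes the standard Ready condition in .status.conditions. The function name pod_is_ready is hypothetical.

```shell
# pod_is_ready: succeed (exit 0) only when the Pod's Ready condition
# is "True"; fail otherwise.
pod_is_ready() {
  status=$(kubectl get pod "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')
  [ "$status" = "True" ]
}

# Usage: fall back to the event list only when the Pod is not Ready.
#   pod_is_ready <pod-name> || kubectl describe pod <pod-name>
```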
Service troubleshooting
If your Pods are Running and Ready but you still cannot get a response from the application, check that the Service is configured correctly.
Key points
A Service routes traffic to Pods based on their labels. So first check how many Pods the Service targets, which you can see from its Endpoints:
kubectl describe service <service-name> | grep Endpoints
An endpoint is an <IP address:port> pair, and there should be at least one whenever the Service targets at least one Pod.
If the Endpoints section is empty, there are two possible explanations:
No Pod with the correct label is running. Also check that you are in the correct namespace.
There is a typo in the selector labels of the Service.
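Both explanations come down to the selector not matching any Pod's labels. The sketch below illustrates that matching rule on plain strings: every key=value pair in the selector must appear among the Pod's labels. The function name selector_matches is hypothetical, and the comma-separated format mirrors what kubectl get pods --show-labels prints.

```shell
# selector_matches: succeed when every key=value pair in the selector
# is present in the Pod's label list. Both arguments are
# comma-separated, e.g. "app=web,tier=front".
selector_matches() {
  selector=$1
  labels=$2
  old_ifs=$IFS
  IFS=,
  for pair in $selector; do
    case ",$labels," in
      *",$pair,"*) ;;                      # this pair is present
      *) IFS=$old_ifs; return 1 ;;         # missing or misspelled pair
    esac
  done
  IFS=$old_ifs
  return 0
}

# Usage: compare the Service selector with the labels printed by
#   kubectl get pods --show-labels
#   selector_matches "app=web" "app=web,pod-template-hash=abc123"
```

A single misspelled key or value is enough to leave the Service with no endpoints.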
If you can see a list of endpoints but still cannot access the application, the most likely cause is a misconfigured targetPort in the Service.
You can test this by connecting to the Service directly with kubectl port-forward:
kubectl port-forward service/<service-name> 3000:80
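A quick way to spot a targetPort mismatch is to compare the Service's targetPort with the containerPort declared by the Pod. This is a rough sketch with some assumptions: it looks only at the first port of the Service and the first declared port of the first container, it does not resolve named ports, and a container may listen on a port it never declares. The function name check_target_port is hypothetical.

```shell
# check_target_port: compare the first targetPort of a Service with the
# first containerPort declared by a Pod; fail on mismatch.
# Arguments: <service-name> <pod-name>
check_target_port() {
  service=$1
  pod=$2
  target=$(kubectl get service "$service" \
    -o jsonpath='{.spec.ports[0].targetPort}')
  container=$(kubectl get pod "$pod" \
    -o jsonpath='{.spec.containers[0].ports[0].containerPort}')
  if [ "$target" = "$container" ]; then
    echo "ok: targetPort $target matches containerPort"
  else
    echo "mismatch: targetPort=$target containerPort=$container"
    return 1
  fi
}

# Usage:
#   check_target_port <service-name> <pod-name>
```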
Ingress troubleshooting
If the Pods are running normally and the Service can route traffic to them, the problem is probably a misconfigured Ingress.
The Ingress controller in use may differ from cluster to cluster, so the debugging method depends on the specific controller.
Key points
Check that the serviceName and servicePort parameters of the Ingress are configured correctly (in the networking.k8s.io/v1 API these fields are service.name and service.port). You can check with the following command:
kubectl describe ingress <ingress-name>
If the Backend column is empty, there must be an error in the configuration.
If you can see the endpoints in the Backend column but still cannot access the application, the problem is likely one of the following:
How the Ingress is exposed to the public internet
How the cluster is exposed to the public internet
You can isolate infrastructure problems from Ingress problems by connecting directly to the Ingress Pod.
First, list the Pods to find the Ingress controller Pod:
kubectl get pods --all-namespaces
Second, use kubectl describe to find its ports:
kubectl describe pod nginx-ingress-controller-6fc5bcc --namespace kube-system
Finally, connect to the Pod:
kubectl port-forward nginx-ingress-controller-6fc5bcc 3000:80 --namespace kube-system
Now every request to port 3000 on your computer is forwarded to port 80 on the Pod. Does the application work now?
If it does, the problem is in the infrastructure. Check how traffic is routed to the cluster.
If it does not, the problem is in the Ingress controller, which you should debug. Common Ingress controllers include Nginx, HAProxy, and Traefik; consult the documentation of the specific controller for troubleshooting. Here we take Nginx as an example:
Troubleshooting the Nginx controller
The ingress-nginx project provides an official plugin for kubectl. You can use kubectl ingress-nginx to:
View logs, backends, certificates, etc.
Connect to the Ingress
Check the current configuration
The corresponding commands are:
kubectl ingress-nginx lint: checks the Ingress-related resources for possible problems (kubectl ingress-nginx conf prints the generated nginx.conf)
kubectl ingress-nginx backends: inspects the backends (similar to kubectl describe ingress <ingress-name>)
kubectl ingress-nginx logs: shows the controller logs
That concludes this walkthrough of how to troubleshoot Kubernetes faults. Theory sticks best when paired with practice, so try these steps on a cluster of your own.
© 2024 shulou.com SLNews company. All rights reserved.