Analysis and Solutions for Common Problems in Rancher 2.0 Deployment

2025-07-01 Update. From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article analyzes common problems encountered when deploying Rancher 2.0 and gives a solution for each of them.

The problems and solutions below come up when deploying and using Rancher 2.0. Most were collected from user questions and feedback in Rancher's official technical exchange group.

Environment requirements

Recommended operating system

Ubuntu 16.04 (64-bit)

Red Hat Enterprise Linux 7.5 (64-bit)

RancherOS 1.3.0 (64-bit)

Recommended hardware configuration

Supported Docker versions

1.12.6

1.13.1

17.03.2

Please allow the following ports through the firewall.

Common problems and troubleshooting approaches

Residual environment information

Most problems in current deployments are caused by the operating system of the deployment environment, or by information left over from repeated deployments and upgrades.

Before or during deployment, use the following commands to clean up the leftover environment information:
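The cleanup commands themselves did not survive in this copy of the article. As a minimal sketch, assuming a node that runs Docker and uses the usual default data directories (the paths, the flannel.1 interface name, and the function name rancher_cleanup_plan are assumptions, not taken from the original), the function below only prints a cleanup plan so that it can be reviewed before anything is deleted:

```shell
#!/bin/sh
# Print (do not run) a cleanup plan for a node left over from earlier deployments.
# All paths are assumed defaults; adjust them to match the actual environment.
rancher_cleanup_plan() {
    cat <<'EOF'
docker ps -aq | xargs -r docker rm -f
docker volume ls -q | xargs -r docker volume rm
mount | awk '$3 ~ "^/var/lib/kubelet" {print $3}' | xargs -r umount
rm -rf /var/lib/kubelet /etc/kubernetes /var/lib/rancher /var/lib/etcd /var/lib/cni
iptables -F && iptables -t nat -F
ip link delete flannel.1
EOF
}

# Show the plan for review.
rancher_cleanup_plan
```

After checking the output on a node that really should be wiped, the plan could be run as root with rancher_cleanup_plan | sh. Note that it removes every container and volume on the host, not only those created by Rancher.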

OpenSSH version is too low

CentOS and RHEL systems earlier than version 7.4 ship older OpenSSH and OpenSSL packages, and Red Hat's sshd has AllowTcpForwarding turned off by default, so the following problem occurs when deploying with RKE:

Refer to issue:

https://github.com/rancher/rke/issues/93

You need to do the following:

Make sure your OpenSSH version is 7.x or later.

Modify the sshd configuration to enable AllowTcpForwarding, then restart sshd.

By default, CentOS and RHEL cannot use the root user for the SSH tunnel, so you need to use a normal user,

and add this user to the docker group: useradd -G docker yourusername.
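A quick way to verify the sshd setting before running RKE is sketched below. The helper name tcp_forwarding_enabled is hypothetical, and the check is deliberately simple: it looks only at the first uncommented AllowTcpForwarding line in the given file and ignores Match blocks and included files.

```shell
#!/bin/sh
# tcp_forwarding_enabled [FILE]: succeed only if FILE explicitly allows TCP forwarding.
# sshd keywords are case-insensitive and the first value found wins, so only
# the first uncommented AllowTcpForwarding line is inspected.
tcp_forwarding_enabled() {
    config="${1:-/etc/ssh/sshd_config}"
    value="$(awk 'tolower($1) == "allowtcpforwarding" {print $2; exit}' "$config")"
    [ "$value" = "yes" ] || [ "$value" = "all" ]
}

# Example:
#   tcp_forwarding_enabled /etc/ssh/sshd_config || echo "enable AllowTcpForwarding and restart sshd"
```

For the configuration that sshd actually applies, running sshd -T as root and looking for the allowtcpforwarding line is likely the more reliable check.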

The NodePort is reachable on only one machine

The NodePort can be reached only on the host that runs the pod. This is mainly caused by network problems between the cluster hosts or by the local firewall. Troubleshoot as follows:

1. On the host, run telnet localhost <nodeport> to check whether the port can be reached, and also telnet between the hosts within the cluster. If the hosts cannot reach each other, the problem lies with the network of the deployment environment; contact the network administrator for troubleshooting.

If the local telnet does not work, run the following tests.

2. First, get the information for the corresponding pod.

For example, my pod test-6b4cdf4ccb-7pzt6 is on the rancher-kf-worker01 node, and its IP is 10.42.3.23.

3. First ping the pod IP from the host where the pod is located, then check whether it can be pinged from the other nodes. In canal network mode, check whether port 8472/UDP is open in the firewall. Check whether each machine has a flannel.1 network interface; if so, ping between the machines using the IPs on flannel.1, because both the flannel and canal networks establish VXLAN tunnels between hosts through the flannel.1 interface. It is recommended to run these tests with the firewall turned off.
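The ping checks in step 3 can be scripted. The sketch below is an assumption-laden helper (the name check_peers and the PING_CMD override are hypothetical) that pings each address once and reports the ones that do not answer. The -W timeout option assumes the Linux iputils version of ping.

```shell
#!/bin/sh
# check_peers IP...: ping each address once and report which ones fail.
# PING_CMD can be overridden, for example for testing without network access.
check_peers() {
    failed=0
    for ip in "$@"; do
        if ${PING_CMD:-ping} -c 1 -W 2 "$ip" >/dev/null 2>&1; then
            echo "ok   $ip"
        else
            echo "FAIL $ip"
            failed=1
        fi
    done
    return "$failed"
}

# Example: pass the pod IP, or the flannel.1 addresses of the other hosts.
# The local flannel.1 address can be read with: ip -4 addr show flannel.1
#   check_peers 10.42.3.23
```

Running it once from the host that runs the pod and once from every other node shows quickly whether the failure is local to one host or affects the whole overlay network.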

Deployment fails when using the calico network

When deploying Rancher 2.0 with the calico network type, leaving the cloud provider field empty makes it default to a public cloud, which causes the deployment to fail. Manually enter none as the cloud provider. (This item will be optimized later.)

"Host not found" problems during deployment

This problem occurs because the host's hostname does not meet the hostname requirements of Kubernetes or the standard Linux hostname rules. In particular, the hostname must not contain underscores.
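Hostnames can be checked before the nodes are added. The sketch below assumes the lowercase RFC 1123 subdomain rule that Kubernetes applies to object names (lowercase letters, digits, "-" and ".", starting and ending with a letter or digit, at most 253 characters). The function name valid_node_hostname is hypothetical.

```shell
#!/bin/sh
# valid_node_hostname NAME: succeed if NAME is a lowercase RFC 1123 subdomain.
# Underscores, leading or trailing hyphens, and empty names are rejected.
valid_node_hostname() {
    name="$1"
    [ "${#name}" -le 253 ] || return 1
    printf '%s\n' "$name" |
        grep -Eq '^[a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*$'
}

# Example:
#   valid_node_hostname "$(hostname)" || echo "rename this host before deploying"
```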

"Forbidden" error when getting component health status

In most cases this is caused by certificates left over from repeated deployments. The solution is to clean the environment using the method in the residual environment information section above, and then add it again.

The web page kubectl closes unexpectedly

This is mainly related to the operating system version and the browser version. Use one of the operating systems recommended above, and use Chrome as the browser.

Pods are still scheduled onto non-worker nodes

Currently, in Rancher 2.0, pods are still scheduled onto non-worker nodes. You can manually exclude those nodes from scheduling by tainting them, as follows:

After getting the names of the nodes in the Kubernetes cluster, open the web page kubectl and execute:

kubectl taint node rancher-kf-control01 node-role.kubernetes.io/rancher-kf-control01="":NoSchedule
kubectl taint node rancher-kf-control02 node-role.kubernetes.io/rancher-kf-control02="":NoSchedule
kubectl taint node rancher-kf-control03 node-role.kubernetes.io/rancher-kf-control03="":NoSchedule

"is not a shared mount" problem

When you encounter a shared mount problem during deployment, the error message is as follows:

FATA[0180] [workerPlane] Failed to bring up Worker Plane: Failed to start [kubelet] container on host [192.168.10.51]: Error response from daemon: linux mounts: Path /var/lib/kubelet is mounted on / but it is not a shared mount.

The main cause is that kubelet is deployed in a container, which requires manually clearing docker's MountFlags setting or setting it to shared.

https://github.com/kubernetes/kubernetes/issues/4869#issuecomment-195696990

Solution:

Execute

mount --make-shared /

Or configure docker.service with

MountFlags=shared

and then restart docker.service.
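One way to apply the MountFlags setting without editing the packaged unit file is a systemd drop-in. The sketch below assumes Docker is managed by systemd as docker.service; the function name write_mountflags_dropin and the file name mount-flags.conf are hypothetical.

```shell
#!/bin/sh
# write_mountflags_dropin [DIR]: write a systemd drop-in that sets MountFlags=shared.
# DIR defaults to the standard drop-in directory for docker.service.
write_mountflags_dropin() {
    dir="${1:-/etc/systemd/system/docker.service.d}"
    mkdir -p "$dir" || return 1
    printf '[Service]\nMountFlags=shared\n' > "$dir/mount-flags.conf"
}

# As root, after writing the drop-in to the default directory:
#   systemctl daemon-reload && systemctl restart docker
```

A drop-in survives package upgrades that would overwrite direct edits to docker.service, which is the reason for choosing it in this sketch.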

NetworkReady=false problem

This usually happens because the network components are still initializing during deployment; wait for a while. You can also view the kubelet log on the corresponding node with docker logs kubelet.

Cluster unavailable

This is usually caused by a connection problem between rancher-server and port 6443 of kube-apiserver in the Kubernetes cluster. It is recommended to check the firewall and the kube-apiserver log.
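A basic reachability test for port 6443 can be run from the machine that hosts rancher-server. The sketch below assumes bash (built with /dev/tcp support) and the coreutils timeout command are available; the function name port_open is hypothetical.

```shell
#!/bin/sh
# port_open HOST PORT: succeed if a TCP connection to HOST:PORT opens within 3 seconds.
# Uses bash's /dev/tcp redirection, so bash must be installed even if the caller is sh.
port_open() {
    [ "$#" -eq 2 ] || { echo "usage: port_open HOST PORT" >&2; return 2; }
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Example, using the host address from the error message earlier in this article:
#   port_open 192.168.10.51 6443 || echo "kube-apiserver port is not reachable"
```

A successful connection only shows that the port is open; certificate or authentication errors would still need to be diagnosed from the kube-apiserver log.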

Summary

1. Deploying strictly with the officially recommended operating system and Docker versions avoids many problems.

2. When deploying or upgrading an environment repeatedly, clean it up with the commands from the residual environment information section.

3. If you encounter a problem, use docker logs to check the logs of rancher-agent and rancher-server.

