Taints and Tolerations
Node affinity (see the node affinity documentation) is a property of pods that attracts them to a set of nodes, either as a preference or a hard requirement. Taints are the opposite: they allow a node to repel a set of pods.
Taints and tolerations work together to ensure that pods are not scheduled onto inappropriate nodes. One or more taints can be applied to a node, which marks that the node should not accept any pods that do not tolerate those taints. Tolerations are applied to pods and allow (but do not require) the pods to be scheduled onto nodes with matching taints.
Concept
Example use cases
Taint-based eviction
Add taints based on node conditions
Concept
You can add a taint to a node using the kubectl taint command. For example,
kubectl taint nodes node1 key=value:NoSchedule
adds a taint to node node1. The taint has key key, value value, and effect NoSchedule. This means that no pod can be scheduled onto node1 unless it has a matching toleration.
To remove the taint added by the above command, you can run:
kubectl taint nodes node1 key:NoSchedule-
You specify tolerations for a pod in the PodSpec. Both of the following tolerations "match" the taint created by the kubectl taint command above, so a pod with either of them can be scheduled onto node1:
tolerations:
- key: "key"
  operator: "Equal"
  value: "value"
  effect: "NoSchedule"

tolerations:
- key: "key"
  operator: "Exists"
  effect: "NoSchedule"
A toleration "matches" a taint if they have the same key and effect, and:
the operator is Exists (in which case no value should be specified), or
the operator is Equal and the values are equal.
Note:
There are two special cases:
If a toleration has an empty key and the operator Exists, it matches any key, value, and effect, that is, it tolerates every taint.
tolerations:
- operator: "Exists"
If a toleration has an empty effect, it matches taints with the same key and any effect.
tolerations:
- key: "key"
  operator: "Exists"
The example above uses the effect value NoSchedule. Alternatively, you can use the effect value PreferNoSchedule. This is a "preference" or "soft" version of NoSchedule: the system will try to avoid placing a pod that does not tolerate the taint on the node, but it is not required. The third possible effect value is NoExecute, which is described in more detail below.
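For context, here is a minimal sketch of a complete Pod manifest showing where the tolerations field sits inside the PodSpec (the pod name and image are illustrative and not part of the example above):
apiVersion: v1
kind: Pod
metadata:
  name: toleration-demo        # illustrative name
spec:
  containers:
  - name: app
    image: nginx               # illustrative image
  tolerations:                 # tolerations live directly under spec
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"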
You can put multiple taints on the same node and multiple tolerations on the same pod. Kubernetes processes multiple taints and tolerations like a filter: it starts with all of a node's taints and ignores those for which the pod has a matching toleration; the remaining un-ignored taints have the indicated effects on the pod. In particular:
if there is at least one un-ignored taint with effect NoSchedule, Kubernetes will not schedule the pod onto that node;
if there is no un-ignored taint with effect NoSchedule but there is at least one with effect PreferNoSchedule, Kubernetes will try not to schedule the pod onto that node;
if there is at least one un-ignored taint with effect NoExecute, the pod will not be scheduled onto the node (if it is not yet running there) and will be evicted from the node (if it is already running there).
For example, suppose you add the following taints to a node:
kubectl taint nodes node1 key1=value1:NoSchedule
kubectl taint nodes node1 key1=value1:NoExecute
kubectl taint nodes node1 key2=value2:NoSchedule
And a pod has two tolerations:
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
In this case, the pod will not be scheduled onto the node, because it has no toleration matching the third taint. But if the pod was already running on the node when the taints were added, it can continue to run there, because the third taint is the only one of the three it does not tolerate, and that taint's effect is NoSchedule, which does not evict running pods.
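If you want to confirm which taints are now present on the node, one option (assuming you have kubectl access to the cluster) is to describe the node and look at its Taints field; the exact output format can vary between versions:
# show the Taints section of the node description (three taints in this example)
kubectl describe node node1 | grep -A 2 Taints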
In general, if a taint with effect NoExecute is added to a node, any pod that does not tolerate the taint is evicted immediately, while pods that do tolerate the taint are never evicted. However, a toleration with effect NoExecute can specify an optional tolerationSeconds field, which defines how long the pod may keep running on the node after the taint has been added. For example,
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoExecute"
  tolerationSeconds: 3600
This means that if the pod is running and a matching taint is added to its node, the pod stays bound to the node for 3600 seconds and is then evicted. If the taint is removed before that time, the pod will not be evicted.
Example use cases
Taints and tolerations give you the flexibility to steer pods away from certain nodes or to evict pods from them. A few example use cases:
Dedicated nodes: if you want to dedicate a set of nodes to a particular group of users, you can add a taint to those nodes (for example, kubectl taint nodes nodename dedicated=groupName:NoSchedule) and then add a corresponding toleration to that group's pods (this is easiest to do with a custom admission controller). Pods with the toleration can be scheduled onto the dedicated nodes, but also onto any other node in the cluster. If you want the pods to be scheduled only onto the dedicated nodes, additionally add a label similar to the taint (for example dedicated=groupName) to those nodes, and have the admission controller also add node affinity requiring the pods to run on nodes labeled dedicated=groupName; a sketch follows this paragraph.
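A minimal sketch of this setup, assuming a node named node1 and a group named groupName (both illustrative):
# taint and label the dedicated node
kubectl taint nodes node1 dedicated=groupName:NoSchedule
kubectl label nodes node1 dedicated=groupName
The pods of this group would then carry, in their PodSpec, a matching toleration plus node affinity along these lines:
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "groupName"
  effect: "NoSchedule"
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: dedicated
          operator: In
          values:
          - groupName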
Nodes with special hardware: in a cluster where a small subset of nodes have specialized hardware (such as GPUs), it is desirable to keep pods that do not need the hardware off those nodes, reserving them for later-arriving pods that do. To achieve this, first add a taint to the nodes with the specialized hardware (for example, kubectl taint nodes nodename special=true:NoSchedule or kubectl taint nodes nodename special=true:PreferNoSchedule), and then add a matching toleration to the pods that use the hardware. As with the dedicated nodes use case, the easiest way to apply the toleration is through a custom admission controller. For example, it is recommended to represent the special hardware with Extended Resources, taint the hardware nodes with the extended resource name, and run the ExtendedResourceToleration admission controller. Because the nodes are tainted, pods without the corresponding toleration will not be scheduled onto them. When you create a pod that requests the extended resource, the ExtendedResourceToleration admission controller automatically adds the correct toleration, so the pod can be scheduled onto the special-hardware nodes. This keeps those nodes dedicated to pods that actually need the hardware, without requiring you to add tolerations to those pods by hand (a sketch follows this paragraph).
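As a sketch, assuming GPUs are exposed under the illustrative extended resource name example.com/gpu, the node taint could be applied like this:
# taint the GPU nodes, using the extended resource name as the taint key
kubectl taint nodes node1 example.com/gpu=true:NoSchedule
A pod requesting that extended resource would then need (or, when the ExtendedResourceToleration admission controller is enabled, automatically receive) a toleration along these lines:
tolerations:
- key: "example.com/gpu"
  operator: "Exists"
  effect: "NoSchedule"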
Taint-based evictions (beta feature): a per-pod, configurable eviction behavior for when the node has problems, described in the next section.
Taint-based eviction
As mentioned earlier, a taint with effect NoExecute affects pods that are already running on the node:
pods that do not tolerate the taint are evicted immediately;
pods that tolerate the taint but do not specify tolerationSeconds in their toleration stay bound to the node forever;
pods that tolerate the taint and specify tolerationSeconds stay bound for the specified amount of time.
In addition, starting with Kubernetes 1.6 the node controller can represent node problems as taints (alpha at that stage): when a condition is true, the node controller automatically adds the corresponding taint to the node. The current built-in taints include:
node.kubernetes.io/not-ready: the node is not ready. This corresponds to the node condition Ready being "False".
node.kubernetes.io/unreachable: the node controller cannot reach the node. This corresponds to the node condition Ready being "Unknown".
node.kubernetes.io/out-of-disk: the node has run out of disk space.
node.kubernetes.io/memory-pressure: the node is under memory pressure.
node.kubernetes.io/disk-pressure: the node is under disk pressure.
node.kubernetes.io/network-unavailable: the node's network is unavailable.
node.kubernetes.io/unschedulable: the node is unschedulable.
node.cloudprovider.kubernetes.io/uninitialized: when the kubelet is started with an "external" cloud provider, this taint is set on the node to mark it as unusable. After a controller from the cloud-controller-manager initializes the node, the kubelet removes this taint.
In version 1.13, the TaintBasedEvictions feature was promoted to beta and enabled by default, so these taints are added to nodes automatically, and the older logic of evicting pods directly based on the node's Ready condition is disabled.
Note: to preserve the existing rate limiting of pod evictions caused by node problems, the system actually adds these taints in a rate-limited way. This prevents mass pod evictions in scenarios such as the master losing contact with the nodes.
With this beta feature, combined with tolerationSeconds, a pod can specify how long it stays bound to a node that has one or more of these problems.
For example, an application with a lot of local state may want to stay bound to the node for a long time during a network partition, hoping the partition heals, and thus avoid eviction. The toleration on such a pod might look like this:
tolerations:
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 6000
Note that Kubernetes automatically adds a toleration for node.kubernetes.io/not-ready with tolerationSeconds=300, unless the pod configuration provided by the user already contains a toleration with key node.kubernetes.io/not-ready. Likewise, it adds a toleration for node.kubernetes.io/unreachable with tolerationSeconds=300, unless the pod configuration already contains a toleration with key node.kubernetes.io/unreachable.
These automatically added tolerations ensure that, by default, a pod stays bound to the node for 5 minutes after one of these problems is detected. The two default tolerations are added by the DefaultTolerationSeconds admission controller; see the sketch below.
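In a created pod, these two automatically added tolerations look roughly like this (a sketch based on the defaults described above):
tolerations:
- key: "node.kubernetes.io/not-ready"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300
- key: "node.kubernetes.io/unreachable"
  operator: "Exists"
  effect: "NoExecute"
  tolerationSeconds: 300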
Pods created by a DaemonSet are given NoExecute tolerations for the following taints with no tolerationSeconds:
node.kubernetes.io/unreachable
node.kubernetes.io/not-ready
This ensures that DaemonSet pods are never evicted because of these problems, which matches the behavior when the TaintBasedEvictions feature is disabled.
Add taints based on node conditions
In version 1.12, the TaintNodesByCondition feature was promoted to beta, so the node lifecycle controller automatically creates taints corresponding to node conditions, and the scheduler checks taints rather than node conditions. This ensures that node conditions do not directly affect what is scheduled onto the node; users can choose to ignore some node problems (represented as node conditions) by adding appropriate pod tolerations. Note that TaintNodesByCondition only taints nodes with the NoSchedule effect; the NoExecute effect is controlled by TaintBasedEviction, a beta feature enabled by default since version 1.13.
Since version 1.8, the DaemonSet controller automatically adds the following NoSchedule tolerations to all daemon pods, to prevent DaemonSets from breaking:
node.kubernetes.io/memory-pressure
node.kubernetes.io/disk-pressure
node.kubernetes.io/out-of-disk (critical pods only)
node.kubernetes.io/unschedulable (version 1.10 or later)
node.kubernetes.io/network-unavailable (host network only)
Adding these tolerations ensures backward compatibility. You can also add arbitrary tolerations to a DaemonSet yourself; a sketch follows.
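For instance, a minimal sketch of a DaemonSet that carries an extra toleration of its own (the names and the taint key example.com/maintenance are illustrative):
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-agent                  # illustrative name
spec:
  selector:
    matchLabels:
      app: node-agent
  template:
    metadata:
      labels:
        app: node-agent
    spec:
      tolerations:
      # illustrative extra toleration, in addition to the ones added automatically
      - key: "example.com/maintenance"
        operator: "Exists"
        effect: "NoSchedule"
      containers:
      - name: agent
        image: busybox              # illustrative image
        command: ["sleep", "3600"]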
Source of documentation:
https://kubernetes.io/zh/docs/concepts/configuration/taint-and-toleration/