2025-03-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
This article explains how the DaemonSet Controller treats Critical Pods specially: which tolerations it adds to them, which predicates it skips for them, and how PriorityClass validation handles the system-critical priority classes. The code excerpts are taken from the Kubernetes source tree.
Special treatment of CriticalPod by DaemonSet Controller
When DaemonSetController decides whether a DaemonSet Pod should run on a node, it calls DaemonSetsController.simulate and inspects the returned PredicateFailureReason list.
pkg/controller/daemon/daemon_controller.go:1206

    func (dsc *DaemonSetsController) simulate(newPod *v1.Pod, node *v1.Node, ds *apps.DaemonSet) ([]algorithm.PredicateFailureReason, *schedulercache.NodeInfo, error) {
        // DaemonSet pods shouldn't be deleted by NodeController in case of node problems.
        // Add infinite toleration for taint notReady:NoExecute here
        // to survive taint-based eviction enforced by NodeController
        // when node turns not ready.
        v1helper.AddOrUpdateTolerationInPod(newPod, &v1.Toleration{
            Key:      algorithm.TaintNodeNotReady,
            Operator: v1.TolerationOpExists,
            Effect:   v1.TaintEffectNoExecute,
        })

        // DaemonSet pods shouldn't be deleted by NodeController in case of node problems.
        // Add infinite toleration for taint unreachable:NoExecute here
        // to survive taint-based eviction enforced by NodeController
        // when node turns unreachable.
        v1helper.AddOrUpdateTolerationInPod(newPod, &v1.Toleration{
            Key:      algorithm.TaintNodeUnreachable,
            Operator: v1.TolerationOpExists,
            Effect:   v1.TaintEffectNoExecute,
        })

        // According to TaintNodesByCondition, all DaemonSet pods should tolerate
        // MemoryPressure and DiskPressure taints, and the critical pods should tolerate
        // OutOfDisk taint additional.
        v1helper.AddOrUpdateTolerationInPod(newPod, &v1.Toleration{
            Key:      algorithm.TaintNodeDiskPressure,
            Operator: v1.TolerationOpExists,
            Effect:   v1.TaintEffectNoSchedule,
        })

        v1helper.AddOrUpdateTolerationInPod(newPod, &v1.Toleration{
            Key:      algorithm.TaintNodeMemoryPressure,
            Operator: v1.TolerationOpExists,
            Effect:   v1.TaintEffectNoSchedule,
        })

        // TODO(#48843) OutOfDisk taints will be removed in 1.10
        if utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation) &&
            kubelettypes.IsCriticalPod(newPod) {
            v1helper.AddOrUpdateTolerationInPod(newPod, &v1.Toleration{
                Key:      algorithm.TaintNodeOutOfDisk,
                Operator: v1.TolerationOpExists,
                Effect:   v1.TaintEffectNoSchedule,
            })
        }
        ...
        _, reasons, err := Predicates(newPod, nodeInfo)
        return reasons, nodeInfo, err
    }
DaemonSetController adds the following tolerations to the Pod. The NoExecute tolerations keep the Pod from being killed by the Node Controller's taint-based eviction when the node enters those conditions; the NoSchedule tolerations let the Pod still be placed on a node that carries those taints.
NotReady:NoExecute
Unreachable:NoExecute
MemoryPressure:NoSchedule
DiskPressure:NoSchedule
When the ExperimentalCriticalPodAnnotation feature gate is enabled and the Pod is a CriticalPod, an OutOfDisk:NoSchedule toleration is also added to the Pod.
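The toleration logic above can be sketched without any Kubernetes dependency. This is a minimal illustration, not the upstream implementation: the Toleration and Pod types, the short taint keys, and the helper names are hypothetical stand-ins for v1.Toleration, v1.Pod, the algorithm.TaintNode* constants, and v1helper.AddOrUpdateTolerationInPod. Matching on key and effect only is a simplification assumed here.

```go
package main

import "fmt"

// Toleration is a simplified, hypothetical stand-in for v1.Toleration.
type Toleration struct {
	Key      string
	Operator string
	Effect   string
}

// Pod is a hypothetical stand-in for v1.Pod; Critical models the
// result of kubelettypes.IsCriticalPod.
type Pod struct {
	Critical    bool
	Tolerations []Toleration
}

// addOrUpdateToleration replaces an existing toleration with the same key
// and effect, or appends the new one. It reports whether the pod changed.
func addOrUpdateToleration(pod *Pod, t Toleration) bool {
	for i, existing := range pod.Tolerations {
		if existing.Key == t.Key && existing.Effect == t.Effect {
			if existing == t {
				return false // already present, nothing to do
			}
			pod.Tolerations[i] = t
			return true
		}
	}
	pod.Tolerations = append(pod.Tolerations, t)
	return true
}

// addDaemonSetTolerations adds the four tolerations every DaemonSet pod
// gets, plus OutOfDisk for critical pods when the gate is enabled.
// The keys are placeholders, not the real taint key strings.
func addDaemonSetTolerations(pod *Pod, criticalGateEnabled bool) {
	tolerations := []Toleration{
		{Key: "NotReady", Operator: "Exists", Effect: "NoExecute"},
		{Key: "Unreachable", Operator: "Exists", Effect: "NoExecute"},
		{Key: "DiskPressure", Operator: "Exists", Effect: "NoSchedule"},
		{Key: "MemoryPressure", Operator: "Exists", Effect: "NoSchedule"},
	}
	if criticalGateEnabled && pod.Critical {
		tolerations = append(tolerations,
			Toleration{Key: "OutOfDisk", Operator: "Exists", Effect: "NoSchedule"})
	}
	for _, t := range tolerations {
		addOrUpdateToleration(pod, t)
	}
}

func main() {
	pod := &Pod{Critical: true}
	addDaemonSetTolerations(pod, true)
	for _, t := range pod.Tolerations {
		fmt.Printf("%s:%s\n", t.Key, t.Effect)
	}
}
```

Because the helper is add-or-update rather than append, calling it repeatedly on the same Pod does not grow the toleration list, which matters for a function that runs once per node on every sync.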
simulate also runs Predicates, much as the scheduler does, and Predicates likewise treats a CriticalPod differently.
pkg/controller/daemon/daemon_controller.go:1413

    // Predicates checks if a DaemonSet's pod can be scheduled on a node using GeneralPredicates
    // and PodToleratesNodeTaints predicate
    func Predicates(pod *v1.Pod, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
        var predicateFails []algorithm.PredicateFailureReason

        // If ScheduleDaemonSetPods is enabled, only check nodeSelector and nodeAffinity.
        if false /*disabled for 1.10*/ && utilfeature.DefaultFeatureGate.Enabled(features.ScheduleDaemonSetPods) {
            fit, reasons, err := nodeSelectionPredicates(pod, nil, nodeInfo)
            if err != nil {
                return false, predicateFails, err
            }
            if !fit {
                predicateFails = append(predicateFails, reasons...)
            }

            return len(predicateFails) == 0, predicateFails, nil
        }

        critical := utilfeature.DefaultFeatureGate.Enabled(features.ExperimentalCriticalPodAnnotation) &&
            kubelettypes.IsCriticalPod(pod)

        fit, reasons, err := predicates.PodToleratesNodeTaints(pod, nil, nodeInfo)
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }
        if critical {
            // If the pod is marked as critical and support for critical pod annotations is enabled,
            // check predicates for critical pods only.
            fit, reasons, err = predicates.EssentialPredicates(pod, nil, nodeInfo)
        } else {
            fit, reasons, err = predicates.GeneralPredicates(pod, nil, nodeInfo)
        }
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }

        return len(predicateFails) == 0, predicateFails, nil
    }
For a CriticalPod, Predicates calls predicates.EssentialPredicates; for any other Pod it calls predicates.GeneralPredicates.
What is the difference between GeneralPredicates and EssentialPredicates? GeneralPredicates runs everything in EssentialPredicates plus noncriticalPredicates, which is the scheduler's PodFitsResources predicate.
pkg/scheduler/algorithm/predicates/predicates.go:1076

    // noncriticalPredicates are the predicates that only non-critical pods need
    func noncriticalPredicates(pod *v1.Pod, meta algorithm.PredicateMetadata, nodeInfo *schedulercache.NodeInfo) (bool, []algorithm.PredicateFailureReason, error) {
        var predicateFails []algorithm.PredicateFailureReason
        fit, reasons, err := PodFitsResources(pod, meta, nodeInfo)
        if err != nil {
            return false, predicateFails, err
        }
        if !fit {
            predicateFails = append(predicateFails, reasons...)
        }

        return len(predicateFails) == 0, predicateFails, nil
    }
Therefore, when DaemonSetController runs predicates for a CriticalPod, it skips the PodFitsResources check.
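The branching between the two predicate sets can be illustrated with a small, self-contained sketch. Everything here is an assumption made for illustration: the Pod and Node types, the single-label node selector, and the CPU-only resource check are hypothetical stand-ins for the real predicates, and the taint check that Predicates performs first is left out.

```go
package main

import "fmt"

// Pod and Node are hypothetical, heavily simplified stand-ins.
type Pod struct {
	Critical     bool
	NodeSelector string // "" means the pod accepts any node
	CPUMilli     int    // requested CPU in millicores
}

type Node struct {
	Label        string
	FreeCPUMilli int
}

// essentialPredicates models the checks every pod must pass.
// Only a node-selector match is modelled here.
func essentialPredicates(p Pod, n Node) []string {
	if p.NodeSelector != "" && p.NodeSelector != n.Label {
		return []string{"node selector mismatch"}
	}
	return nil
}

// noncriticalPredicates models the resource-fit check that only
// non-critical pods need.
func noncriticalPredicates(p Pod, n Node) []string {
	if p.CPUMilli > n.FreeCPUMilli {
		return []string{"insufficient cpu"}
	}
	return nil
}

// generalPredicates is the non-critical checks plus the essential ones.
func generalPredicates(p Pod, n Node) []string {
	return append(noncriticalPredicates(p, n), essentialPredicates(p, n)...)
}

// daemonPredicates picks the predicate set the way the article describes:
// critical pods (with the gate enabled) skip the resource-fit check.
func daemonPredicates(p Pod, n Node, criticalGateEnabled bool) (bool, []string) {
	var fails []string
	if criticalGateEnabled && p.Critical {
		fails = essentialPredicates(p, n)
	} else {
		fails = generalPredicates(p, n)
	}
	return len(fails) == 0, fails
}

func main() {
	node := Node{Label: "worker", FreeCPUMilli: 500}
	critical := Pod{Critical: true, CPUMilli: 2000}
	regular := Pod{Critical: false, CPUMilli: 2000}

	fit, reasons := daemonPredicates(critical, node, true)
	fmt.Println("critical:", fit, reasons)
	fit, reasons = daemonPredicates(regular, node, true)
	fmt.Println("regular:", fit, reasons)
}
```

In this sketch a critical Pod that requests more CPU than the node has free still fits, while the same request from a regular Pod is rejected. With the gate disabled, the critical flag has no effect.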
Special treatment of CriticalPod by PriorityClass Validate
An important change in Kubernetes 1.11 is that Priority and Preemption graduated from alpha to beta and are now enabled by default.
Kubernetes Version    Priority and Preemption State    Enabled by default
1.8                   alpha                            no
1.9                   alpha                            no
1.10                  alpha                            no
1.11                  beta                             yes
PriorityClass belongs to the scheduling.k8s.io/v1alpha1 GroupVersion. After a client submits a request to create a PriorityClass, and before the object is written to etcd, a validity check (Validate) runs. It includes special treatment for SystemClusterCritical and SystemNodeCritical.
pkg/apis/scheduling/validation/validation.go:30

    // ValidatePriorityClass tests whether required fields in the PriorityClass are
    // set correctly.
    func ValidatePriorityClass(pc *scheduling.PriorityClass) field.ErrorList {
        ...
        // If the priorityClass starts with a system prefix, it must be one of the
        // predefined system priority classes.
        if strings.HasPrefix(pc.Name, scheduling.SystemPriorityClassPrefix) {
            if is, err := scheduling.IsKnownSystemPriorityClass(pc); !is {
                allErrs = append(allErrs, field.Forbidden(field.NewPath("metadata", "name"),
                    "priority class names with '"+scheduling.SystemPriorityClassPrefix+
                        "' prefix are reserved for system use only. error: "+err.Error()))
            }
        }
        ...
        return allErrs
    }

    // IsKnownSystemPriorityClass checks that "pc" is equal to one of the system PriorityClasses.
    // It ignores "description", labels, annotations, etc. of the PriorityClass.
    func IsKnownSystemPriorityClass(pc *PriorityClass) (bool, error) {
        for _, spc := range systemPriorityClasses {
            if spc.Name == pc.Name {
                if spc.Value != pc.Value {
                    return false, fmt.Errorf("value of %v PriorityClass must be %v", spc.Name, spc.Value)
                }
                if spc.GlobalDefault != pc.GlobalDefault {
                    return false, fmt.Errorf("globalDefault of %v PriorityClass must be %v", spc.Name, spc.GlobalDefault)
                }
                return true, nil
            }
        }
        return false, fmt.Errorf("%v is not a known system priority class", pc.Name)
    }
During PriorityClass validation, a Name prefixed with system- must be either system-cluster-critical or system-node-critical. Otherwise validation fails and the request is rejected.
If the submitted PriorityClass is named system-cluster-critical or system-node-critical, its Value must equal the predefined value and its globalDefault must be false. In other words, neither system-cluster-critical nor system-node-critical can be the global default PriorityClass.
In addition, when a PriorityClass is updated, its Name and Value cannot be changed; only Description and globalDefault can be updated.
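The validation rules described above can be sketched as follows. This is a hedged illustration rather than the upstream code: PriorityClass is a hypothetical flat struct, validateCreate and validateUpdate are made-up names, and the numeric values (2000000000 for system-cluster-critical, 2000001000 for system-node-critical) are the values these classes are commonly documented with, assumed here rather than taken from the article. Only the system- prefix rule and the update rule are modelled.

```go
package main

import (
	"fmt"
	"strings"
)

const (
	systemPrefix = "system-"
	// Assumed values; see the lead-in.
	highestUserDefinablePriority = int32(1000000000)
	systemCriticalPriority       = 2 * highestUserDefinablePriority
)

// PriorityClass is a hypothetical, flattened stand-in for the API type.
type PriorityClass struct {
	Name          string
	Value         int32
	GlobalDefault bool
	Description   string
}

var systemPriorityClasses = []PriorityClass{
	{Name: "system-node-critical", Value: systemCriticalPriority + 1000},
	{Name: "system-cluster-critical", Value: systemCriticalPriority},
}

// validateCreate rejects any "system-" name that is not a known system
// class with the expected value and globalDefault=false.
func validateCreate(pc PriorityClass) error {
	if !strings.HasPrefix(pc.Name, systemPrefix) {
		return nil // user-defined names are out of scope for this sketch
	}
	for _, spc := range systemPriorityClasses {
		if spc.Name != pc.Name {
			continue
		}
		if spc.Value != pc.Value {
			return fmt.Errorf("value of %v PriorityClass must be %v", spc.Name, spc.Value)
		}
		if spc.GlobalDefault != pc.GlobalDefault {
			return fmt.Errorf("globalDefault of %v PriorityClass must be %v", spc.Name, spc.GlobalDefault)
		}
		return nil
	}
	return fmt.Errorf("%v is not a known system priority class", pc.Name)
}

// validateUpdate allows changes to Description and GlobalDefault only.
func validateUpdate(old, updated PriorityClass) error {
	if old.Name != updated.Name {
		return fmt.Errorf("name may not be changed in an update")
	}
	if old.Value != updated.Value {
		return fmt.Errorf("value may not be changed in an update")
	}
	return nil
}

func main() {
	candidates := []PriorityClass{
		{Name: "system-node-critical", Value: systemCriticalPriority + 1000},
		{Name: "system-cluster-critical", Value: systemCriticalPriority, GlobalDefault: true},
		{Name: "system-custom", Value: 100},
		{Name: "batch-low", Value: 100},
	}
	for _, pc := range candidates {
		fmt.Printf("%s: %v\n", pc.Name, validateCreate(pc))
	}
}
```

Keeping the known classes in one table and comparing both Value and GlobalDefault means a client can re-submit a system class unchanged, but cannot redefine it or create new names under the reserved prefix.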
pkg/apis/scheduling/helpers.go:27

    // SystemPriorityClasses define system priority classes that are auto-created at cluster bootstrapping.
    // Our API validation logic ensures that any priority class that has a system prefix or its value
    // is higher than HighestUserDefinablePriority is equal to one of these SystemPriorityClasses.
    var systemPriorityClasses = []*PriorityClass{
        {
            ObjectMeta: metav1.ObjectMeta{
                Name: SystemNodeCritical,
            },
            Value:       SystemCriticalPriority + 1000,
            Description: "Used for system critical pods that must not be moved from their current node.",
        },
        {
            ObjectMeta: metav1.ObjectMeta{
                Name: SystemClusterCritical,
            },
            Value:       SystemCriticalPriority,
            Description: "Used for system critical pods that must run in the cluster, but can be moved to another node if necessary.",
        },
    }

That concludes this look at how the DaemonSet Controller treats Critical Pods specially.