This article mainly explains the Kubernetes Resource QoS mechanism. The material is simple and practical; let's walk through what the Kubernetes Resource QoS mechanism is.
Introduction to Kubernetes Resource QoS Classes
Kubernetes assigns a QoS Class to each Pod based on the request and limit values of the resources of the Pod's containers.
For each resource, Pods are divided into three QoS Classes: Guaranteed, Burstable, and Best-Effort, in decreasing order of QoS level.
Guaranteed
If the limit and request of every resource of every container in the Pod are equal and non-zero, the Pod's QoS Class is Guaranteed.
Note that if a container specifies only a limit but no request, the request defaults to the value of the limit.
Examples:
containers:
  name: foo
    resources:
      limits:
        cpu: 10m
        memory: 1Gi
  name: bar
    resources:
      limits:
        cpu: 100m
        memory: 100Mi

containers:
  name: foo
    resources:
      limits:
        cpu: 10m
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 1Gi
  name: bar
    resources:
      limits:
        cpu: 100m
        memory: 100Mi
      requests:
        cpu: 100m
        memory: 100Mi
Best-Effort
If neither request nor limit is set for any resource of any container in the Pod, the Pod's QoS Class is Best-Effort.
Examples:
containers:
  name: foo
    resources:
  name: bar
    resources:
Burstable
A Pod that matches neither the Guaranteed nor the Best-Effort criteria has QoS Class Burstable.
When a limit is not specified, its effective value is the capacity of the corresponding resource on the Node.
Examples:
Container bar does not specify any resources.
containers:
  name: foo
    resources:
      limits:
        cpu: 10m
        memory: 1Gi
      requests:
        cpu: 10m
        memory: 1Gi
  name: bar
Containers foo and bar set limits for different resources.
containers:
  name: foo
    resources:
      limits:
        memory: 1Gi
  name: bar
    resources:
      limits:
        cpu: 100m
Container foo does not specify a limit; container bar specifies neither a request nor a limit.
containers:
  name: foo
    resources:
      requests:
        cpu: 10m
        memory: 1Gi
  name: bar

The difference between compressible and incompressible resources
Kube-scheduler selects a Node for a Pod based on the Pod's request values; a Pod and all of its containers are not allowed to consume more than the effective limit (if one is specified).
How the request and limit are enforced depends on whether the resource is compressible or incompressible.
Compressible Resource Guarantees
For now, we are only supporting CPU.
Pods are guaranteed to get the amount of CPU they request, they may or may not get additional CPU time (depending on the other jobs running). This isn't fully guaranteed today because cpu isolation is at the container level. Pod level cgroups will be introduced soon to achieve this goal.
Excess CPU resources will be distributed based on the amount of CPU requested. For example, suppose container A requests 600 milli CPUs and container B requests 300 milli CPUs, and both containers are trying to use as much CPU as they can. Then the extra 10 milli CPUs will be distributed to A and B in a 2:1 ratio (implementation discussed in later sections).
Pods will be throttled if they exceed their limit. If limit is unspecified, then the pods can use excess CPU when available.
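To make this concrete, the sketch below mirrors how a CPU request roughly maps to relative cpu.shares and a CPU limit to a CFS quota in cgroups. It is only an illustration, not the kubelet's actual code: the constants and helper names (milliCPUToShares, milliCPUToQuota) are assumptions made here for readability.

package main

import "fmt"

const (
	sharesPerCPU  = 1024   // cgroup cpu.shares granted per full CPU
	quotaPeriodUs = 100000 // CFS period in microseconds (100ms)
	minShares     = 2
	minQuotaUs    = 1000
)

// milliCPUToShares converts a CPU request into relative cpu.shares:
// when containers are all CPU-hungry, spare cycles are divided in
// proportion to these shares, which is the 2:1 split described above.
func milliCPUToShares(milliCPU int64) int64 {
	if milliCPU == 0 {
		return minShares
	}
	shares := milliCPU * sharesPerCPU / 1000
	if shares < minShares {
		return minShares
	}
	return shares
}

// milliCPUToQuota converts a CPU limit into a CFS quota per period;
// once a container exhausts its quota within a period it is throttled.
func milliCPUToQuota(milliCPU int64) (quotaUs int64, periodUs int64) {
	if milliCPU == 0 {
		return 0, quotaPeriodUs // no limit: no quota, only throttled by contention
	}
	quotaUs = milliCPU * quotaPeriodUs / 1000
	if quotaUs < minQuotaUs {
		quotaUs = minQuotaUs
	}
	return quotaUs, quotaPeriodUs
}

func main() {
	// Container A requests 600m, container B requests 300m:
	// shares come out 614 vs 307, i.e. roughly the 2:1 ratio above.
	fmt.Println(milliCPUToShares(600), milliCPUToShares(300))

	// A 500m limit becomes a 50ms quota per 100ms period.
	fmt.Println(milliCPUToQuota(500))
}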
Incompressible Resource Guarantees
For now, we are only supporting memory.
Pods will get the amount of memory they request; if they exceed their memory request, they could be killed (if some other pod needs the memory), but if pods consume less memory than requested, they will not be killed (except in cases where system tasks or daemons need more memory).
When a Pod uses more memory than its limit, the process using the most memory inside one of the Pod's containers will be killed by the kernel.
Admission/Scheduling Policy
Pods will be admitted by Kubelet & scheduled by the scheduler based on the sum of requests of its containers. The scheduler & kubelet will ensure that sum of requests of all containers is within the node's allocatable capacity (for both memory and CPU).
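A minimal sketch of this admission rule follows. It uses plain structs rather than the real Kubernetes API types, and the type and function names (Resources, podRequest, fits) and numbers are illustrative assumptions, not scheduler code.

package main

import "fmt"

// Resources holds a CPU amount in millicores and memory in bytes.
type Resources struct {
	MilliCPU    int64
	MemoryBytes int64
}

// podRequest sums the requests of every container in the pod, which is
// the quantity both the scheduler and the kubelet admission check use.
func podRequest(containers []Resources) Resources {
	var total Resources
	for _, c := range containers {
		total.MilliCPU += c.MilliCPU
		total.MemoryBytes += c.MemoryBytes
	}
	return total
}

// fits reports whether the pod's summed requests stay within what the
// node still has allocatable after already-admitted pods are accounted for.
func fits(pod Resources, freeAllocatable Resources) bool {
	return pod.MilliCPU <= freeAllocatable.MilliCPU &&
		pod.MemoryBytes <= freeAllocatable.MemoryBytes
}

func main() {
	containers := []Resources{
		{MilliCPU: 600, MemoryBytes: 1 << 30},   // 600m, 1Gi
		{MilliCPU: 300, MemoryBytes: 512 << 20}, // 300m, 512Mi
	}
	free := Resources{MilliCPU: 2000, MemoryBytes: 4 << 30}
	fmt.Println(fits(podRequest(containers), free)) // true
}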
How resources are reclaimed according to QoS class
CPU
Pods will not be killed if CPU guarantees cannot be met (for example, if system tasks or daemons take up lots of CPU); they will be temporarily throttled.
Memory
Memory is an incompressible resource, so let's discuss the semantics of memory management a bit.
Best-Effort pods will be treated as lowest priority. Processes in these pods are the first to get killed if the system runs out of memory. These containers can use any amount of free memory in the node though.
Guaranteed pods are considered top-priority and are guaranteed to not be killed until they exceed their limits, or if the system is under memory pressure and there are no lower priority containers that can be evicted.
Burstable pods have some form of minimal resource guarantee, but can use more resources when available. Under system memory pressure, these containers are more likely to be killed once they exceed their requests and no Best-Effort pods exist.
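The eviction ordering described above can be sketched as follows. This is a simplified illustration of the policy as stated, not the actual eviction_manager ranking code; all names and sample values are made up.

package main

import (
	"fmt"
	"sort"
)

type QOSClass int

const (
	Guaranteed QOSClass = iota
	Burstable
	BestEffort
)

type podInfo struct {
	Name          string
	QOS           QOSClass
	UsageBytes    int64
	RequestsBytes int64
}

// evictionRank returns a value where higher means "evict sooner" under
// memory pressure: Best-Effort pods first, then Burstable pods exceeding
// their requests, then everything else.
func evictionRank(p podInfo) int {
	switch {
	case p.QOS == BestEffort:
		return 2
	case p.QOS == Burstable && p.UsageBytes > p.RequestsBytes:
		return 1
	default:
		return 0
	}
}

func main() {
	pods := []podInfo{
		{"guaranteed-db", Guaranteed, 900 << 20, 1 << 30},
		{"burstable-over", Burstable, 700 << 20, 512 << 20},
		{"besteffort-batch", BestEffort, 200 << 20, 0},
	}
	sort.Slice(pods, func(i, j int) bool { return evictionRank(pods[i]) > evictionRank(pods[j]) })
	for _, p := range pods {
		fmt.Println(p.Name) // besteffort-batch, burstable-over, guaranteed-db
	}
}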
OOM Score configuration at the Nodes
Pod OOM score configuration
Note that the OOM score of a process is 10 times the % of memory the process consumes, adjusted by OOM_SCORE_ADJ, barring exceptions (e.g. the process is launched by root). Processes with higher OOM scores are killed.
The base OOM score is between 0 and 1000, so if process A's OOM_SCORE_ADJ minus process B's OOM_SCORE_ADJ is over 1000, then process A will always be OOM killed before B.
The final OOM score of a process is also between 0 and 1000.
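A crude sketch of this arithmetic is shown below. It is not kernel code: the function name is an assumption made for illustration, and the adjustment values used in the example (1000 and -998) are the per-class settings listed in the subsections that follow.

package main

import "fmt"

// effectiveOOMScore approximates how the kernel ranks processes for OOM
// kills, per the description above: the base score is roughly ten times
// the percentage of node memory the process uses, oom_score_adj is added,
// and the result is clamped to the 0-1000 range.
func effectiveOOMScore(memPercent float64, oomScoreAdj int) int {
	score := int(10*memPercent) + oomScoreAdj
	if score < 0 {
		return 0
	}
	if score > 1000 {
		return 1000
	}
	return score
}

func main() {
	fmt.Println(effectiveOOMScore(5, 1000))  // best-effort process: 1000
	fmt.Println(effectiveOOMScore(50, -998)) // guaranteed process: 0
}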
Best-effort
Set OOM_SCORE_ADJ: 1000
So processes in best-effort containers will have an OOM_SCORE of 1000
Guaranteed
Set OOM_SCORE_ADJ: -998
So processes in guaranteed containers will have an OOM_SCORE of 0 or 1
Burstable
If total memory request > 99.8% of available memory, OOM_SCORE_ADJ: 2
Otherwise, set OOM_SCORE_ADJ to 1000 - 10 * (% of memory requested)
This ensures that the OOM_SCORE of burstable pod is > 1
If memory request is 0, OOM_SCORE_ADJ is set to 999.
So burstable pods will be killed if they conflict with guaranteed pods
If a burstable pod uses less memory than requested, its OOM_SCORE < 1000
So best-effort pods will be killed if they conflict with burstable pods using less than requested memory
If a process in a burstable pod's container uses more memory than the container requested, its OOM_SCORE will be 1000; otherwise its OOM_SCORE will be < 1000
Assuming that a container typically has a single big process, if a burstable pod's container that uses more memory than requested conflicts with another burstable pod's container using less memory than requested, the former will be killed
If burstable pods' containers with multiple processes conflict, then the formula for OOM scores is a heuristic; it will not ensure "Request and Limit" guarantees.
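To make the heuristic concrete, here is a small worked example applying the formulas above; the numbers are made up for illustration.

package main

import "fmt"

// clamp01000 bounds a score to the 0-1000 range used above.
func clamp01000(s int) int {
	if s < 0 {
		return 0
	}
	if s > 1000 {
		return 1000
	}
	return s
}

func main() {
	// A burstable container requesting 10% of node memory gets
	// OOM_SCORE_ADJ = 1000 - 10*10 = 900.
	adj := 1000 - 10*10

	// If one of its processes uses 15% of node memory (above the request),
	// its score hits the ceiling and it becomes a prime OOM target.
	fmt.Println(clamp01000(10*15 + adj)) // 1000

	// If it stays at 5% (below the request), it scores under 1000, so
	// best-effort processes (score 1000) are killed first.
	fmt.Println(clamp01000(10*5 + adj)) // 950
}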
Pod infra containers or Special Pod init process
OOM_SCORE_ADJ: -998
Kubelet, Docker
OOM_SCORE_ADJ: -999 (won't be OOM killed)
Hack, because these critical tasks might die if they conflict with guaranteed containers. In the future, we should place all user-pods into a separate cgroup, and set a limit on the memory they can consume.
Source code analysis
The QoS source code is located in pkg/kubelet/qos. The code is quite simple, consisting mainly of two files: pkg/kubelet/qos/policy.go and pkg/kubelet/qos/qos.go.
The OOM_SCORE_ADJ corresponding to each QoS Class discussed above is defined as follows:
pkg/kubelet/qos/policy.go:21

const (
	PodInfraOOMAdj        int = -998
	KubeletOOMScoreAdj    int = -999
	DockerOOMScoreAdj     int = -999
	KubeProxyOOMScoreAdj  int = -999
	guaranteedOOMScoreAdj int = -998
	besteffortOOMScoreAdj int = 1000
)
The OOM_SCORE_ADJ of the container is calculated as follows:
pkg/kubelet/qos/policy.go:40

func GetContainerOOMScoreAdjust(pod *v1.Pod, container *v1.Container, memoryCapacity int64) int {
	switch GetPodQOS(pod) {
	case Guaranteed:
		// Guaranteed containers should be the last to get killed.
		return guaranteedOOMScoreAdj
	case BestEffort:
		return besteffortOOMScoreAdj
	}

	// Burstable containers are a middle tier, between Guaranteed and Best-Effort. Ideally,
	// we want to protect Burstable containers that consume less memory than requested.
	// The formula below is a heuristic. A container requesting for 10% of a system's
	// memory will have an OOM score adjust of 900. If a process in container Y
	// uses over 10% of memory, its OOM score will be 1000. The idea is that containers
	// which use more than their request will have an OOM score of 1000 and will be prime
	// targets for OOM kills.
	// Note that this is a heuristic, it won't work if a container has many small processes.
	memoryRequest := container.Resources.Requests.Memory().Value()
	oomScoreAdjust := 1000 - (1000*memoryRequest)/memoryCapacity
	// A guaranteed pod using 100% of memory can have an OOM score of 10. Ensure
	// that burstable pods have a higher OOM score adjustment.
	if int(oomScoreAdjust) < (1000 + guaranteedOOMScoreAdj) {
		return (1000 + guaranteedOOMScoreAdj)
	}
	// Give burstable pods a higher chance of survival over besteffort pods.
	if int(oomScoreAdjust) == besteffortOOMScoreAdj {
		return int(oomScoreAdjust - 1)
	}
	return int(oomScoreAdjust)
}
The method to get the QoS Class of Pod is:
pkg/kubelet/qos/qos.go:50

// GetPodQOS returns the QoS class of a pod.
// A pod is besteffort if none of its containers have specified any requests or limits.
// A pod is guaranteed only when requests and limits are specified for all the containers and they are equal.
// A pod is burstable if limits and requests do not match across all containers.
func GetPodQOS(pod *v1.Pod) QOSClass {
	requests := v1.ResourceList{}
	limits := v1.ResourceList{}
	zeroQuantity := resource.MustParse("0")
	isGuaranteed := true
	for _, container := range pod.Spec.Containers {
		// process requests
		for name, quantity := range container.Resources.Requests {
			if !supportedQoSComputeResources.Has(string(name)) {
				continue
			}
			if quantity.Cmp(zeroQuantity) == 1 {
				delta := quantity.Copy()
				if _, exists := requests[name]; !exists {
					requests[name] = *delta
				} else {
					delta.Add(requests[name])
					requests[name] = *delta
				}
			}
		}
		// process limits
		qosLimitsFound := sets.NewString()
		for name, quantity := range container.Resources.Limits {
			if !supportedQoSComputeResources.Has(string(name)) {
				continue
			}
			if quantity.Cmp(zeroQuantity) == 1 {
				qosLimitsFound.Insert(string(name))
				delta := quantity.Copy()
				if _, exists := limits[name]; !exists {
					limits[name] = *delta
				} else {
					delta.Add(limits[name])
					limits[name] = *delta
				}
			}
		}

		if len(qosLimitsFound) != len(supportedQoSComputeResources) {
			isGuaranteed = false
		}
	}
	if len(requests) == 0 && len(limits) == 0 {
		return BestEffort
	}
	// Check is requests match limits for all resources.
	if isGuaranteed {
		for name, req := range requests {
			if lim, exists := limits[name]; !exists || lim.Cmp(req) != 0 {
				isGuaranteed = false
				break
			}
		}
	}
	if isGuaranteed && len(requests) == len(limits) {
		return Guaranteed
	}
	return Burstable
}
GetPodQOS is called by eviction_manager and in the scheduler's Predicates phase, which means it is used both when Kubernetes handles resource overcommitment (eviction) and during scheduling pre-selection.
At this point, I believe you have a deeper understanding of what the Kubernetes Resource QoS mechanism is. You might as well try it out in practice.