This article introduces how container resources are controlled in kubernetes. The content is detailed but easy to follow, and should be of some reference value; hopefully you will get something out of it. Let's take a look.
1. Source of Pod resource control

apiVersion: v1
kind: Pod
metadata:
  name: frontend
spec:
  containers:
  - name: db
    image: mysql
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
  - name: wp
    image: wordpress
    resources:
      requests:
        memory: "64Mi"
        cpu: "250m"
      limits:
        memory: "128Mi"
        cpu: "500m"
k8s configures container resource control under pod.spec.containers[].resources, which is divided into requests and limits. In other words, although the basic scheduling unit in k8s is the pod, resources are still controlled at the container level. So what is the difference between requests and limits?
Unless pod.spec.nodeName is specified explicitly, a Pod in k8s generally goes through this flow:

Pod (created, written to etcd) --> the scheduler watches it and fills in pod.spec.nodeName --> the kubelet on that node watches, sees the pod belongs to it, creates the containers and runs them
Look at these two pieces of code in the scheduler:
func GetResourceRequest(pod *v1.Pod) *schedulercache.Resource {
    result := schedulercache.Resource{}
    for _, container := range pod.Spec.Containers {
        for rName, rQuantity := range container.Resources.Requests {
            // ... accumulate each requested quantity (rName, rQuantity) into result
        }
    }
    // ...
    return &result
}

factory.RegisterPriorityFunction2("MostRequestedPriority", priorities.MostRequestedPriorityMap, nil, 1)
We can draw the first conclusion:
Only requests are used for scheduling; limits are not. At run time, both are used.
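To illustrate the point, here is a minimal, self-contained sketch of the kind of fit check the scheduler performs. It is not the real scheduler code, and the Pod/Node types below are simplified stand-ins invented for this example: it just sums the containers' requests and compares them with what is still allocatable on the node, and limits never enter the decision.

package main

import "fmt"

// Simplified stand-ins for the real k8s types: only the fields needed here.
type Container struct {
    MilliCPURequest int64 // requests.cpu in milli-cores
    MemoryRequest   int64 // requests.memory in bytes
    MilliCPULimit   int64 // limits.cpu (never consulted by the scheduler)
    MemoryLimit     int64 // limits.memory (never consulted by the scheduler)
}

type Pod struct{ Containers []Container }

type Node struct {
    AllocatableMilliCPU int64
    AllocatableMemory   int64
    RequestedMilliCPU   int64 // sum of requests of pods already on the node
    RequestedMemory     int64
}

// podFitsNode mimics the spirit of the scheduler's resource predicate:
// only requests are counted against the node's allocatable resources.
func podFitsNode(pod Pod, node Node) bool {
    var cpu, mem int64
    for _, c := range pod.Containers {
        cpu += c.MilliCPURequest
        mem += c.MemoryRequest
    }
    return node.RequestedMilliCPU+cpu <= node.AllocatableMilliCPU &&
        node.RequestedMemory+mem <= node.AllocatableMemory
}

func main() {
    pod := Pod{Containers: []Container{
        {MilliCPURequest: 250, MemoryRequest: 64 << 20, MilliCPULimit: 500, MemoryLimit: 128 << 20},
    }}
    node := Node{AllocatableMilliCPU: 4000, AllocatableMemory: 8 << 30, RequestedMilliCPU: 3900}
    fmt.Println(podFitsNode(pod, node)) // false: only 100m CPU left on the node, the pod requests 250m
}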
At runtime, these parameters take effect as a whole through the following chain:

k8s --> docker --> linux cgroup

In the end, the resource limits on these processes are enforced by the operating system kernel.
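If you want to see where the chain ends up, the sketch below reads the cgroup files docker writes for a container. It assumes cgroup v1 with the default cgroupfs layout, where the files live under /sys/fs/cgroup/<subsystem>/docker/<container-id>/; with the systemd cgroup driver or cgroup v2 the paths are different, so treat this purely as an illustration.

package main

import (
    "fmt"
    "os"
    "path/filepath"
    "strings"
)

// readCgroupValue reads one cgroup control file and returns its trimmed content.
// Assumed layout: cgroup v1 + cgroupfs driver; not valid for systemd driver or cgroup v2.
func readCgroupValue(subsystem, containerID, file string) (string, error) {
    p := filepath.Join("/sys/fs/cgroup", subsystem, "docker", containerID, file)
    b, err := os.ReadFile(p)
    if err != nil {
        return "", err
    }
    return strings.TrimSpace(string(b)), nil
}

func main() {
    if len(os.Args) < 2 {
        fmt.Println("usage: cgroupcheck <full-container-id>")
        return
    }
    id := os.Args[1] // full container ID, e.g. from `docker inspect -f '{{.Id}}' <name>`
    for _, f := range []struct{ subsystem, file string }{
        {"cpu", "cpu.shares"},
        {"cpu", "cpu.cfs_period_us"},
        {"cpu", "cpu.cfs_quota_us"},
        {"memory", "memory.limit_in_bytes"},
    } {
        v, err := readCgroupValue(f.subsystem, id, f.file)
        if err != nil {
            fmt.Println(f.file, "error:", err)
            continue
        }
        fmt.Println(f.file, "=", v)
    }
}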
2. Run-time conversion: k8s -> docker
Let's paste the code first; it carries a lot of information.
func (m *kubeGenericRuntimeManager) generateLinuxContainerConfig(container *v1.Container, pod *v1.Pod, uid *int64, username string) *runtimeapi.LinuxContainerConfig {
    lc := &runtimeapi.LinuxContainerConfig{
        Resources:       &runtimeapi.LinuxContainerResources{},
        SecurityContext: m.determineEffectiveSecurityContext(pod, container, uid, username),
    }

    // set linux container resources
    var cpuShares int64
    cpuRequest := container.Resources.Requests.Cpu()
    cpuLimit := container.Resources.Limits.Cpu()
    memoryLimit := container.Resources.Limits.Memory().Value()
    oomScoreAdj := int64(qos.GetContainerOOMScoreAdjust(pod, container, int64(m.machineInfo.MemoryCapacity)))
    // If request is not specified, but limit is, we want request to default to limit.
    // API server does this for new containers, but we repeat this logic in Kubelet
    // for containers running on existing Kubernetes clusters.
    if cpuRequest.IsZero() && !cpuLimit.IsZero() {
        cpuShares = milliCPUToShares(cpuLimit.MilliValue())
    } else {
        // if cpuRequest.Amount is nil, then milliCPUToShares will return the minimal number
        // of CPU shares.
        cpuShares = milliCPUToShares(cpuRequest.MilliValue())
    }
    lc.Resources.CpuShares = cpuShares
    if memoryLimit != 0 {
        lc.Resources.MemoryLimitInBytes = memoryLimit
    }
    // Set OOM score of the container based on qos policy. Processes in lower-priority pods should
    // be killed first if the system runs out of memory.
    lc.Resources.OomScoreAdj = oomScoreAdj

    if m.cpuCFSQuota {
        // if cpuLimit.Amount is nil, then the appropriate default value is returned
        // to allow full usage of cpu resource.
        cpuQuota, cpuPeriod := milliCPUToQuota(cpuLimit.MilliValue())
        lc.Resources.CpuQuota = cpuQuota
        lc.Resources.CpuPeriod = cpuPeriod
    }

    return lc
}
And this part.
func milliCPUToShares(milliCPU int64) int64 {
    if milliCPU == 0 {
        // Return 2 here to really match kernel default for zero milliCPU.
        return minShares
    }
    // Conceptually (milliCPU / milliCPUToCPU) * sharesPerCPU, but factored to improve rounding.
    shares := (milliCPU * sharesPerCPU) / milliCPUToCPU
    if shares < minShares {
        return minShares
    }
    return shares
}

// milliCPUToQuota converts milliCPU to CFS quota and period values
func milliCPUToQuota(milliCPU int64) (quota int64, period int64) {
    // CFS quota is measured in two values:
    //  - cfs_period_us=100ms (the amount of time to measure usage across)
    //  - cfs_quota=20ms (the amount of cpu time allowed to be used across a period)
    // so in the above example, you are limited to 20% of a single CPU
    // for multi-cpu environments, you just scale equivalent amounts
    if milliCPU == 0 {
        return
    }
    // we set the period to 100ms by default
    period = quotaPeriod
    // we then convert your milliCPU to a value normalized over a period
    quota = (milliCPU * quotaPeriod) / milliCPUToCPU
    // quota needs to be a minimum of 1ms.
    if quota < minQuotaPeriod {
        quota = minQuotaPeriod
    }
    return
}
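To make the conversion concrete, here is a standalone sketch that re-implements the two functions above with the constants used in the kubelet source (minShares=2, sharesPerCPU=1024, milliCPUToCPU=1000, quotaPeriod=100000, minQuotaPeriod=1000) and feeds in the values from the example pod (requests.cpu=250m, limits.cpu=500m):

package main

import "fmt"

// Constants as defined in the kubelet source.
const (
    minShares      = 2
    sharesPerCPU   = 1024
    milliCPUToCPU  = 1000
    quotaPeriod    = 100000 // 100ms in microseconds
    minQuotaPeriod = 1000   // 1ms in microseconds
)

func milliCPUToShares(milliCPU int64) int64 {
    if milliCPU == 0 {
        return minShares
    }
    shares := (milliCPU * sharesPerCPU) / milliCPUToCPU
    if shares < minShares {
        return minShares
    }
    return shares
}

func milliCPUToQuota(milliCPU int64) (quota, period int64) {
    if milliCPU == 0 {
        return
    }
    period = quotaPeriod
    quota = (milliCPU * quotaPeriod) / milliCPUToCPU
    if quota < minQuotaPeriod {
        quota = minQuotaPeriod
    }
    return
}

func main() {
    fmt.Println(milliCPUToShares(250)) // 256: requests.cpu=250m
    fmt.Println(milliCPUToShares(0))   // 2:   no requests.cpu set at all
    quota, period := milliCPUToQuota(500)
    fmt.Println(quota, period) // 50000 100000: limits.cpu=500m caps the container at half a CPU
}

So a 250m request becomes 256 CPU shares, and a 500m limit becomes a CFS quota of 50000us over a 100000us period.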
The relationship can be summarized as follows:
cpuShares: if requests.cpu is 0 and limits.cpu is non-zero, limits.cpu is used as the conversion input; otherwise requests.cpu is. If the input is 0, the value is set to minShares == 2; otherwise it is the input in milli-CPU * 1024 / 1000.

oomScoreAdj: containers are divided into three QoS classes, as detailed in reference document 3. In descending order of priority: Guaranteed, Burstable, Best-Effort. The value is negatively correlated with requests.memory: the more negative it is, the less likely the process is to be killed.

cpuQuota, cpuPeriod: converted from limits.cpu. cpuPeriod defaults to 100ms, and cpuQuota is the number of cores in limits.cpu * 100ms. This is a hard limit.

memoryLimit: == limits.memory

3. The meaning of these values in docker
So, when k8s hands this pile of memory and CPU values to docker, how does docker interpret them?
# docker help run | grep cpu
      --cpu-percent int             CPU percent (Windows only)
      --cpu-period int              Limit CPU CFS (Completely Fair Scheduler) period
      --cpu-quota int               Limit CPU CFS (Completely Fair Scheduler) quota
  -c, --cpu-shares int              CPU shares (relative weight)
      --cpuset-cpus string          CPUs in which to allow execution (0-3, 0,1)
      --cpuset-mems string          MEMs in which to allow execution (0-3, 0,1)
# docker help run | grep oom
      --oom-kill-disable            Disable OOM Killer
      --oom-score-adj int           Tune host's OOM preferences (-1000 to 1000)
# docker help run | grep memory
      --kernel-memory string        Kernel memory limit
  -m, --memory string               Memory limit
      --memory-reservation string   Memory soft limit
      --memory-swap string          Swap limit equal to memory plus swap: '-1' to enable unlimited swap
      --memory-swappiness int       Tune container memory swappiness (0 to 100) (default -1)
It is fairly involved, but only a few key values matter here. The following part is copied from a blog post that describes them in great detail; the original source is, of course, the official docker documentation.
For a running container, you can also use docker inspect to check these values:
# docker inspect c8dcd083baba | grep Cpu
            "CpuShares": 1024,
            "CpuPeriod": 0,
            "CpuQuota": 0,
            "CpusetCpus": "",
            "CpusetMems": "",
            "CpuCount": 0,
            "CpuPercent": 0,

3.1 CPU share constraint: -c or --cpu-shares
By default, all containers get the same proportion of CPU cycles. -c or --cpu-shares sets the CPU weight; it defaults to 1024 and can be set to 2 or higher (1024 for a single CPU, 2048 for two, and so on). If it is set to 0, the system ignores the option and uses the default of 1024. The setting only shows its effect when CPU-intensive (busy) processes are running; when a container is idle, other containers can use the CPU it leaves free. The cpu-shares value is relative, so the actual CPU utilization depends on how many containers are running on the system.
Suppose a 1-core host runs 3 containers, one with cpu-shares set to 1024 and the other two set to 512. When the processes in all three containers try to use 100% of the CPU (which is exactly when the setting matters, since the weights only kick in under contention), the container set to 1024 gets 50% of the CPU time. If another container with cpu-shares of 1024 is added, the two containers set to 1024 each get about 33%, and the other two about 16.5% each. The simple rule is: add up all the share values, and each container's fraction of that total is its share of the CPU. If there is only one container, it gets 100% of the CPU whether it is set to 512 or 1024. And if the host has 3 cores and runs three containers, two set to 512 and one to 1024, then each container can occupy one CPU at 100%.
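The arithmetic behind those percentages is just each container's shares divided by the sum of all competing shares. A tiny sketch (cpuSlices is a made-up helper for this illustration, not a docker API):

package main

import "fmt"

// cpuSlices returns each container's share of CPU time when all of them are
// busy, i.e. shares[i] / sum(shares) as a percentage. cpu-shares only matters
// under contention; an idle CPU can be used by any container regardless of weight.
func cpuSlices(shares []float64) []float64 {
    var total float64
    for _, s := range shares {
        total += s
    }
    out := make([]float64, len(shares))
    for i, s := range shares {
        out[i] = s / total * 100
    }
    return out
}

func main() {
    fmt.Println(cpuSlices([]float64{1024, 512, 512}))       // [50 25 25]
    fmt.Println(cpuSlices([]float64{1024, 1024, 512, 512})) // roughly [33.3 33.3 16.7 16.7]
    fmt.Println(cpuSlices([]float64{1024, 2}))              // roughly [99.8 0.2]: the minShares=2 pitfall from the summary below
}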
On the test host (4 cores), when there is only one container it can use any of the CPUs:
➜ ~ docker run -it --rm --cpu-shares 512 ubuntu-stress:latest /bin/bash
root@4eb961147ba6:/# stress -c 4
stress: info: [17] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd

➜ ~ docker stats 4eb961147ba6
CONTAINER       CPU %     MEM USAGE / LIMIT     MEM %     NET I/O         BLOCK I/O
4eb961147ba6    398.05%   741.4 kB / 8.297 GB   0.01%     4.88 kB / ...   ...

3.2 CPU period constraint: --cpu-period & --cpu-quota
The default CPU CFS ("Completely Fair Scheduler") period is 100ms. We can restrict a container's CPU usage with the --cpu-period value, which is generally used together with --cpu-quota.

Setting cpu-period to 100ms and cpu-quota to 200ms means at most 2 CPUs can be used, as the following test shows:
➜ ~ docker run -it --rm --cpu-period=100000 --cpu-quota=200000 ubuntu-stress:latest /bin/bash
root@6b89f2bda5cd:/# stress -c 4      # the stress test tries to use 4 CPUs
stress: info: [17] dispatching hogs: 4 cpu, 0 io, 0 vm, 0 hdd

➜ ~ docker stats 6b89f2bda5cd         # stats shows the container's CPU utilization never exceeds 200%
CONTAINER       CPU %     MEM USAGE / LIMIT     MEM %     NET I/O            BLOCK I/O
6b89f2bda5cd    200.68%   745.5 kB / 8.297 GB   0.01%     4.771 kB / 648 B   0 B / 0 B
From the test above we can see that the --cpu-period plus --cpu-quota configuration is a fixed cap, regardless of whether the CPUs are free or busy. With the configuration above, the container can use at most 2 CPUs at 100%.
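The cap itself is simply quota divided by period. A minimal sketch with the values from the test above (maxCPUPercent is a made-up helper, not a docker API):

package main

import "fmt"

// maxCPUPercent returns the hard CPU cap implied by CFS quota/period,
// expressed the way docker stats reports it (100% == one full CPU).
func maxCPUPercent(quotaUs, periodUs int64) float64 {
    if quotaUs <= 0 {
        return -1 // no quota set: no hard cap
    }
    return float64(quotaUs) / float64(periodUs) * 100
}

func main() {
    fmt.Println(maxCPUPercent(200000, 100000)) // 200: the test above, capped at 2 CPUs
    fmt.Println(maxCPUPercent(50000, 100000))  // 50: what a k8s limits.cpu of 500m produces
}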
See also the CFS documentation on bandwidth limiting.
3.3 --oom-score-adj
oom-score-adj determines which processes are killed first when the system runs out of memory: the more negative the score, the less likely the process is to be killed; the more positive, the more likely.
In the k8s code you can see the scores assigned to several different kinds of containers:
const (
    // PodInfraOOMAdj is very docker specific. For arbitrary runtime, it may not make
    // sense to set sandbox level oom score, e.g. a sandbox could only be a namespace
    // without a process.
    // TODO: Handle infra container oom score adj in a runtime agnostic way.
    PodInfraOOMAdj        int = -998
    KubeletOOMScoreAdj    int = -999
    DockerOOMScoreAdj     int = -999
    KubeProxyOOMScoreAdj  int = -999
    guaranteedOOMScoreAdj int = -998
    besteffortOOMScoreAdj int = 1000
)
As you can see, Guaranteed containers are killed last, and Best-Effort containers are killed first.
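For Burstable containers, the score is derived from requests.memory. The sketch below is a simplified re-implementation of the kubelet's qos policy, not the exact source: the score starts near 1000 and drops as the container's memory request approaches the node's capacity, clamped so it stays between the Guaranteed and Best-Effort values.

package main

import "fmt"

const (
    guaranteedOOMScoreAdj = -998
    besteffortOOMScoreAdj = 1000
)

// oomScoreAdj is a simplified version of the kubelet's qos policy:
// Guaranteed and Best-Effort containers get fixed scores; Burstable containers
// get a score that shrinks as requests.memory grows relative to node capacity.
func oomScoreAdj(qosClass string, memoryRequest, memoryCapacity int64) int64 {
    switch qosClass {
    case "Guaranteed":
        return guaranteedOOMScoreAdj
    case "BestEffort":
        return besteffortOOMScoreAdj
    }
    // Burstable: the bigger the request relative to capacity, the lower (better)
    // the score. The real code has the same shape plus extra edge-case handling.
    adj := 1000 - (1000*memoryRequest)/memoryCapacity
    if adj < 1000+guaranteedOOMScoreAdj {
        adj = 1000 + guaranteedOOMScoreAdj
    }
    if adj >= besteffortOOMScoreAdj {
        adj = besteffortOOMScoreAdj - 1
    }
    return adj
}

func main() {
    capacity := int64(8) << 30                              // an 8 GiB node
    fmt.Println(oomScoreAdj("Guaranteed", 0, capacity))     // -998
    fmt.Println(oomScoreAdj("BestEffort", 0, capacity))     // 1000
    fmt.Println(oomScoreAdj("Burstable", 64<<20, capacity)) // 993: requests.memory=64Mi
    fmt.Println(oomScoreAdj("Burstable", 4<<30, capacity))  // 500: requests.memory=4Gi
}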
4. Measuring --cpu-shares
To try out the actual effect of k8s setting the default cpu-shares value to 2, I wrote a script:
# cat cpu.sh
function aa {
    x=0
    while [ True ]; do
        x=$((x+1))
    done
}
aa
Copy it into the container and run it; the container is based on the busybox image.
# docker run --cpu-shares=2 run_out_cpu /bin/sh /cpu.sh

top - 15:41:23 up 57 days, 22:51,  3 users,  load average: 4.66, 7.50, 6.16
Tasks: 396 total,   2 running, 394 sleeping,   0 stopped,   0 zombie
%Cpu(s): 12.2 us,  0.4 sy,  0.0 ni, 87.3 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem:   8154312 total,  7923816 used,   230496 free,   278512 buffers
KiB Swap:  3905532 total,    39060 used,  3866472 free.  6525112 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
39986 root      20   0    1472    364    188 R  99.6  0.0           sh
 2098 root      20   0  116456   8948   1864 S   0.7  0.1           acc-snf
As you can see, even with --cpu-shares set to 2, the script uses 100% of a single CPU. Next, without any container, run a separate script directly on the host:
# cat cpu8.sh
function aa {
    x=0
    while [ True ]; do
        x=$((x+1))
    done
}
aa &
aa &
# ... (repeated so that eight copies of aa run in the background)
This machine has eight cores, so it takes eight such processes to saturate it.
The top output at this point is:
top - 15:44:30 up 57 days, 22:54,  3 users,  load average: 4.21, 5.23, 5.47
Tasks: 404 total,  10 running, 394 sleeping,   0 stopped,   0 zombie
%Cpu(s): 82.6 us, 16.9 sy,  0.0 ni,  0.0 id,  0.0 wa,  0.1 hi,  0.0 si,  0.5 st
KiB Mem:   8154312 total,  7932176 used,   222136 free,   278536 buffers
KiB Swap:  3905532 total,    39060 used,  3866472 free.  6525116 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
40374 root      20   0   15112   1376    496 R 100.0  0.0           bash
40373 root      20   0   15244   1380    496 R  99.9  0.0           bash
40376 root      20   0   15240   1376    496 R  99.9  0.0   0:28.80 bash
40375 root      20   0   15240   1376    496 R  99.6  0.0   0:13.15 bash
40379 root      20   0   15240   1376    496 R  99.6  0.0   0:28.75 bash
40378 root      20   0   15240   1376    496 R  99.3  0.0   0:28.79 bash
40380 root      20   0   15116   1380    496 R  98.9  0.0   0:28.84 bash
39986 root      20   0    1696    460    188 R  89.0  0.0   4:13.48 sh
40377 root      20   0   15240   1376    496 R  12.0  0.0   0:19.27 bash
As you can see, the sh process running in the container now only gets about 89% of a CPU. If --cpu-shares is set to 512 instead:
top - 15:50:25 up 57 days, 23:00,  3 users,  load average: 8.93, 7.86, 6.60
Tasks: 407 total,  10 running, 397 sleeping,   0 stopped,   0 zombie
%Cpu(s): 80.2 us, 16.5 sy,  0.0 ni,  2.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.4 st
KiB Mem:   8154312 total,  7951320 used,   202992 free,   278748 buffers
KiB Swap:  3905532 total,    39060 used,  3866472 free.  6525176 cached Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
41012 root      20   0    1316    328    188 R 100.0  0.0   0:07.89 sh
40376 root      20   0   17928   4084    496 R  89.0  0.1   5:46.70 bash
40374 root      20   0   17928   4084    496 R  88.6  0.1   6:21.40 bash
40380 root      20   0   17420   3300    496 R  87.3  0.0   6:23.22 bash
40375 root      20   0   17928   4084    496 R  86.7  0.1   6:07.00 bash
40377 root      20   0   17928   3556    496 R  85.3  0.0   5:51.28 bash
40373 root      20   0   17932   4088    496 R  81.7  0.1   6:22.83 bash
40379 root      20   0   17928   3296    496 R  81.0  0.0   2:16.05 bash
40378 root      20   0   17928   4084    496 R  75.7  0.1   6:21.70 bash
Now the container's process occupies 100% of a CPU and is no longer squeezed out by the CPU-hogging processes running on the host.
5. Summary
Thank you for reading this long document; all of it is just to illustrate how these four parameters are used in k8s. To sum up:
The pitfall of requests.cpu
If one container on a node is configured with, say, 1 core of CPU while another container has no CPU configuration at all, their contention weights are 1024 versus 2, which is a huge disadvantage for the unconfigured container. So I do not quite understand why k8s sets the default minShares to 2 instead of docker's default of 1024.

As a result, if a pod lands on machines whose CPU is already mostly requested, setting this value to 1 core may make it unschedulable, while leaving it unset means the pod's CPU resources are not guaranteed.
The pitfall of requests.memory
This value itself is not much of a trap. The main issue is that the QoS class (Guaranteed, Burstable, Best-Effort) is never specified explicitly; it is inferred from whether requests and limits are set and whether they match, which is honestly also painful, just slightly less so.
This is the end of the article on how container resources are controlled in kubernetes. Thank you for reading!