
In-depth understanding of Kubernetes resource limitations: CPU


Before we begin

In the previous article on Kubernetes resource limits, we discussed how to set memory limits on containers in a Pod via ResourceRequirements, and how the container runtime uses Linux cgroups to enforce them. We also looked at the difference between requests, which inform the scheduler of a Pod's resource needs, and limits, which the kernel enforces when the host comes under memory pressure.

In this article, I'll continue by digging into CPU time requests and limits. You can follow along without having read the first article, but I suggest reading both to get the full picture of cluster control from the perspective of an engineer or cluster administrator.

CPU time

As I pointed out in the first article, limiting CPU time is more complicated than limiting memory. The good news is that CPU limits are controlled by the same cgroups mechanism we have already looked at, so the same principles apply and we only need to pay attention to a few details. We start by adding CPU time to the previous example:

resources:
  requests:
    memory: 50Mi
    cpu: 50m
  limits:
    memory: 100Mi
    cpu: 100m
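For context, this resources block sits under each container in a Pod spec. A minimal manifest embedding it, modeled on the kubectl run examples used later in this article, might look like the following (a sketch for illustration, not a manifest taken from the original examples):

apiVersion: v1
kind: Pod
metadata:
  name: limit-test
spec:
  containers:
  - name: limit-test
    image: busybox
    command: ["/bin/sh", "-c", "while true; do sleep 2; done"]
    resources:
      requests:
        memory: 50Mi
        cpu: 50m
      limits:
        memory: 100Mi
        cpu: 100m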

The unit suffix m stands for "one thousandth of a core," so this resource object specifies that the container process needs 50/1000 of a core (5%) and may use at most 100/1000 of a core (10%). Similarly, 2000m means two full cores, which can also be written as 2 or 2.0. Let's create a Pod with only a CPU request and see how Docker configures the cgroups:

$ kubectl run limit-test --image=busybox --requests "cpu=50m" --command - /bin/sh -c "while true; do sleep 2; done"

deployment.apps "limit-test" created

We can see that Kubernetes has configured 50m CPU requests:

$ kubectl get pods limit-test-5b4c495556-p2xkr -o=jsonpath='{.spec.containers[0].resources}'

map[requests:map[cpu:50m]]

We can also see how Docker configured the container with a corresponding value:

$ docker ps | grep busy | cut -d' ' -f1

f2321226620e

$ docker inspect f2321226620e --format '{{.HostConfig.CpuShares}}'

51

Why 51 and not 50? The CPU cgroup and Docker both divide a core into 1024 shares, while Kubernetes divides it into 1000 millicores, so 50m is rescaled to roughly 51 shares. How does Docker apply this to the container process? Just as setting a memory limit caused Docker to configure the process's memory cgroup, setting a CPU request causes it to configure the cpu,cpuacct cgroup.

$ ps ax | grep /bin/sh

60554 ? Ss 0:00 /bin/sh -c while true; do sleep 2; done

$ sudo cat /proc/60554/cgroup

4:cpu,cpuacct:/kubepods/burstable/pode12b33b1-db07-11e8-b1e1-42010a800070/3be263e7a8372b12d2f8f8f9b4251f110b79c2a3bb9e6857b2f1473e640e8e75

$ ls -l /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/pode12b33b1-db07-11e8-b1e1-42010a800070/3be263e7a8372b12d2f8f8f9b4251f110b79c2a3bb9e6857b2f1473e640e8e75
total 0
drwxr-xr-x 2 root root 0 Oct 28 23:19 .
drwxr-xr-x 4 root root 0 Oct 28 23:19 ..
-rw-r--r-- 1 root root 0 Oct 28 23:19 cpu.shares

Docker's HostConfig.CpuShares container property maps to cgroup's cpu.shares, so let's look at it:

$ sudo cat /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/podb5c03ddf-db10-11e8-b1e1-42010a800070/64b5f1b636dafe6635ddd321c5b36854a8add51931c7117025a694281fb11444/cpu.shares

51
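If you want to sanity-check the 50m-to-51 conversion, the arithmetic is just the millicore value rescaled from a base of 1000 to the cgroup's base of 1024. A minimal shell sketch of that calculation (illustrative arithmetic only, not the kubelet's actual code):

# Illustrative: rescale Kubernetes millicores to cgroup CPU shares (integer math)
MILLICORES=50
echo $(( MILLICORES * 1024 / 1000 ))   # prints 51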

You might be surprised that setting a CPU request propagates a value into the cgroup, when setting a memory request in the previous article did not. The reason is that the kernel's soft memory limit behavior isn't useful to Kubernetes, whereas setting cpu.shares is; I'll explain why later. So what happens when we also set a CPU limit? Let's find out:

$ kubectl run limit-test --image=busybox --requests "cpu=50m" --limits "cpu=100m" --command - /bin/sh -c "while true; do sleep 2; done"

deployment.apps "limit-test" created

Now let's look again at the resource configuration on the Kubernetes Pod object:

$ kubectl get pods limit-test-5b4fb64549-qpd4n -o=jsonpath='{.spec.containers[0].resources}'

map[limits:map[cpu:100m] requests:map[cpu:50m]]

And in the Docker container configuration:

$ docker ps | grep busy | cut -d' ' -f1

472abbce32a5

$ docker inspect 472abbce32a5 --format '{{.HostConfig.CpuShares}} {{.HostConfig.CpuQuota}} {{.HostConfig.CpuPeriod}}'

51 10000 100000

As we can see, the CPU request is stored in the HostConfig.CpuShares property. The CPU limit is less obvious: it is represented by two values, HostConfig.CpuPeriod and HostConfig.CpuQuota, which map to two attributes of the process's cpu,cpuacct cgroup: cpu.cfs_period_us and cpu.cfs_quota_us. Let's take a closer look:

$ sudo cat /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/pod2f1b50b6-db13-11e8-b1e1-42010a800070/f0845c65c3073e0b7b0b95ce0c1eb27f69d12b1fe2382b50096c4b59e78cdf71/cpu.cfs_period_us

100000

$ sudo cat /sys/fs/cgroup/cpu,cpuacct/kubepods/burstable/pod2f1b50b6-db13-11e8-b1e1-42010a800070/f0845c65c3073e0b7b0b95ce0c1eb27f69d12b1fe2382b50096c4b59e78cdf71/cpu.cfs_quota_us

10000

These values match the Docker container configuration exactly, as we expected. But how were they derived from the 100m CPU limit we set on the Pod, and how do they enforce it? CPU requests and CPU limits are implemented by two separate cgroup control systems. Requests use the CPU shares system, the earlier of the two. CPU shares divide each core into 1024 slices and guarantee each process a proportional share of those slices: if there are 1024 slices and two processes each set cpu.shares to 512, each gets roughly half of the available CPU time, as the sketch below illustrates. The shares system does not enforce an upper bound, though; if one process doesn't use its share, other processes are free to use it.
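Here is a minimal shell sketch of that proportional weighting (the share values are made up for illustration; this shows how CFS weighs processes that are all busy, it is not code from the original article):

# Illustrative: when every process is busy, each gets shares_i / sum(shares) of the CPU
A_SHARES=512
B_SHARES=512
TOTAL=$(( A_SHARES + B_SHARES ))
echo "A gets $(( 100 * A_SHARES / TOTAL ))% of CPU time, B gets $(( 100 * B_SHARES / TOTAL ))%"
# prints: A gets 50% of CPU time, B gets 50%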

Around 2010, Google and others noticed that this could become a problem, and a second, more capable system was added to the kernel: CPU bandwidth control. The bandwidth control system defines a period, usually 1/10 of a second (100000 microseconds), and a quota specifying the maximum amount of CPU time a process may consume within each period. In this example, the 100m CPU limit on our Pod equates to 100/1000 of a core, or 10000 out of every 100000 microseconds of CPU time. So the limit translates into setting cpu.cfs_period_us=100000 and cpu.cfs_quota_us=10000 in the process's cpu,cpuacct cgroup. cfs stands for Completely Fair Scheduler, the default CPU scheduler in Linux. (There is also a real-time scheduler with its own corresponding quota values.) The sketch below walks through the same arithmetic.
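For reference, here is a minimal shell sketch of that conversion, assuming the default 100000-microsecond period (illustrative arithmetic only, not kubelet source code):

# Illustrative: convert a millicore CPU limit into CFS bandwidth settings
MILLICORES=100          # the Pod's 100m CPU limit
PERIOD_US=100000        # default CFS period
QUOTA_US=$(( MILLICORES * PERIOD_US / 1000 ))
echo "cpu.cfs_period_us=${PERIOD_US} cpu.cfs_quota_us=${QUOTA_US}"
# prints: cpu.cfs_period_us=100000 cpu.cfs_quota_us=10000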
