

How does Kubernetes Autoscaling work?

2025-02-28 Update From: SLTechnology News&Howtos


Shulou(Shulou.com)05/31 Report--

Today I would like to talk about how Kubernetes Autoscaling works. Many people may not know much about it, so I have summarized the following; I hope you get something from this article.

How does Kubernetes Autoscaling work? This is a question we have been asked often recently.

The following explains how Kubernetes Autoscaling works and the advantages it provides when scaling a cluster.

What is Autoscaling?

Imagine filling two buckets from a faucet. We want to make sure that when the first bucket is 80% full, the second bucket begins to fill. The solution is simple: install a pipe connecting the two buckets at the right height. And when we want to hold more water, we just keep adding buckets in the same way.

The same goes for our applications and services: the elastic scaling of cloud computing frees us from manually adjusting physical servers and virtual machines. So compare "buckets of water" with "applications consuming computing resources"--

Buckets-the scaling units-what we scale

The 80% mark-the measurement and trigger for scaling-when we scale

Pipes-the implementation of the scaling operation-how we scale

What do we scale?

In a Kubernetes cluster environment, as users, we generally scale two things:

Pods-for an application, suppose we run X replicas. When requests exceed the processing capacity of those X Pods, we need to scale the application out. For this process to work seamlessly, our Nodes must have sufficient available resources to successfully schedule and run the additional Pods.

Nodes-the total capacity of all Nodes represents our cluster capacity. If workload demand exceeds this capacity, we need to add Nodes to the cluster so that workloads can still be scheduled and run efficiently. As Pods keep scaling out, the available resources on the Nodes may approach exhaustion, and we have to add more Nodes to increase the overall resources available at the cluster level.
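To make the Pod-level unit concrete, here is a minimal sketch of a Deployment running X = 3 replicas (all names and the image are illustrative, not from the article). Scaling Pods means changing the replica count; scaling Nodes means adding machines so that these replicas, whose resource requests drive scheduling, can actually be placed:

```yaml
# Hypothetical Deployment: the "replicas" field is the Pod-level scaling unit.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app            # illustrative name
spec:
  replicas: 3              # X copies; Pod scaling changes this number
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      containers:
      - name: web-app
        image: nginx:1.25  # placeholder image
        resources:
          requests:        # requests determine schedulability, and thus Node capacity needs
            cpu: 100m
            memory: 128Mi
```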

When do we scale?

In general, we continuously measure a metric, and when the metric crosses a threshold, we act by scaling a resource. For example, we might measure the average CPU consumption of the Pods and trigger a scaling operation when it exceeds 80%.
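The 80% CPU trigger can be expressed declaratively. Here is a hedged sketch using the stable autoscaling/v1 API (the HPA and Deployment names are assumptions for illustration):

```yaml
# Hypothetical HPA: scale the target Deployment between 2 and 10 replicas,
# triggering when average Pod CPU utilization crosses 80%.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa        # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app          # illustrative target Deployment
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 80
```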

But one metric does not suit all use cases, and it may vary with the type of application-- for message queues, the number of waiting messages may serve as the metric; for memory-intensive applications, memory consumption may be more appropriate. If we have a business application that can handle about 1000 transactions per second at a given Pod capacity, we might choose that as the metric and scale out when the load exceeds roughly 850 transactions per second.

So far we have only considered scaling out, but when workload usage drops, there should also be a way to scale in gracefully without interrupting requests that are already being processed.

How do we scale?

For Pods, we just change the number of replicas in the Deployment or replication controller; for Nodes, we need a way to call the cloud provider's API, create a new instance, and make it part of the cluster.

Kubernetes Autoscaling

Based on the above understanding, let's look at the concrete implementations and techniques of Kubernetes Autoscaling--

Cluster Autoscaler

Cluster Autoscaler is used to dynamically scale the cluster (Nodes). It continuously monitors Pods; once it finds Pods that cannot be scheduled, it scales out based on the PodCondition. This approach is much more effective than watching the cluster's CPU percentage. Since Node creation takes a minute or more (depending on factors such as the cloud provider), it may take some time before the pending Pods can be scheduled.
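The signal Cluster Autoscaler reacts to looks like this in a pending Pod's status, a PodScheduled condition that is False with reason Unschedulable (an abridged, illustrative status fragment; the message text will vary by cluster):

```yaml
# Abridged status of a Pod that cannot be scheduled; Cluster Autoscaler
# watches for this PodCondition and adds a Node in response.
status:
  phase: Pending
  conditions:
  - type: PodScheduled
    status: "False"
    reason: Unschedulable
    message: '0/3 nodes are available: 3 Insufficient cpu.'  # example message
```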

Within a cluster we may have multiple node pools, for example one node pool for billing applications and another for machine-learning workloads. Cluster Autoscaler provides a variety of flags and methods for tuning Node scaling behavior. See https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/FAQ.md for more details.

For scale-down, Cluster Autoscaler looks at the average utilization of Nodes and weighs other relevant factors, such as Pod Disruption Budgets: if a Pod running on a Node cannot be rescheduled elsewhere, that Node cannot be removed from the cluster. Cluster Autoscaler provides a way to terminate Nodes gracefully and can generally relocate the Pods within 10 minutes.

Horizontal Pod Autoscaler (HPA)

HPA is a control loop that monitors and scales the Pods in a deployment. It is set up by creating an HPA object that references a Deployment or replication controller, in which we define the metric threshold and the upper and lower limits on the deployment's replica count. The first GA version of HPA (autoscaling/v1) supports only CPU as a monitorable metric; the current version (autoscaling/v2beta1) is in beta and adds support for memory and other custom metrics. Once the HPA object is created and can query the Pods' metrics, you can see it report the details:

$ kubectl get hpa
NAME                 REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
helloetst-ownay28d   Deployment/helloetst-ownay28d   8%/60%    1         4         1          23h
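As a sketch of the beta API mentioned above, an autoscaling/v2beta1 HPA can combine a memory target with a custom per-Pod metric such as transactions per second (the object names and the metric name are assumptions for illustration; a custom metric like this would need to be exposed through a metrics adapter):

```yaml
# Hypothetical HPA in the beta API (autoscaling/v2beta1), which adds memory
# and custom metrics alongside CPU.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa                    # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                      # illustrative target
  minReplicas: 1
  maxReplicas: 4
  metrics:
  - type: Resource
    resource:
      name: memory
      targetAverageUtilization: 60     # scale when average memory use exceeds 60%
  - type: Pods
    pods:
      metricName: transactions_per_second  # hypothetical custom metric
      targetAverageValue: "850"
```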

We can adjust the Horizontal Pod Autoscaler by adding flags to the Controller Manager:

--horizontal-pod-autoscaler-sync-period determines how frequently the HPA polls the Pod group's metrics. The default period is 30 seconds.

The default interval between two scale-up operations is 3 minutes, which can be controlled with --horizontal-pod-autoscaler-upscale-delay.

The default interval between two scale-down operations is 5 minutes, which can likewise be controlled with --horizontal-pod-autoscaler-downscale-delay.
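In a kubeadm-style cluster these flags would typically be set in the kube-controller-manager static Pod manifest. A hedged fragment, using the default values stated above (the surrounding manifest fields are omitted):

```yaml
# Hypothetical kube-controller-manager static Pod fragment showing the
# HPA timing flags discussed above, set to their stated defaults.
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    - --horizontal-pod-autoscaler-sync-period=30s
    - --horizontal-pod-autoscaler-upscale-delay=3m
    - --horizontal-pod-autoscaler-downscale-delay=5m
```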

Metrics and Cloud provider

To measure metrics, the server should enable Heapster or API aggregation with Kubernetes Custom Metrics (https://github.com/kubernetes/metrics); from Kubernetes 1.9 onward, the metrics server API is the preferred method. For Node scaling, we must enable and configure the appropriate cloud provider in the cluster. See https://kubernetes.io/docs/concepts/cluster-administration/cloud-providers/ for more details.

Some plug-ins

There are also some great plug-ins, such as--

Vertical pod autoscaler https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler

Addon-resizer https://github.com/kubernetes/autoscaler/tree/master/addon-resizer

All in all, the next time someone asks "how does Kubernetes Autoscaling work?", I hope this passage helps with your explanation.

The concepts Kubernetes introduces are abstract and correspond closely to an idealized distributed scheduling system. However, the large number of difficult technical concepts also forms a steep learning curve, which directly raises the barrier to adopting Kubernetes.

In addition, Kubernetes itself is a container orchestration tool and does not provide management workflows, while Rainbond provides ready-made workflows, including DevOps, automated operations, microservice architecture, and an application market, which can be used out of the box.

After reading the above, do you have a better understanding of how Kubernetes Autoscaling works? Thank you for your support.



