How to adjust HPA scaling sensitivity for different business scenarios
In this article, the editor walks through how to adjust HPA scale-up and scale-down sensitivity for different business scenarios. The content is analyzed from a practical point of view; I hope you get something out of it.
Background
Before Kubernetes 1.18, the sensitivity of HPA scaling could not be adjusted:
For scale-down, the time window is controlled by the --horizontal-pod-autoscaler-downscale-stabilization flag of kube-controller-manager. The default is 5 minutes, i.e. after the load drops you have to wait at least 5 minutes before scale-down starts (see the snippet after this list).
For scale-up, the speed is determined by a fixed algorithm in the HPA controller with hard-coded constant factors, and cannot be customized.
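For reference, here is a minimal sketch of where that global flag lives on a kubeadm-style cluster, as an extra argument in the kube-controller-manager static Pod manifest. The file path, image tag, and surrounding fields are illustrative assumptions; only the flag itself comes from the text above:

# /etc/kubernetes/manifests/kube-controller-manager.yaml (kubeadm layout, assumed)
apiVersion: v1
kind: Pod
metadata:
  name: kube-controller-manager
  namespace: kube-system
spec:
  containers:
  - name: kube-controller-manager
    image: k8s.gcr.io/kube-controller-manager:v1.17.3 # hypothetical pre-1.18 version
    command:
    - kube-controller-manager
    # The global scale-down window; before 1.18 this was the only knob,
    # and it applied to every HPA in the cluster at once.
    - --horizontal-pod-autoscaler-downscale-stabilization=5m0s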
This design makes it impossible for users to customize HPA scaling sensitivity, yet different business scenarios may have different sensitivity requirements, for example:
Critical services with bursty traffic should scale up quickly when needed (even if the capacity may not actually be required, just in case), but scale down slowly (in case another traffic peak arrives).
Offline services that process large amounts of data should scale up as quickly as possible to shorten processing time, and scale down as soon as the resources are no longer needed to save costs.
Services that handle regular data or network traffic should scale up and down in a more measured way to reduce jitter.
HPA received an update in Kubernetes 1.18: scaling sensitivity controls were added to the existing autoscaling/v2beta2 API, while the version number stayed v2beta2.
How to use
This update adds a behavior field under the HPA spec, with two sub-fields, scaleUp and scaleDown, that control scale-up and scale-down behavior respectively. A skeleton of the field is sketched below, followed by some usage scenarios.
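As a minimal sketch of the shape of the behavior field in autoscaling/v2beta2 (the numbers are placeholders rather than recommendations; the comments describe the fields):

behavior:
  scaleUp:
    stabilizationWindowSeconds: 0   # how long to look back before scaling up (defaults to 0)
    selectPolicy: Max               # which policy wins when several are listed
    policies:
    - type: Percent                 # Percent or Pods
      value: 100                    # how much change this policy allows
      periodSeconds: 15             # per period of this many seconds
  scaleDown:
    stabilizationWindowSeconds: 300 # defaults to the global 5-minute window
    selectPolicy: Max
    policies:
    - type: Pods
      value: 4
      periodSeconds: 60

Each policy caps how many Pods (or what percentage of the current replicas) may be added or removed per periodSeconds.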
Rapid scale-up
When your application needs to scale up rapidly, you can use an HPA configuration similar to the following:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: web
spec:
  minReplicas: 1
  maxReplicas: 1000
  metrics:
  - type: Pods
    pods:
      metric:
        name: k8s_pod_rate_cpu_core_used_limit
      target:
        type: AverageValue
        averageValue: "80"
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  behavior: # here is the focus
    scaleUp:
      policies:
      - type: Percent
        value: 900 # add 900% of the current replicas per period
        periodSeconds: 15
The configuration above means that each scale-up immediately adds 900% of the current replica count, i.e. the workload jumps straight to 10 times the current number of Pods, though never beyond the maxReplicas limit.
If it starts with a single Pod and is hit by a traffic burst, it scales up very aggressively; during scale-up the Pod count changes as follows:
1 -> 10 -> 100 -> 1000
If no scale-down policy is configured, the HPA waits for the global default scale-down window (--horizontal-pod-autoscaler-downscale-stabilization, default 5 minutes) and then starts scaling down.
Rapid scale-up and slow scale-down
After the traffic peak passes and concurrency drops sharply, the default policy scales the Pods down just as sharply within a few minutes. If another traffic peak then arrives, scale-up is fast but still takes some time; if the peak is high enough, back-end capacity can lag behind and some requests may fail. In that case we can add a scale-down policy to the HPA. An example behavior configuration:
behavior:
  scaleUp:
    policies:
    - type: Percent
      value: 900
      periodSeconds: 15
  scaleDown:
    policies:
    - type: Pods
      value: 1
      periodSeconds: 600 # remove only one Pod every 10 minutes
The example above adds a scaleDown section, specifying that only 1 Pod is removed every 10 minutes, which greatly slows down scale-down. The trend of the Pod count during scale-down looks like this:
1000 -> ... (10 minutes later) -> 999
This lets key services retain enough processing capacity while traffic bursts remain possible, avoiding failed requests caused by the next peak.
Slow scale-up
If your application is less critical and you do not want scale-up to be too sensitive, you can make it scale up smoothly and slowly by adding the following behavior to the HPA:
behavior:
  scaleUp:
    policies:
    - type: Pods
      value: 1 # add only one Pod per scale-up
      periodSeconds: 15
Starting from a single Pod, the Pod count during scale-up changes as follows:
1 -> 2 -> 3 -> 4
Disable automatic scale-down
If an application is critical and you want scale-down after a scale-up to happen only through manual intervention (or a separate self-developed controller that decides when to scale down), you can disable automatic scale-down with the following behavior configuration. selectPolicy: Disabled switches scaling off in that direction; a Pods policy with value: 0 is sometimes shown for the same purpose, but API validation requires policy values greater than zero:
behavior:
  scaleDown:
    selectPolicy: Disabled # never remove Pods automatically
Extend the scale-down window
The default scale-down window is 5 minutes (--horizontal-pod-autoscaler-downscale-stabilization). If we need a longer window to ride out short-lived traffic fluctuations, we can specify a scale-down window on the HPA itself. An example behavior configuration:
behavior:
  scaleDown:
    stabilizationWindowSeconds: 600 # wait 10 minutes before starting to scale down
    policies:
    - type: Pods
      value: 5 # remove at most 5 Pods at a time
      periodSeconds: 15
The example above says that when the load drops, the HPA waits 600 seconds (10 minutes) before scaling down, and then removes at most 5 Pods at a time.
Extend the scale-up window
Some applications see short data spikes that trigger frequent scale-up, where the extra Pods are unnecessary and waste resources. For example, in a data-processing pipeline the scaling metric is the number of events in the queue: when a large backlog accumulates we want to scale up quickly, but not too sensitively, because the backlog may build up only briefly and can be worked off quickly even without scaling up.
The default algorithm scales up within a fairly short time. For this scenario we can add a stabilization window to scale-up to avoid the resource waste caused by spikes. An example behavior configuration:
behavior:
  scaleUp:
    stabilizationWindowSeconds: 300 # consider a 5-minute window before scaling up
    policies:
    - type: Pods
      value: 20 # add 20 Pods per scale-up
      periodSeconds: 15
The example above says that scale-up first waits through a 5-minute window: if the load falls back during that time, no scale-up happens; if the load stays above the scale-up threshold, the HPA scales up, adding 20 Pods each time.
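The scaleUp and scaleDown settings compose freely. As a closing sketch, here is the fast scale-up, slow scale-down, and extended-window behavior from the examples above combined into a single block (the numbers are the ones used earlier, not recommendations):

behavior:
  scaleUp:
    stabilizationWindowSeconds: 300 # ignore load spikes shorter than 5 minutes
    policies:
    - type: Percent
      value: 900 # then grow by up to 10x per period
      periodSeconds: 15
  scaleDown:
    stabilizationWindowSeconds: 600 # wait 10 minutes after the load drops
    policies:
    - type: Pods
      value: 1 # then remove only one Pod every 10 minutes
      periodSeconds: 600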
This is what the editor wanted to share about adjusting HPA scaling sensitivity for different business scenarios; if you have run into similar questions, hopefully the analysis above helps you work through them.