How to configure horizontal Pod autoscaling for Kubernetes

2025-01-18 Update From: SLTechnology News&Howtos

This article mainly introduces how to configure horizontal Pod autoscaling for Kubernetes. It has a certain reference value, and interested readers can refer to it. I hope you learn a lot from reading it.

Introduction

Kubernetes has a powerful feature that lets you configure automatic scaling for running services. Without autoscaling, it is difficult to adapt to the growth of a deployment and to meet SLAs. This feature is called the Horizontal Pod Autoscaler (HPA).

Why use HPA

With HPA, you can automatically scale your deployments up and down according to resource usage or custom metrics, so that the deployment's scale tracks the actual service load.

HPA can bring two direct help to your service:

Provide computing and memory resources when they are needed, and release them when they are not needed

Increase / decrease performance as needed to achieve SLA

How HPA works

HPA automatically adjusts the number of pods of a replication controller, deployment, or replica set (within defined minimum and maximum pod counts) based on observed CPU/memory utilization (resource metrics) or on custom metrics provided by third-party applications such as Prometheus or Datadog. HPA is implemented as a control loop whose period is set by the kube-controller-manager flag horizontal-pod-autoscaler-sync-period (30s by default).
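The control loop described above boils down to one formula documented for the HPA algorithm: desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue), clamped to the configured minimum and maximum. A minimal sketch in Python:

```python
import math

def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    # Scale proportionally to how far the observed metric is from its
    # target, then clamp to the HPA's configured bounds.
    wanted = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, wanted))

# 2 pods at 90% average CPU against a 50% target -> scale up to 4.
print(desired_replicas(2, 90, 50))
```

The real controller also applies tolerances and the upscale/downscale delays discussed later, so it does not react to every small fluctuation; this sketch shows only the proportional core.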

HPA definition

HPA is an API resource in the Kubernetes autoscaling API group. The current stable version, autoscaling/v1, only supports scaling on CPU. For additional support for memory and custom metrics, use the beta version autoscaling/v2beta1.

You can find more information about the HPA API object in the design proposal: https://git.k8s.io/community/contributors/design-proposals/autoscaling/horizontal-pod-autoscaler.md#horizontalpodautoscaler-object

In general, HPA is supported by kubectl. You can create, manage, and delete HPAs with kubectl:

Create a HPA:

With a manifest: kubectl create -f <manifest>

Without a manifest (only CPU is supported): kubectl autoscale deployment hello-world --min=2 --max=5 --cpu-percent=50

Get hpa information:

Basic information: kubectl get hpa hello-world

Details: kubectl describe hpa hello-world

Delete hpa:

kubectl delete hpa hello-world

Here is an example of a HPA manifest definition:

It uses the autoscaling/v2beta1 version, with cpu and memory metrics

It controls the automatic scaling of the hello-world deployment

It defines a minimum of 1 replica

It defines a maximum of 10 replicas

It resizes when either condition is met:

CPU usage exceeds 50%

Memory usage exceeds 100Mi
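Putting those bullet points together, the manifest looks roughly like the following (reconstructed from the description above; hello-world is the example deployment used throughout this article):

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 100Mi
```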

Installation

Before HPA can be used on a Kubernetes cluster, there are some elements that need to be installed and configured on the system.

Requirements

Check that the Kubernetes cluster services are running and include at least these flags:

kube-api: requestheader-client-ca-file

kubelet: read-only-port enabled on port 10255

kube-controller: optional, only needed if the values should differ from the defaults

horizontal-pod-autoscaler-downscale-delay: "5m0s"

horizontal-pod-autoscaler-upscale-delay: "3m0s"

horizontal-pod-autoscaler-sync-period: "30s"

For an RKE Kubernetes cluster definition, make sure you have added these lines in the services section. To do this in the Rancher v2.0.x UI, open "Cluster options" - "Edit as YAML" and add the following definition:
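A sketch of what that services section could look like in an RKE cluster.yml, using the flags listed above; the client CA file path depends on your installation and is an assumption here:

```yaml
services:
  kube-api:
    extra_args:
      requestheader-client-ca-file: "/etc/kubernetes/ssl/kube-ca.pem"
  kubelet:
    extra_args:
      read-only-port: "10255"
  kube-controller:
    extra_args:
      horizontal-pod-autoscaler-downscale-delay: "5m0s"
      horizontal-pod-autoscaler-upscale-delay: "3m0s"
      horizontal-pod-autoscaler-sync-period: "30s"
```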

To deploy metrics services, your Kubernetes cluster must be configured and deployed correctly.

Note: this article uses Rancher v2.0.6 and a k8s v1.10.1 cluster for deploying and testing the examples

Resource metrics

If HPA is to use resource metrics, the metrics-server package needs to be installed in the kube-system namespace of the Kubernetes cluster.

Follow these steps:

Configure kubectl to connect to the correct Kubernetes cluster

Clone the metrics-server Github repository: git clone https://github.com/kubernetes-incubator/metrics-server

Install the metrics-server package (assuming the Kubernetes cluster is at least version 1.8): kubectl create -f metrics-server/deploy/1.8+/

Check that metrics-server is working properly. The service, pod, and logs can be inspected in the kube-system namespace

Check whether the metrics API is accessible from kubectl: if you access the Kubernetes cluster directly, use the server URL from your kubectl config, such as https://<K8S_URL>:6443

If you access the Kubernetes cluster through Rancher, the server URL in your kubectl config should look like https://<RANCHER_URL>/k8s/clusters/<CLUSTER_ID>, that is, you append /k8s/clusters/<CLUSTER_ID> to the original API path
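One way to sanity-check the metrics API from kubectl is to query it raw; these are the standard metrics.k8s.io endpoints served by metrics-server:

```shell
# List node and pod resource metrics through the apiserver.
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/pods"
```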

Custom Metrics (Prometheus)

Custom metrics can be provided as a resource by many third-party applications. We will use Prometheus in the demo. Assuming Prometheus is already deployed in your Kubernetes cluster and can collect the correct metrics from pods, nodes, namespaces, and so on, we will use the Prometheus URL http://prometheus.mycompany.io, exposed on port 80.

Prometheus is available in the Rancher v2.0 catalog. If it is not already running on the Kubernetes cluster, deploy it from the Rancher catalog.

If HPA is to use custom metrics from Prometheus, the k8s-prometheus-adapter needs to be installed in the kube-system namespace of the Kubernetes cluster. To simplify the installation of k8s-prometheus-adapter, we will use the Helm chart provided by banzai-charts.

You can use this chart by following these steps:

Initialize Helm on the k8s cluster

Clone the banzai-charts Github repository:

Install prometheus-adapter, specifying the Prometheus URL and port

Check that prometheus-adapter is working properly. The service, pod, and logs can be inspected in the kube-system namespace

Check whether the metrics API is accessible from kubectl: if you access the Kubernetes cluster directly, the server URL in your kubectl config looks like https://<K8S_URL>:6443

If you access it through Rancher, the server URL in your kubectl config is https://<RANCHER_URL>/k8s/clusters/<CLUSTER_ID>, that is, suffixed with /k8s/clusters/<CLUSTER_ID>
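The steps above might look like the following; the chart location and value names (prometheus.url, prometheus.port) are taken from the banzai-charts repository at the time of writing and may have changed:

```shell
helm init --wait                    # initialize Helm (Tiller) on the cluster
git clone https://github.com/banzaicloud/banzai-charts
helm install --name prometheus-adapter \
  --namespace kube-system \
  --set prometheus.url="http://prometheus.mycompany.io" \
  --set prometheus.port="80" \
  banzai-charts/prometheus-adapter
```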

ClusterRole and ClusterRoleBinding

By default, HPA attempts to read metrics (resource and custom) as the user system:anonymous. You need to define view-resource-metrics and view-custom-metrics ClusterRoles and ClusterRoleBindings, and assign them to system:anonymous to grant read access to the metrics.

To achieve this, you need the following steps:

Configure kubectl to connect to the K8s cluster correctly

Copy the ClusterRole and ClusterRoleBinding files:

Create them on the Kubernetes cluster (if you want to use custom metrics):
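A sketch of what the resource-metrics pair of objects could look like; the rule contents are assumptions based on the metrics.k8s.io API group (view-custom-metrics would be analogous, against custom.metrics.k8s.io):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: view-resource-metrics
rules:
- apiGroups: ["metrics.k8s.io"]
  resources: ["pods", "nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: view-resource-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: view-resource-metrics
subjects:
- kind: User
  name: system:anonymous
```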

Service deployment

For HPA to work properly, the service deployment must define resource requests for its containers.
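For example, a minimal sketch of a deployment whose container declares the resource requests that HPA's utilization percentages are calculated against (image name and request values are placeholders):

```yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: hello-world
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: hello-world
    spec:
      containers:
      - name: hello-world
        image: rancher/hello-world   # placeholder image
        resources:
          requests:
            cpu: 500m
            memory: 64Mi
```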

Let's use an example of hello-world to test whether HPA is working properly.

We follow these steps: first, correctly configure kubectl to connect to the k8s cluster; second, copy the hello-world deployment file.

Deploy it on the k8s cluster

Copy the HPA for resource or custom metrics

Resource metrics:

```yaml
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: hello-world
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: hello-world
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 50
  - type: Resource
    resource:
      name: memory
      targetAverageValue: 1000Mi
```

Custom metrics (the same as resource metrics, but with a custom cpu_system metric added)

Get the HPA information and description, and check that the resource metrics have been displayed:

Resource metrics:

Custom metrics:

Generate load on the service and test the autoscaling. A variety of tools can generate load; here we use https://github.com/rakyll/hey to make HTTP requests to the hello-world service and see whether autoscaling works properly
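For example, a load run with hey might look like this (the service URL is a placeholder; -n is the total number of requests and -c the concurrency):

```shell
hey -n 10000 -c 50 http://<hello-world-service-url>/
```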

Observe the automatic scale-up and scale-down

Resource metrics:

When cpu utilization reaches the target ratio, it automatically scales up to 2 pods

When cpu usage stays above the target value for longer than horizontal-pod-autoscaler-upscale-delay (3 minutes by default), it scales up to 3 pods:

When cpu usage stays below the target value for longer than horizontal-pod-autoscaler-downscale-delay (5 minutes by default), it scales down to 1 pod:

Custom metrics:

Scales up to 2 pods when cpu utilization reaches the target:

When cpu_system utilization reaches the target value, scales up to 3 pods:

When cpu utilization stays above the target value for longer than horizontal-pod-autoscaler-upscale-delay (3 minutes by default), scales up to 4 pods:

When all metrics stay below their targets for longer than horizontal-pod-autoscaler-downscale-delay (5 minutes by default), it automatically scales down to 1 pod:

Thank you for reading this article carefully. I hope the article "How to configure horizontal Pod autoscaling for Kubernetes" shared by the editor is helpful to everyone. More related knowledge is waiting for you to learn!
