Kubernetes 1.16 autoscaling based on Prometheus custom metrics


HPA principle

Kubernetes autoscaling means that when container resources (CPU, memory, custom metrics, and so on) exceed the configured thresholds under heavy load, the cluster automatically scales out, spreading the load across the new containers to relieve pressure, until the replica ceiling set in the HPA is reached; when the load drops again, it automatically scales back in. Metrics Server continuously collects metric data from all Pod replicas. The HPA controller fetches this data through the Metrics Server API (the Heapster API or the aggregation API), computes the target number of Pod replicas from the user-defined scaling rules, and, whenever the target differs from the current replica count, issues a scale operation to the replica controller (Deployment, RC or ReplicaSet) to adjust the number of replicas and complete the scale-out or scale-in.
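For reference, the core calculation the HPA controller performs (as documented upstream) is:

desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )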

HPA also has scale-up and scale-down cooldown periods to prevent unnecessary scaling caused by short-lived jitter in the metrics.

Scale-up cooldown default: 3 minutes

Scale-down cooldown default: 5 minutes

You can adjust the cooldowns through the startup flags of the kube-controller-manager component:

--horizontal-pod-autoscaler-downscale-delay: scale-down cooldown

--horizontal-pod-autoscaler-upscale-delay: scale-up cooldown
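A minimal sketch of where such flags are set, assuming a kubeadm-style cluster where kube-controller-manager runs as a static Pod; note that recent releases deprecate or remove these particular flags (scale-down behaviour is now controlled by --horizontal-pod-autoscaler-downscale-stabilization), so check which ones your version still accepts:

# /etc/kubernetes/manifests/kube-controller-manager.yaml (excerpt)
spec:
  containers:
  - command:
    - kube-controller-manager
    - --horizontal-pod-autoscaler-upscale-delay=3m0s      # scale-up cooldown
    - --horizontal-pod-autoscaler-downscale-delay=5m0s    # scale-down cooldown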

Early in its development, HPA only supported automatic scaling on CPU and memory, not custom metrics. Today it can interact with monitoring systems such as Prometheus through third-party plug-ins, obtain custom metrics, and convert them into a form the apiserver can understand, so that HPA can scale on them. The figure below shows this evolution.

The development process of HPA

At present, HPA already supports three major versions: autoscaling/v1, autoscaling/v2beta1 and autoscaling/v2beta2.

Most people are familiar with autoscaling/v1, which only supports elastic scaling on CPU.

While autoscaling/v2beta1 added support for custom metrics, autoscaling/v2beta2 added additional support for external metrics.

Behind these changes is the Kubernetes community's evolving understanding of monitoring and metrics: from the early Heapster, to Metrics Server, to the current division of metric categories, the monitoring ecosystem keeps getting richer.

Traditional HPA scaling on CPU and memory is straightforward to set up: the threshold can be configured either in YAML or from the kubectl command line.
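For the command-line route, the same rule as the v1 manifest below can be created with a single kubectl autoscale call:

# kubectl autoscale deployment php-apache --cpu-percent=50 --min=1 --max=10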

V1 version:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50

V2beta2 version:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k
  - type: Object
    object:
      metric:
        name: requests-per-second
      describedObject:
        apiVersion: networking.k8s.io/v1beta1
        kind: Ingress
        name: main-route
      target:
        type: Value
        value: 10k
  - type: External
    external:
      metric:
        name: queue_messages_ready
        selector:
          matchLabels:
            queue: "worker_tasks"
      target:
        type: AverageValue
        averageValue: 30

The type field in metrics can take four values: Resource, Pods, Object, and External.

Resource: CPU and memory metrics of the Pods under the current scaling target. Only Utilization and AverageValue target types are supported.

Object: metrics describing a specified Kubernetes object (for example, an Ingress). The data must be provided by a third-party adapter, and only Value and AverageValue target types are supported.

Pods: metrics describing the Pods of the scaling target. The data must be provided by a third-party adapter, and only the AverageValue target type is allowed.

External: metrics from outside Kubernetes. The data must also be provided by a third-party adapter, and only Value and AverageValue target types are supported.

Custom metric scaling based on Prometheus

1. Deploy Prometheus

Refer to my earlier blog post for the deployment.

2. Deploy Prometheus Adapter

However, the metrics collected by Prometheus cannot be used by Kubernetes directly, because the two data formats are incompatible. Another component, k8s-prometheus-adapter, is needed to convert Prometheus metric data into a format the Kubernetes API can recognize. Since this is a custom API, it must also be registered with the main APIServer through the Kubernetes aggregator, so that it can be accessed directly under /apis/.

Install directly using Helm Charts

# wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
# tar zxvf helm-v3.0.0-linux-amd64.tar.gz
# mv linux-amd64/helm /usr/bin/
# helm repo add stable http://mirror.azure.cn/kubernetes/charts
# helm repo update
# helm repo list
# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus.kube-system,prometheus.port=9090
# helm list -n kube-system

Make sure the adapter is registered with the apiserver:

# kubectl get apiservices | grep custom

# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1"

3. Create metrics deployment

The container needs to expose a port and a metrics path that Prometheus can scrape; the path and the metrics to be monitored have to be agreed with the developers. Here, the metrics the service exposes are placed under the /metrics path.

Note: the image is from a private registry; if you want to reproduce the experiment, replace it with your own.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: metrics-app
  name: metrics-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: metrics-app
  template:
    metadata:
      labels:
        app: metrics-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - image: 172.30.0.109/metrics-app
        name: metrics-app
        ports:
        - name: web
          containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-app
  labels:
    app: metrics-app
spec:
  ports:
  - name: web
    port: 80
    targetPort: 80
  selector:
    app: metrics-app

Accessing the container's /metrics path returns the total number of HTTP requests; this counter is later converted into QPS with a Prometheus query.
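For illustration only (the real application's output will differ), a minimal Prometheus counter exposed at /metrics looks like this; the kubernetes_namespace and kubernetes_pod_name labels used later are typically attached by Prometheus relabeling at scrape time, not by the application:

# HELP http_requests_total The total number of HTTP requests received.
# TYPE http_requests_total counter
http_requests_total 10530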

4. Create an HPA rule

The scaling here is driven by container QPS. The Pods metric type only supports an averageValue target, and the Deployment's maximum scale-out is set to 10 replicas.

HPA yaml:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: metrics-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: metrics-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 800m   # 800m means 0.8 requests per second

After the HPA is configured, it still cannot get http_requests_per_second data: the metric shown by kubectl get hpa is <unknown>, so you need to configure http_requests_per_second in the prometheus-adapter.

The adapter configuration effectively acts as a whitelist controlling which Prometheus metrics are exposed.

Modify prometheus adapter configmap

# kubectl edit cm -n kube-system prometheus-adapter

Add the following rule:

- seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "^(.*)_total"
    as: "${1}_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'

The promSQL here turns http_requests_total into a per-second rate averaged over an interval (the last two minutes) and sums it across all the containers, which yields the QPS.
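As a rough illustration of what the template expands to (the label values here are assumptions for this walkthrough, not output copied from the adapter):

sum(rate(http_requests_total{kubernetes_namespace="default",kubernetes_pod_name=~"metrics-app-.*"}[2m])) by (kubernetes_pod_name)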

Note: the rule renames http_requests_total from /metrics to http_requests_per_second; the HPA then obtains its metric data as http_requests_per_second to drive scaling.

After the prometheus-adapter configuration is modified, you need to restart the container to load the new configuration.
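One simple way to do that, a sketch assuming the chart's default app=prometheus-adapter label, is to delete the Pod and let its Deployment recreate it:

# kubectl delete pod -n kube-system -l app=prometheus-adapter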

At this point, the metric is being picked up:

# kubectl get hpa
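You can also query the custom metrics API directly to confirm the adapter is serving the new metric (the path below assumes the default namespace used in this walkthrough):

# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second"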

5. Test HPA

Carry out pressure test

# yum install -y httpd-tools

Use the ab command to stress-test the metrics-app Service.
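For example (the address is a placeholder; substitute the ClusterIP of the metrics-app Service):

# ab -n 100000 -c 100 http://<metrics-app-cluster-ip>/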

Watch the HPA: the metric value fills in, and the Deployment is scaled out to its 10-replica maximum. Once the stress test ends, it scales back down automatically.
