How to deploy HPA for monitoring custom metrics

This article introduces how to deploy an HPA that scales on custom metrics. It covers what an HPA is, how its scaling algorithm works, and how to wire up the k8s-prometheus-adapter so that custom metric data reaches the HPA. I hope you find it useful!

Introduction to HPA

HPA (Horizontal Pod Autoscaler) is a kind of Kubernetes (hereinafter K8s) resource object that dynamically scales the number of pods in scalable objects such as StatefulSet, ReplicationController, and ReplicaSet according to certain metrics, giving the services running on them a degree of ability to adapt to changes in those metrics.

HPA currently supports four metric types: Resource, Object, External, and Pods. The stable autoscaling/v1 version only supports dynamic scaling on CPU metrics; dynamic scaling on memory and custom metrics is supported in the beta autoscaling/v2beta2 version, and appears as annotations when the object is viewed through the autoscaling/v1 API.

The structure of HPA in K8s

First, let's look at the structure of an HPA in K8s. Below is an official HPA example from K8s, with comments added on the key fields for easier understanding.

metrics:
# Pods type metric
- type: Pods
  pods:
    metric:
      name: packets-per-second
    # only target values of the AverageValue type are supported under the Pods metric type
    target:
      type: AverageValue
      averageValue: 1k
# External type metric
- type: External
  external:
    metric:
      name: queue_messages_ready
      # this field associates the labels of the third-party metric (the official document has a problem here; the correct way to write it is as follows)
      selector:
        matchLabels:
          env: "stage"
          app: "myapp"
    # the External metric type only supports target values of the Value and AverageValue types
    target:
      type: AverageValue
      averageValue: 30

The autoscaling/v1 version puts the metrics field in annotation for processing.

There are three types of target: Utilization, Value, and AverageValue. Utilization stands for average utilization, Value for a raw value, and AverageValue for a per-pod average value.

The type field in metrics has four types of values: Object, Pods, Resource, and External.

Resource refers to the CPU and memory metrics of the pods under the current scaling object. Only target values of the Utilization and AverageValue types are supported.

Object refers to the metric of a specified object inside K8s. The data needs to be provided by a third-party adapter, and only target values of the Value and AverageValue types are supported.

Pods refers to the metrics of the pods under the scaling object (StatefulSet, ReplicationController, ReplicaSet). The data needs to be provided by a third-party adapter, and only target values of the AverageValue type are allowed.

External refers to metrics outside K8s. The data also needs to be provided by a third-party adapter, and only target values of the Value and AverageValue types are supported.
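
The example above shows the Pods and External entries. For completeness, here is a minimal sketch of what Resource and Object entries look like, in the spirit of the same official walkthrough (the object names and target values here are illustrative, not from this article):

metrics:
# Resource type metric: CPU utilization of the pods under the scaling object
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 60
# Object type metric: a metric of one specific K8s object, provided by a third-party adapter
- type: Object
  object:
    metric:
      name: requests-per-second
    describedObject:
      apiVersion: networking.k8s.io/v1beta1
      kind: Ingress
      name: main-route      # illustrative object name
    target:
      type: Value
      value: 2k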

The principle of HPA dynamic scaling

HPA is also driven by a controller in K8s. The controller loops over all HPAs at a fixed interval (15s by default) and checks whether the metrics monitored by each HPA have triggered a scaling condition. Once a condition is triggered, the controller sends a request to K8s to modify the field controlling the pod count in the scale subresource of the scaling object (StatefulSet, ReplicationController, ReplicaSet). K8s responds to the request, modifies the scale structure, and refreshes the scaling object's pod count. After being modified, the scaling object naturally adds or removes pods through the list/watch mechanism, achieving dynamic scaling.

Description of HPA scaling process

The main process of HPA scaling is as follows:

1. Determine whether the current pod count is within the range set by the HPA. If it is not, return the minimum when it is too small or the maximum when it is too large, and end this scaling round.

2. Determine the metric type and send the corresponding request to the API server to obtain the configured monitoring metrics. Generally, metrics are obtained from three aggregated APIs, depending on what was configured: metrics.k8s.io, custom.metrics.k8s.io, and external.metrics.k8s.io. metrics.k8s.io is usually served by the metrics-server shipped with K8s and mainly provides CPU and memory utilization, while the other two need a third-party adapter. custom.metrics.k8s.io provides custom metric data that is generally related to the K8s cluster, such as a specific pod; external.metrics.k8s.io also provides custom metric data, but generally unrelated to the K8s cluster. Many well-known third-party monitoring platforms (such as Prometheus) provide adapters that implement these APIs. The monitoring system and its adapter can be deployed together in the K8s cluster, and can even replace the original metrics-server to serve all three metric APIs, allowing deep customization of monitoring data.

3. From the obtained metric, compute a scaling ratio using the algorithm corresponding to its value type, and multiply the current pod count by that ratio to get the desired pod count. The ratio is the metric's current value divided by its target value: greater than 1 scales up, less than 1 scales down. There are three value types: average value (AverageValue), average utilization (Utilization), and raw value (Value), and each type has a corresponding algorithm. Two points are worth noting: fractional results are rounded up to the next whole number, and if the ratio does not exceed a certain tolerance around 1 (0.1 by default), HPA considers the change too small and ignores it.
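
Per the Kubernetes documentation, the core of the calculation is:

desiredReplicas = ceil( currentReplicas × currentMetricValue / desiredMetricValue )

For example, 2 pods with a current per-pod average of 200m against a target of 100m give a ratio of 2.0, so desiredReplicas = ceil(2 × 2.0) = 4. A ratio of 1.05, by contrast, falls inside the default 0.1 tolerance band, and the change would be ignored.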

The HPA scaling algorithm is a very conservative one. If a metric is unavailable for some pods, those pods are assumed to be at the minimum when scaling up and at the maximum when scaling down. Pods that are not yet ready are excluded from the denominator when the average is computed.

One HPA supports monitoring multiple metrics. HPA loops through all of them, computes the desired pod count for each, and takes the maximum of the results as the final scaled pod count. K8s also allows one scaling object to correspond to multiple HPAs; it merely won't report an error. In reality the HPAs do not know they are monitoring the same scaling object, and its pods will be pointlessly modified back and forth by the competing HPAs, increasing system load. If you want multiple monitoring metrics, add them all to a single HPA as mentioned above; see the sketch below.
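
A minimal sketch of a single HPA carrying two metrics (all names and target values are illustrative); whichever metric yields the larger desired pod count wins:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: multi-metric-hpa    # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app            # illustrative target
  minReplicas: 1
  maxReplicas: 10
  metrics:
  # a desired pod count is computed per metric; the maximum is applied
  - type: Resource
    resource:
      name: memory
      target:
        type: AverageValue
        averageValue: 500Mi
  - type: Pods
    pods:
      metric:
        name: packets-per-second
      target:
        type: AverageValue
        averageValue: 1k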

4. Check whether the final pod count is within the range set by the HPA; it is clamped to the maximum if too high or the minimum if too low. Then send a request to K8s to modify the pod count of the scaling object's scale subresource, ending the check of this HPA; the controller then picks up the next HPA, completing one scaling pass.

Application scenarios of HPA

Combined with third-party monitoring, HPA gives the services deployed on scalable objects (StatefulSet, ReplicationController, ReplicaSet) very flexible adaptive capacity. Within set limits, it can create extra replicas to cope with a sharp surge in some metric, or delete replicas when the metric is low to release compute resources to applications that need them more, keeping the whole system stable. This makes it well suited to business scenarios with large traffic fluctuations, tight machine resources, and a large number of services, such as e-commerce, ticket-grabbing, and financial services.

K8s-prometheus-adapter

As mentioned earlier, many monitoring systems provide metric data to HPA by implementing the adapter interfaces. Here we introduce the adapter for the Prometheus monitoring system in detail.

Prometheus is a well-known open source monitoring system featuring multidimensional data, efficient storage, and ease of use. Users can derive the monitoring data they need through its rich expressions and built-in functions.

prometheus-adapter plays the role of adapter between Prometheus and the API server. It accepts the metric query requests sent by HPA and forwarded through the apiserver aggregator, translates them into the corresponding requests to Prometheus, and returns the processed metric data to HPA. prometheus-adapter can implement the metrics.k8s.io, custom.metrics.k8s.io, and external.metrics.k8s.io APIs all at once, replacing the metrics-server of K8s as the metric data service.
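
The aggregator learns where to forward those requests through an APIService object. As a minimal sketch of the registration for the custom-metrics API (the service name and namespace are assumptions, not taken from this walkthrough):

apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  # requests to /apis/custom.metrics.k8s.io/v1beta1 are forwarded to this service
  service:
    name: prometheus-adapter   # assumed service name
    namespace: monitoring      # assumed namespace
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100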

The key to a successful prometheus-adapter deployment is a correct configuration file, which defines which metrics are needed and how they are processed. Below is a simple configuration file, with comments as a brief explanation.

# metric rules; multiple rules can coexist, and the result of the previous rule is passed to the next
rules:
# expression for computing the metric value
- metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>)
  # metric renaming, with regular expression support; here the "_seconds_total" suffix is stripped from the metric name
  name:
    as: ""
    matches: (.*)_seconds_total$
  # associate the metric with K8s resources via its labels; here the namespace and pod labels are mapped to the K8s namespace and pod
  resources:
    overrides:
      namespace:
        resource: namespace
      pod:
        resource: pod
  # conditions for filtering metrics
  seriesFilters: []
  # series query expression; specific series can be filtered by conditions such as labels
  seriesQuery: '{namespace!="", pod!=""}'

The metricsQuery field is a Go template that is executed when K8s requests the metric: <<.Series>> represents the metric name, <<.LabelMatchers>> the label key-value pairs that match the names of K8s objects, and <<.GroupBy>> the label name by which the metric values are aggregated.

This is not easy to grasp in the abstract, so here is an example from the official documentation; after reading it, it should be clear:

For instance, suppose we had a series http_requests_total (exposed as http_requests_per_second in the API) with labels service, pod, ingress, namespace, and verb. The first four correspond to Kubernetes resources. Then, if someone requested the metric pods/http_requests_per_second for the pods pod1 and pod2 in the somens namespace, we'd have:

- Series: "http_requests_total"
- LabelMatchers: pod=~"pod1|pod2",namespace="somens"
- GroupBy: pod

The resources field is the important field that associates a metric with a K8s resource object; in essence, a label value of the metric is matched against the name of a K8s resource object, and the resources field tells the system which label of the metric should be used for that match. There are two ways to make the association. One is overrides, which binds a specific label to a K8s resource type; when K8s requests the metric, the resource object's name is compared with that label's value to decide which object the metric belongs to. The other is template, which uses Go template syntax to convert a K8s resource type name into a label name and matches on that.

The second method is less obvious, so here again is an example excerpted from the official documentation:

resources:
  # any label `kube_<group>_<resource>` becomes <group>.<resource> in Kubernetes
  template: "kube_<<.Group>>_<<.Resource>>"

Deploy an HPA that monitors custom metrics

1. Deploy the Prometheus application and make sure it works properly. You can use the official helm chart for quick deployment; the subsequent applications are basically all deployed as helm charts, so the details are not covered here.

2. Deploy the application that needs to scale. Here I chose a simple nginx service, deployed as a Deployment, as the scaling object; a sketch follows.
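
A minimal sketch of such a Deployment, assuming the my-nginx name and namespace used later in the walkthrough (the image tag is illustrative):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  namespace: my-nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-nginx
  template:
    metadata:
      labels:
        app: my-nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.17   # illustrative tag
        ports:
        - containerPort: 80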

3. Deploy the application that provides the custom metric under the same namespace as the application. Here I chose the official prometheus-node-exporter, with its data port exposed via NodePort, as the source of custom metric data. It runs as a DaemonSet on every node of the K8s cluster and exposes the metrics gathered on its own node.
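
A sketch of a NodePort Service for the exporter (the labels and node port are assumptions; the exporter's helm chart normally creates its own Service):

apiVersion: v1
kind: Service
metadata:
  name: prometheus-node-exporter   # assumed name
  namespace: my-nginx
spec:
  type: NodePort
  selector:
    app: prometheus-node-exporter  # assumed pod label
  ports:
  - name: metrics
    port: 9100        # node-exporter's default port
    targetPort: 9100
    nodePort: 30910   # illustrative node port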

The metrics exposed by node-exporter can now be seen in the Prometheus UI.

4. Deploy the prometheus-adapter application, modifying the configuration file in the helm chart's values as follows:

rules:
# rule for metrics ending in _seconds_total: take a 5-minute rate and strip the suffix
- metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>)
  name:
    as: ""
    matches: (.*)_seconds_total$
  resources:
    template: <<.Resource>>
  seriesFilters: []
  seriesQuery: '{namespace!="", pod!=""}'
# rule for the remaining metrics ending in _total: take a 5-minute rate and strip the suffix
- metricsQuery: sum(rate(<<.Series>>{<<.LabelMatchers>>}[5m])) by (<<.GroupBy>>)
  name:
    as: ""
    matches: (.*)_total$
  resources:
    template: <<.Resource>>
  seriesFilters:
  - isNot: (.*)_seconds_total$
  seriesQuery: '{namespace!="", pod!=""}'
# rule for all other metrics: plain sum, name unchanged
- metricsQuery: sum(<<.Series>>{<<.LabelMatchers>>}) by (<<.GroupBy>>)
  name:
    as: ""
    matches: (.*)$
  resources:
    template: <<.Resource>>
  seriesFilters:
  - isNot: (.*)_total$
  seriesQuery: '{namespace!="", pod!=""}'

The configuration above uses Prometheus's built-in rate function to compute the per-second rate of metrics ending in _seconds_total and _total, stripping the corresponding suffix from the resulting metric name, and sums all remaining metrics as-is. For example, under the first rule a series named node_cpu_seconds_total is exposed as node_cpu, computed roughly as sum(rate(node_cpu_seconds_total{...}[5m])) by (pod).

Use kubectl in K8s to check whether the metrics can be obtained:

kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1/

You will see all the metric names that can be obtained, though the names and data differ from the original metrics because of the changes made in the adapter configuration file.

Next, let's look at the node_cpu metric and check whether it is displayed correctly:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/my-nginx/pods/*/node_cpu" | jq

This command displays the node_cpu metric data of all pods under the my-nginx namespace. The result is as follows:

{"kind": "MetricValueList", "apiVersion": "custom.metrics.k8s.io/v1beta1", "metadata": {"selfLink": "/ apis/custom.metrics.k8s.io/v1beta1/namespaces/my-nginx/pods/%2A/node_cpu"}, "items": [{"describedObject": {"kind": "Pod", "namespace": "my-nginx" "name": "prometheus-node-exporter-b25zl", "apiVersion": "/ v1"}, "metricName": "node_cpu", "timestamp": "2019-10-29T03:33:47Z", "value": "3822m"}]}

Ok, at this point all the components are working properly and the HPA can obtain this metric. Note that the HPA and the monitored object must be deployed under the same namespace; otherwise the corresponding metric cannot be obtained.

5. Deploy the HPA with the following yaml file:

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-asdfvs
  namespace: my-nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: my-nginx
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metric:
        name: node_cpu
      describedObject:
        apiVersion: v1
        kind: Pod
        name: prometheus-node-exporter-b25zl
      target:
        type: Value
        value: 9805m

The nginx application will now be dynamically scaled according to the node_cpu metric of the prometheus-node-exporter pod.

Let's get this HPA:

kubectl get horizontalPodAutoscaler -n my-nginx hpa-asdfvs -o yaml

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"True","lastTransitionTime":"2019-10-29T02:54:50Z","reason":"ReadyForNewScale","message":"recommended size matches current size"},{"type":"ScalingActive","status":"True","lastTransitionTime":"2019-10-29T03:05:24Z","reason":"ValidMetricFound","message":"the HPA was able to successfully calculate a replica count from Pod metric node_cpu"},{"type":"ScalingLimited","status":"False","lastTransitionTime":"2019-10-29T02:54:50Z","reason":"DesiredWithinRange","message":"the desired count is within the acceptable range"}]'
    autoscaling.alpha.kubernetes.io/current-metrics: '[{"type":"Object","object":{"target":{"kind":"Pod","name":"prometheus-node-exporter-b25zl","apiVersion":"v1"},"metricName":"node_cpu","currentValue":"3822m"}}]'
    autoscaling.alpha.kubernetes.io/metrics: '[{"type":"Object","object":{"target":{"kind":"Pod","name":"prometheus-node-exporter-b25zl","apiVersion":"v1"},"metricName":"node_cpu","targetValue":"9805m"}}]'
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-asdfvs","namespace":"my-nginx"},"spec":{"maxReplicas":10,"metrics":[{"object":{"describedObject":{"apiVersion":"v1","kind":"Pod","name":"prometheus-node-exporter-b25zl"},"metric":{"name":"node_cpu"},"target":{"type":"Value","value":"9805m"}},"type":"Object"}],"minReplicas":1,"scaleTargetRef":{"apiVersion":"apps/v1beta1","kind":"Deployment","name":"my-nginx"}}}
  creationTimestamp: "2019-10-29T02:54:45Z"
  name: hpa-asdfvs
  namespace: my-nginx
  resourceVersion: "164701"
  selfLink: /apis/autoscaling/v1/namespaces/my-nginx/horizontalpodautoscalers/hpa-asdfvs
  uid: 76fa6a19-f9f7-11e9-8930-0242c5ccd054
spec:
  maxReplicas: 10
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1beta1
    kind: Deployment
    name: my-nginx
status:
  currentReplicas: 1
  desiredReplicas: 1
  lastScaleTime: "2019-10-29T03:06:10Z"

You can see that when viewed through the v1 API, the HPA carries its metrics field as annotations. In the conditions annotation we can clearly see that the HPA has successfully obtained the metric. You can now try lowering the target value to make the pods scale out, or raising it to scale them in.

This is the end of "how to deploy HPA for monitoring custom metrics". Thanks for reading!
