How can custom metrics be used for Kubernetes (K8s) autoscaling? This article analyses the problem and walks through a concrete solution, in the hope that it helps readers looking for a simple, workable approach.
Kubernetes autoscaling can automatically add or remove replicas of a service in response to business traffic, a capability that matters a great deal in real-world scenarios. In this article we will see how Kubernetes can scale automatically on custom metrics generated by the application.
Why do you need custom metrics?
The CPU or RAM consumption of an application does not necessarily indicate whether it needs to scale. For example, suppose you have a message queue consumer that can handle 500 messages per second without crashing. Once a single instance of the consumer is processing close to 500 messages per second, you probably want to scale the application to two instances so the load is spread across them. Measuring CPU or RAM is a poor fit for scaling an application like this; you need a metric that is more closely tied to the nature of the application. The number of messages an instance is processing at a given point in time reflects the actual load far better. Likewise, other applications may have other metrics that are more meaningful, and these can be defined as custom metrics in Kubernetes.
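As a hedged illustration of what this could look like once a custom metrics pipeline is in place (such as the one built later in this article), here is a sketch of an autoscaling/v2 HPA driven by a per-pod metric; the names queue-consumer, queue-consumer-hpa and messages_processed_per_second are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: queue-consumer-hpa                    # hypothetical name for the example consumer
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: queue-consumer                      # hypothetical Deployment being scaled
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: messages_processed_per_second   # hypothetical custom metric
      target:
        type: AverageValue
        averageValue: "500"                   # add replicas once a pod approaches 500 msg/s
```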
Metrics pipeline
Metrics Server and API
Initially, these metrics were exposed to users through Heapster, which queried them from each kubelet. The kubelet in turn talks to cAdvisor on localhost and retrieves node-level and pod-level metrics. metrics-server was introduced to replace Heapster and to expose these metrics through the Kubernetes API. metrics-server provides only the core metrics, such as memory and CPU of pods and nodes; for any other metric you need to build a full metrics pipeline. The mechanisms for building the pipeline and for Kubernetes autoscaling remain the same.
Aggregation Layer
One of the key pieces that lets metrics be exposed through the Kubernetes API layer is the aggregation layer. It allows additional, Kubernetes-style APIs to be installed in the cluster. This makes the API available like any other Kubernetes resource, while the actual serving of the API is done by an external component, typically a pod deployed in the cluster itself (if this has not already been done at cluster level, you need to enable the aggregation layer first). So how does this work? As a user, you provide an API provider (for example a pod running the API service) and then register that API with an APIService object.
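If the aggregation layer has not been enabled at cluster level, the kube-apiserver typically needs flags along the following lines; this is only a sketch, and the certificate paths are placeholders that depend on how the cluster was bootstrapped:

```
--requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt
--requestheader-allowed-names=front-proxy-client
--requestheader-extra-headers-prefix=X-Remote-Extra-
--requestheader-group-headers=X-Remote-Group
--requestheader-username-headers=X-Remote-User
--proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt
--proxy-client-key-file=/etc/kubernetes/pki/front-proxy-client.key
```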
Let's take the core metrics pipeline as an example to see how metrics-server registers itself with the API aggregation layer. Its APIService object looks like this:
```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.metrics.k8s.io
spec:
  service:
    name: metrics-server
    namespace: kube-system
  group: metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
```
After deploying metrics-server, which registers itself through this APIService, we can see the metrics API being served by the Kubernetes API.
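For example, you might verify this with commands like the following (the exact output depends on your cluster):

```
$ kubectl get apiservices | grep metrics.k8s.io
$ kubectl top nodes
$ kubectl top pods
```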
Metrics pipeline: core and complete pipeline
Now that we know the basic components, let's put them together into the core metrics pipeline. If metrics-server is installed properly, it also creates an APIService to register itself with the Kubernetes API server. As we saw in the previous section, these metrics are exposed under /apis/metrics.k8s.io and consumed by the HPA.
Most complex applications need more than memory and CPU metrics, which is why most enterprises use monitoring tools such as Prometheus, Datadog, or Sysdig. Different tools use different formats, so before we can expose an endpoint through Kubernetes API aggregation, the metrics have to be converted into the appropriate format. For this you need a small adapter, either shipped as part of the monitoring tool or as a separate component, that bridges the gap between the monitoring tool and the Kubernetes API. For example, Prometheus has the dedicated Prometheus adapter and Datadog has the Datadog Cluster Agent; they sit between the monitoring tool and the API and translate from one format to the other, as shown in the figure below. The metrics are then served on slightly different endpoints.
Demo: Kubernetes autoscaling
We will demonstrate how to use custom metrics to automatically scale applications with Prometheus and Prometheus adapter.
Set up Prometheus
To make metrics available to the adapter, we will install Prometheus using the Prometheus Operator. It creates CRDs for deploying the Prometheus components in the cluster; CRDs are a way of extending the Kubernetes API with new resource types. With the Operator you can easily configure and maintain Prometheus instances "the Kubernetes way", by defining objects in YAML files. The CRDs created by the Prometheus Operator include the following (a minimal Prometheus object is sketched after the list):
Alertmanager
ServiceMonitor
Prometheus
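A minimal sketch of what such a Prometheus object could look like; the names, labels and resource requests here are assumptions, and the linked instructions below contain the actual manifests:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: prometheus
  namespace: default
spec:
  serviceAccountName: prometheus         # assumed ServiceAccount with scrape permissions
  serviceMonitorSelector:
    matchLabels:
      app: mockmetrics-app               # pick up ServiceMonitors carrying this label
  resources:
    requests:
      memory: 400Mi
```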
You can set up Prometheus according to the instructions in the link below:
https://github.com/infracloudio/kubernetes-autoscaling#installing-prometheus-operator-and-prometheus
Deploy the demo application
To generate metrics, we will deploy a simple application, mockmetrics, which exposes a total_hit_count value at /metrics. It is a web server written in Go; every time its URL is accessed, the value of total_hit_count keeps increasing. The metrics are exposed in the text exposition format that Prometheus expects.
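The actual mockmetrics source lives in the linked repository; the following is only a minimal sketch of what such a server could look like with the Prometheus Go client, and the port, handler paths and increment behaviour are assumptions:

```go
package main

import (
	"fmt"
	"log"
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// totalHitCount is the counter the HPA will eventually scale on.
var totalHitCount = promauto.NewCounter(prometheus.CounterOpts{
	Name: "total_hit_count",
	Help: "Total number of hits received by the mock application.",
})

func main() {
	// Every hit on the root URL increments the counter.
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		totalHitCount.Inc()
		fmt.Fprintln(w, "hit recorded")
	})
	// /metrics serves the counter in the Prometheus text exposition format.
	http.Handle("/metrics", promhttp.Handler())
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```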
Follow the link below to create the Deployment and Service for this application; it also creates a ServiceMonitor and an HPA for the application (a sketch of the HPA follows the link):
https://github.com/infracloudio/kubernetes-autoscaling#deploying-the-mockmetrics-application
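A hedged sketch of what the HPA for mockmetrics could look like, written here against autoscaling/v2 with an Object metric on the Service; the repository's manifest may use an older API version, and the target of 100 is inferred from the scaling output shown later:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mockmetrics-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mockmetrics-deploy
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Object
    object:
      metric:
        name: total_hit_count
      describedObject:
        apiVersion: v1
        kind: Service
        name: mockmetrics-service
      target:
        type: Value
        value: "100"                   # scale out once total_hit_count crosses 100
```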
ServiceMonitor
A ServiceMonitor is configuration for Prometheus: it specifies the labels, path and port of the Service, and the interval at which the metrics should be scraped. Pods are selected via the Service's labels, and Prometheus scrapes metrics from all matching pods. Depending on your Prometheus configuration, the ServiceMonitor must be placed in an appropriate namespace; in this case it lives in the same namespace as mockmetrics.
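A hedged sketch of such a ServiceMonitor; the label names, port name and interval are assumptions, and the linked repository contains the real manifest:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: mockmetrics-sm
  namespace: default
  labels:
    app: mockmetrics-app               # must match the Prometheus serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: mockmetrics-app             # selects the mockmetrics Service by its labels
  endpoints:
  - port: metrics                      # named port on the Service exposing /metrics
    path: /metrics
    interval: 15s                      # how often Prometheus scrapes the targets
```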
Deploy and configure Prometheus Adapter
Since we want to provide the custom.metrics.k8s.io API endpoint for the HPA, we now deploy the Prometheus adapter. The adapter expects its configuration file to be available inside its pod, so we create a ConfigMap and mount it into the pod. We also create a Service and an APIService to expose the API; the APIService adds the /apis/custom.metrics.k8s.io/v1beta1 endpoint to the standard Kubernetes APIs. You can achieve this by following the tutorial below:
https://github.com/infracloudio/kubernetes-autoscaling#deploying-the-custom-metrics-api-server-prometheus-adapter
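For orientation, a hedged sketch of what that APIService could look like; the adapter's Service name and namespace are assumptions, and the tutorial above contains the exact manifests:

```yaml
apiVersion: apiregistration.k8s.io/v1
kind: APIService
metadata:
  name: v1beta1.custom.metrics.k8s.io
spec:
  service:
    name: custom-metrics-apiserver     # Service fronting the Prometheus adapter (assumed name)
    namespace: monitoring              # assumed namespace of the adapter
  group: custom.metrics.k8s.io
  version: v1beta1
  insecureSkipTLSVerify: true
  groupPriorityMinimum: 100
  versionPriority: 100
```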
Next, let's look at the adapter configuration.
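Based on the description below, the relevant adapter rule could look roughly like this; it is a reconstruction, and the exact file in the linked repository (in particular the metricsQuery) may differ:

```yaml
rules:
- seriesQuery: 'total_hit_count{namespace="default",service="mockmetrics-service"}'
  resources:
    overrides:
      namespace: {resource: "namespace"}   # map the namespace label to the Kubernetes namespace
      service: {resource: "service"}       # map the service label to the Kubernetes Service
  name:
    matches: "total_hit_count"
  metricsQuery: 'avg(sum(avg_over_time(total_hit_count{pod=~"mockmetrics-deploy-(.*)"}[2m])) by (pod))'
```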
seriesQuery is the Prometheus query used to discover the resources; it selects series carrying the labels namespace="default" and service="mockmetrics-service".
The resources section describes how labels are mapped to Kubernetes resources. In our case the "namespace" label maps to the Kubernetes namespace, and the "service" label maps to the Service.
metricsQuery is another Prometheus query, the one that actually feeds the metric values into the adapter. The query we use takes, over a 2-minute window, the average across all pods matching the regex mockmetrics-deploy-(.*) of the summed total_hit_count.
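Once the adapter is running and Prometheus is scraping the application, the new API should answer queries. A hedged example of checking the metric for the mockmetrics Service (the exact path depends on how the Service and metric are named in your setup):

```
$ kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/services/mockmetrics-service/total_hit_count"
```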
Kubernetes autoscaling in action
Once you follow the steps in the link below, the metric value keeps increasing. Let's watch the HPA:
https://github.com/infracloudio/kubernetes-autoscaling#scaling-the-application
```
$ kubectl get hpa -w
NAME                  REFERENCE                       TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
mockmetrics-app-hpa   Deployment/mockmetrics-deploy   0         *         1                    11h
mockmetrics-app-hpa   Deployment/mockmetrics-deploy   90/100    1         10        2          11h
mockmetrics-app-hpa   Deployment/mockmetrics-deploy   126/100   1         10        2          11h
mockmetrics-app-hpa   Deployment/mockmetrics-deploy   306/100   1         10        2          11h
mockmetrics-app-hpa   Deployment/mockmetrics-deploy   171/100   1         10        4          11h
```
You can see how the number of replicas increases as the metric value crosses the target.
Workflow
The overall flow of automatic scaling is shown below:
Image source: luxas/kubeadm-workshop
That answers the question of how to use custom metrics for Kubernetes autoscaling. I hope the content above has been of some help to you.