Table of contents:
Practice 1: scaling out and in based on the autoscaling CPU metric
Practice 2: scaling out and in based on a Prometheus custom metric (QPS)
Automatic Pod scaling (HPA): the Horizontal Pod Autoscaler automatically adjusts the number of replicas managed by a replication controller, Deployment, or ReplicaSet based on resource utilization or custom metrics, so that the deployment's scale tracks the actual service load. HPA does not apply to objects that cannot be scaled, such as a DaemonSet. In essence, HPA computes a target replica count from Pod resource metrics and raises or lowers the current count to match. To use it, you create an HPA rule, mostly against a Deployment, defining when to scale out or in; when the observed resource utilization crosses the threshold you set, the controller adjusts the replicas for you.
1. The basic principles of HPA
Metrics Server in Kubernetes continuously collects metric data for all Pod replicas. The HPA controller obtains this data through the Metrics Server API (originally Heapster's API, now the aggregated API) and, based on the user-defined scaling rules, calculates the target number of Pod replicas. When the target differs from the current count, the HPA controller issues a scale operation to the replica controller (Deployment, RC or ReplicaSet) to adjust the number of Pod replicas, completing the scale-out or scale-in.
In elastic scaling, the cooldown period is an unavoidable topic: because the evaluated metric is dynamic, the replica count may fluctuate constantly, sometimes called thrashing. So what is the cooldown time after each scale-out or scale-in?
Let's walk through the diagram. First, an HPA rule must be created, like the rules we created before: it defines a scaling range, specifies the target object, and specifies the expected metric value. HPA itself is a controller running a control loop: it keeps fetching the metric from Metrics Server and checks whether it has crossed the threshold set in your rule. If so, it executes a scale operation to expand the replicas; if utilization stays low for a long time, it scales the replicas back down. Metrics Server in turn gets its data from cAdvisor, so what HPA can see is what cAdvisor can provide, such as CPU and memory utilization. That is why HPA has supported elastic scaling on CPU since its early days.
HPA is the controller for horizontal Pod scaling in Kubernetes, but to scale Pods it needs certain inputs: a metric and a threshold. It judges whether your metric exceeds the threshold and scales you out or in accordingly. To obtain the metric you also need to install a component, Metrics Server. Its predecessor, Heapster, has been gradually deprecated and is hardly used any more, so today Metrics Server is what provides this data and these resource-utilization figures.
For example, suppose there are three replicas and we want to scale them based on CPU: something has to compute the resource utilization of the three Pods, say it averages 50%. HPA takes this value and periodically compares it against the threshold defined in the rule, say 60%; when the measured CPU crosses it, HPA scales out, with the rule also defining how many replicas may be added.
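The arithmetic HPA applies here is the standard controller formula; the numbers below are illustrative, not taken from this cluster:

desiredReplicas = ceil( currentReplicas * currentMetricValue / desiredMetricValue )

# e.g. 3 replicas averaging 90% CPU against a 60% target:
# ceil(3 * 90 / 60) = ceil(4.5) = 5 replicas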
Now suppose the traffic to this group of Pods spikes and individual Pods go well above 50% CPU. If there were no cap on the replica count, it would grow without bound: the initial 3 replicas could become 10 or even 50, and the whole cluster could be dragged down very quickly. So an HPA rule sets three things. First, the replica range: the minimum and maximum number of Pods, e.g. 1-10, giving the band within which it may scale in and out. Second, the threshold to judge against. Third, which object, i.e. which group of Pods, to operate on. All three are evaluated inside HPA.
Under what circumstances, then, does HPA scale in or out?
Scaling out means resources are insufficient, utilization above the 60% threshold. Within the replica range the system flips between two states, scaling out and scaling in. Say utilization hits 60% and we expand to 10 replicas, which is fine; then the value drops right back, from 70-80% down to 20%, and we would go straight from 10 to 5. These are stateful transitions, and HPA must ensure they do not happen too frequently: a workload with intermittent bursts, alternating good and bad periods, would otherwise scale out and in over and over and end up destabilizing the application. So HPA has a cooldown mechanism: after the first scale-out, a second one must wait out the cooldown time, 3 minutes by default; a scale-in must wait 5 minutes. These defaults protect the stability of your running service and are set through the startup parameters of the kube-controller-manager component.
In HPA, the default cooling period for capacity expansion is 3 minutes, and the cooling period for capacity reduction is 5 minutes.
You can set the cooling time by adjusting the startup parameters of the kube-controller-manager component:
--horizontal-pod-autoscaler-upscale-delay: scale-out cooldown
--horizontal-pod-autoscaler-downscale-delay: scale-in cooldown
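As a hedged sketch, assuming a binary install with a config path analogous to the kube-apiserver file edited later in this article (adjust to your environment), the defaults correspond to:

vi /opt/kubernetes/cfg/kube-controller-manager.conf   # path assumed
...
--horizontal-pod-autoscaler-upscale-delay=3m0s \
--horizontal-pod-autoscaler-downscale-delay=5m0s \
...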
2. The evolution of HPA.
At present, HPA already supports three major versions: autoscaling/v1, autoscaling/v2beta1 and autoscaling/v2beta2.
At present, most people are familiar with autoscaling/v1, and this version only supports the elastic scaling of CPU.
It is also the simplest: create a rule and rely on the metrics the collection component gathers.
autoscaling/v2beta1 added support for custom metrics: besides the metrics cAdvisor exposes, it accepts metrics supplied by third-party components, such as QPS, so you can also scale based on other resources.
autoscaling/v2beta2 additionally supports external metrics.
Behind these changes is the Kubernetes community's evolving understanding of monitoring and of where metric boundaries lie: from early Heapster to Metrics Server, and then to the division between core, custom and external metrics, the monitoring ecosystem has kept getting richer.
As the metrics changed, so did how Kubernetes approaches monitoring. There was no good solution for this early on, so autoscaling saw little use; the community has since filled in this area, and adoption is now growing steadily.
Example
The v1 version is limited to CPU. Memory was actually exposed in the early days too, but later only CPU remained, because memory is not a good autoscaling metric: for applications like Java, memory is managed by the JVM itself, so scaling on a memory utilization figure does not work well. CPU, by contrast, is fairly accurate: when a Pod's CPU utilization climbs, load and traffic are genuinely high, so it has real reference value. Hence only the CPU metric is exposed, and you specify a percentage threshold.
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: php-apache
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: php-apache
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 50
The v2beta2 version
It supports many more custom sources: resource, pods, object and external metrics.
You can still scale on the CPU metrics the Pods expose, and you can also target third-party metrics, such as message queue depth.
In short, v2 supports much more.
2.5 scaling based on CPU metrics
1. Kubernetes API Aggregation
The aggregation layer was introduced in Kubernetes 1.7. It allows third-party applications to register themselves with kube-apiserver and then be accessed and manipulated through the API Server's HTTP URLs, as if they were native APIs. To implement this mechanism, Kubernetes adds an API Aggregation Layer inside the kube-apiserver service that forwards requests for extended APIs on to the user's service.
To scale on CPU metrics with the v1 version, you first need API aggregation enabled; it was introduced in Kubernetes 1.7 to let third-party applications register into the API so that the component can be called when the API is accessed. Look at the diagram this way: kube-apiserver sits up top and the extension components sit below, and apiserver itself is in fact also behind the aggregation layer. You can treat the aggregation layer as a proxy layer, much like nginx proxying a web backend: it can front multiple services, not just apiserver. A component like Metrics Server, or one you develop yourself, registers with the aggregation layer and is proxied by it; from then on a request to the API is forwarded, according to the registered URL, to the corresponding backend component. Put simply, the point of the aggregation layer is to extend the API: you integrate your component into it, and callers invoke your component just as they invoke the native API. If you deployed Kubernetes with kubeadm, the aggregation layer is already enabled by default; with a binary deployment you enable it yourself, according to your own environment, since binary installs differ.
Enable it by adding the following startup parameters to kube-apiserver:
vi /opt/kubernetes/cfg/kube-apiserver.conf
...
--requestheader-client-ca-file=/opt/kubernetes/ssl/ca.pem \
--proxy-client-cert-file=/opt/kubernetes/ssl/server.pem \
--proxy-client-key-file=/opt/kubernetes/ssl/server-key.pem \
--requestheader-allowed-names=kubernetes \
--requestheader-extra-headers-prefix=X-Remote-Extra- \
--requestheader-group-headers=X-Remote-Group \
--requestheader-username-headers=X-Remote-User \
--enable-aggregator-routing=true \
...
The first line is the trusted root CA: access to the aggregation layer requires authentication, not just anyone may call it, which provides a degree of security.
The second and third lines are the proxy client certificate and key; roughly speaking, they are presented to the aggregation layer for authentication, to decide whether the caller is allowed through.
The fourth line is the allowed names, checked against the names in the presented certificate; here I reuse the apiserver certificate, though you could also generate a new one from the CA.
The next three lines define the request headers used to decide whether a request is admitted.
The last line enables routing at the aggregation layer.
Edit the config, add the aggregation-layer parameters, then restart kube-apiserver:

[root@k8s-master1 ~]# vim /opt/kubernetes/cfg/kube-apiserver.conf
[root@k8s-master1 ~]# systemctl restart kube-apiserver
[root@k8s-master1 ~]# ps -ef | grep kube-apiserver
2. Deploy Metrics Server
Metrics Server is a cluster-wide aggregator of resource usage data, deployed in the cluster as an ordinary application.
It collects metrics from the Summary API exposed by the kubelet on each node.
It registers itself in the Master's APIServer through the Kubernetes aggregator.
HPA must be able to get the CPU utilization; only then does it have something to compare against when deciding whether to scale. So we deploy a Metrics Server into the cluster to provide the CPU data HPA queries. Metrics Server acts as an aggregator: its raw data is cAdvisor's data, which it aggregates across every node. That matters because you scale a whole deployment, not a single replica: you cannot judge from one replica's metrics, you have to consider all the current Pods. Each node's cAdvisor only knows the utilization of its local Pods and performs no aggregation, so to get a summary of the resource utilization of all the Pods you need Metrics Server on top. Heapster used to do this; now Metrics Server does the summarizing, and HPA judges against the overall CPU utilization from that summary. Metrics Server, then, is an aggregator.
Moreover, these metrics are collected from the kubelet on each node, and the service registers itself in the Kubernetes apiserver through the aggregator, so the aggregation layer must be enabled for Metrics Server to register. After that, requests for metrics carry the registered name, and the aggregation layer forwards them to the Metrics Server Pod behind it.
git clone https://github.com/kubernetes-incubator/metrics-server
cd metrics-server/deploy/1.8+/
vi metrics-server-deployment.yaml   # add 2 startup parameters and change the image; by default it scrapes the kubelet over verified HTTPS, here we allow insecure TLS and prefer the node InternalIP

        image: zhaocheng172/metrics-server-amd64:v0.3.1
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --kubelet-preferred-address-types=InternalIP
        imagePullPolicy: Always
        volumeMounts:
        - name: tmp-dir
          mountPath: /tmp
      nodeSelector:
        beta.kubernetes.io/os: linux
[root@k8s-master1 1.8+]# kubectl apply -f .

Make sure the Pods come up:

[root@k8s-master1 1.8+]# kubectl get pod -n kube-system
NAME                              READY   STATUS    RESTARTS   AGE
coredns-59fb8d54d6-7rmx2          1/1     Running   0          43m
kube-flannel-ds-amd64-4jjmm       1/1     Running   0          43m
kube-flannel-ds-amd64-9f9vq       1/1     Running   0          43m
kube-flannel-ds-amd64-gcf9s       1/1     Running   0          43m
metrics-server-64499fd8c6-xkc6c   1/1     Running   0          61s
Check that metrics-server is working properly: look for errors in the Pod log, then confirm it has registered with the aggregation layer. Check the registration with kubectl get apiservices; it is healthy only when AVAILABLE is True.

[root@k8s-master1 1.8+]# kubectl get apiservices
v1beta1.metrics.k8s.io   kube-system/metrics-server   True   19s
Then check node resource utilization with kubectl top node:

[root@k8s-master1 1.8+]# kubectl top node
NAME          CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
k8s-master1   517m         25%    1021Mi          62%
k8s-node1     994m         49%    551Mi           33%
k8s-node2     428m         10%    2466Mi          32%
You can also check Pod resource utilization with kubectl top pod:

[root@k8s-master1 1.8+]# kubectl top pod -n kube-system
NAME                              CPU(cores)   MEMORY(bytes)
coredns-59fb8d54d6-7rmx2          13m          14Mi
kube-flannel-ds-amd64-4jjmm       15m          23Mi
kube-flannel-ds-amd64-9f9vq       7m           15Mi
kube-flannel-ds-amd64-gcf9s       9m           15Mi
metrics-server-64499fd8c6-xkc6c   3m           14Mi
You can also obtain resource utilization figures, such as container CPU and memory, through the Metrics API itself. These metrics can be accessed directly by users via the kubectl top commands, or by controllers in the cluster such as the Horizontal Pod Autoscaler. HPA obtains utilization through this API interface, so you can retrieve the same data yourself by querying the API.
Test it: the data is the same as what kubectl top shows, just returned through the API as JSON.

kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
[root@k8s-master1 1.8+]# kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes
{"kind":"NodeMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/nodes"},"items":[{"metadata":{"name":"k8s-master1","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/k8s-master1","creationTimestamp":"2019-12-12T03:45:06Z"},"timestamp":"2019-12-12T03:45:03Z","window":"30s","usage":{"cpu":"443295529n","memory":"1044064Ki"}},{"metadata":{"name":"k8s-node1","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/k8s-node1","creationTimestamp":"2019-12-12T03:45:06Z"},"timestamp":"2019-12-12T03:45:00Z","window":"30s","usage":{"cpu":"285582752n","memory":"565676Ki"}},{"metadata":{"name":"k8s-node2","selfLink":"/apis/metrics.k8s.io/v1beta1/nodes/k8s-node2","creationTimestamp":"2019-12-12T03:45:06Z"},"timestamp":"2019-12-12T03:45:01Z","window":"30s","usage":{"cpu":"425912654n","memory":"2524648Ki"}}]}
To pretty-print the command output as JSON, pipe it through the jq command:

[root@k8s-master1 1.8+]# kubectl get --raw /apis/metrics.k8s.io/v1beta1/nodes | jq
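The same Metrics API also serves per-Pod data; as a quick additional check (a standard endpoint, shown here as an illustration):

kubectl get --raw /apis/metrics.k8s.io/v1beta1/namespaces/kube-system/pods | jq
# a single Pod can be addressed as /apis/metrics.k8s.io/v1beta1/namespaces/<ns>/pods/<pod-name>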
3. Autoscaling/v1 (CPU metric practice)
The autoscaling/v1 version supports only a single metric: CPU.
First deploy an application and expose a Service. Later we will run a load test against it: when CPU reaches the 60% target, the replicas scale out automatically, and when the traffic drops, they scale back in.
[root@k8s-master1 hpa]# cat app.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    app: nginx
  name: nginx
spec:
  replicas: 5
  selector:
    matchLabels:
      app: nginx
  strategy: {}
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx
        name: nginx
        resources:
          requests:
            cpu: 90m
            memory: 90Mi
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  selector:
    app: nginx
  ports:
  - protocol: TCP
    port: 80
    targetPort: 80
Create an HPA policy.

Generate it with the kubectl autoscale command:

kubectl autoscale --help

As the help shows, you can emit the manifest with -o yaml plus --dry-run, which leaves the server-side fields empty:

kubectl autoscale deployment foo --min=2 --max=10
kubectl autoscale deployment nginx --min=2 --max=10 -o yaml --dry-run > hpa-v1.yaml

[root@k8s-master1 hpa]# cat hpa-v1.yaml
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  maxReplicas: 6
  minReplicas: 3
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  targetCPUUtilizationPercentage: 60

scaleTargetRef: which object the HPA scales
targetCPUUtilizationPercentage: scale out when overall CPU utilization exceeds 60%
maxReplicas: the most replicas it may scale out to
minReplicas: the fewest replicas it may scale in to
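To put the policy in place and see what the controller is doing, ordinary kubectl suffices:

kubectl apply -f hpa-v1.yaml
kubectl describe hpa nginx    # shows current vs. target utilization and scaling events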
Check the scaling status:

[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   0%/60%    3         6         3          52m
Start the load test against our cluster IP:

[root@k8s-master1 hpa]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.0.0.1     <none>        443/TCP   5h30m
nginx        ClusterIP   10.0.0.211   <none>        80/TCP    48m
Install the load-testing tool and run it:

yum install httpd-tools -y
[root@k8s-master1 hpa]# ab -n 1000000 -c 10000 http://10.0.0.211/index.html
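While ab is running, it is handy to watch the HPA react from a second terminal; -w streams updates as they happen:

kubectl get hpa nginx -w    # TARGETS and REPLICAS update as the load rises and falls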
Under load, CPU has exceeded the target:

[root@k8s-master1 hpa]# kubectl get hpa
NAME    REFERENCE          TARGETS    MINPODS   MAXPODS   REPLICAS   AGE
nginx   Deployment/nginx   148%/60%   3         6         6          56m
The maximum replica count is 6, and it has now scaled out to it automatically:

[root@k8s-master1 hpa]# kubectl get pod
NAME                    READY   STATUS    RESTARTS   AGE
nginx-969bfd4c9-g4zkc   1/1     Running   0          52m
nginx-969bfd4c9-hlcmc   1/1     Running   0          51s
nginx-969bfd4c9-mn2rd   1/1     Running   0          52m
nginx-969bfd4c9-rk752   1/1     Running   0          52m
nginx-969bfd4c9-zmmd8   1/1     Running   0          51s
nginx-969bfd4c9-zz5gp   1/1     Running   0          51s
Stop the load test, and after about 5 minutes it scales back in automatically:

[root@k8s-master1 hpa]# kubectl get pod
NAME                    READY   STATUS    RESTARTS   AGE
nginx-969bfd4c9-g4zkc   1/1     Running   0          57m
nginx-969bfd4c9-mn2rd   1/1     Running   0          57m
nginx-969bfd4c9-rk752   1/1     Running   0          57m

Workflow: hpa -> apiserver -> kube aggregation -> metrics-server -> kubelet (cadvisor)
4. Autoscaling/v2beta2 (multiple metrics)
To meet more needs, HPA has two further versions: autoscaling/v2beta1 and autoscaling/v2beta2.
The difference between the two is that autoscaling/v2beta1 supports Resource Metrics (CPU) and Custom Metrics (application metrics), while autoscaling/v2beta2 additionally supports External Metrics.
The span between v1 and v2 is considerable: v2 is what makes custom metrics possible.
[root@k8s-master1 hpa]# kubectl get hpa.v2beta2.autoscaling -o yaml > hpa-v2.yaml

apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
  namespace: default
spec:
  maxReplicas: 6
  minReplicas: 3
  metrics:
  - resource:
      name: cpu
      target:
        averageUtilization: 60
        type: Utilization
    type: Resource
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
The effect is the same as the above v1 version, except that the format has changed here.
V2 also supports other types of metrics: Pods and Object.
type: Pods
pods:
  metric:
    name: packets-per-second
  target:
    type: AverageValue
    averageValue: 1k

type: Object
object:
  metric:
    name: requests-per-second
  describedObject:
    apiVersion: networking.k8s.io/v1beta1
    kind: Ingress
    name: main-route
  target:
    type: Value
    value: 2k
The type field in metrics has four types of values: Object, Pods, Resource, and External.
Resource: refers to the cpu and memory metrics of pod under the current scaling object. Only target values of Utilization and AverageValue are supported.
Object: refers to the metrics for specifying internal objects in K8s. The data needs to be provided by a third-party adapter, and only target values of Value and AverageValue types are supported.
Pods: refers to metrics of the Pods under the scaled object. The data must be provided by a third-party adapter, and only AverageValue targets are allowed. These are metrics the Pods themselves expose, such as HTTP request counts and throughput: the application exposes them over HTTP, but HPA cannot read them directly, so it relies on third-party monitoring to judge against the threshold. The premise is registration into the aggregation layer via an APIService (visible with kubectl get apiservices): everything HPA fetches goes through the aggregation layer, which proxies the request to the backend component, e.g. metrics-server, which fetches the data for you; each node's kubelet (cAdvisor) has collected the resource utilization of its Pods and exposed it through the aggregator. The configured Pods' figures are then queried and averaged, HPA compares the result with the expected value, and it scales out if the target is reached.
Pods-type metrics are based on what the Pod instances themselves expose, e.g. throughput or QPS; if the target is set to 1k, crossing it triggers scaling.
hpa -> apiserver -> aggregation layer -> prometheus-adapter -> prometheus. The adapter registers itself into the aggregation layer. Prometheus itself is a monitoring system that can collect and store all the metrics Pods expose; the adapter's main job is to register into the aggregation layer and translate between data formats, because the interface apiserver expects differs from Prometheus's interface. The key reason the adapter exists is that format conversion: the scheme is not tied to Prometheus alone, other monitoring systems can be fronted the same way to deliver custom metrics, completing the data conversion and registration, while Prometheus displays each Pod's series.
External: refers to metrics outside K8s. Data also needs to be provided by a third-party adapter. Only target values of Value and AverageValue types are supported.
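As a sketch, a v2beta2 External entry looks like the following; the metric name and selector are illustrative (they follow the message-queue example from the upstream Kubernetes docs), not something this cluster exposes:

- type: External
  external:
    metric:
      name: queue_messages_ready
      selector:
        matchLabels:
          queue: "worker_tasks"
    target:
      type: AverageValue
      averageValue: 30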
Workflow: hpa -> apiserver -> kube aggregation -> prometheus-adapter -> prometheus -> pods
2.6 Custom Metric scaling based on Prometheus
Resource metrics cover only CPU and memory, which is generally sufficient. But if you want HPA on custom metrics, such as request QPS or the number of 5xx errors, you need custom metrics. The mature implementation is Prometheus custom metrics: the metrics are provided by Prometheus and then aggregated into apiserver with k8s-prometheus-adapter, achieving the same effect as the core metrics pipeline (metrics-server).
Resource metrics generally mean CPU and memory; public clouds likewise autoscale on the conservative CPU and memory dimensions. But we may have special requirements, such as the QPS of a web service, i.e. data that this group of web Pods itself provides. This is a very common need, even if such requirements are few in number, and implementing it is not that simple; the mature approach today is to define these metrics with Prometheus.
The rough flow: the adapter registers an API with apiserver (visible via kubectl get apiservices), apiserver goes to the adapter, the adapter queries Prometheus, and Prometheus scrapes the metrics from the Pods.
1. Deploy Prometheus
Prometheus is a monitoring system originally built at SoundCloud. It has been a community open-source project since 2012, with very active developer and user communities. To emphasize open source and independent maintenance, Prometheus joined the Cloud Native Computing Foundation (CNCF) in 2016, becoming its second hosted project after Kubernetes.
Prometheus features:
Multidimensional data model: time series identified by metric name and key-value pairs
PromQL: a flexible query language that uses the multidimensional data for complex queries
No dependence on distributed storage; a single server node works on its own
Time series collected via a pull model over HTTP
Pushing time series is supported through the Pushgateway component
Targets discovered via service discovery or static configuration
Multiple graphing modes and dashboard support (Grafana)
Composition and structure of Prometheus
Prometheus Server: collects metrics, stores the time series data, and provides the query interface
Client Library: client libraries for instrumenting applications
Push Gateway: short-term storage of metric data, mainly for ephemeral jobs
Exporters: collect existing metrics from third-party services and expose them
Alertmanager: alerting
Web UI: a simple web console
Deployment:
Prometheus deployment is not covered in detail here; refer to my earlier articles if needed.
For this exercise you only need a Prometheus server that can scrape Pod data, plus automatic PV provisioning, which I deployed in advance; both are covered in my previous articles, so I will not demonstrate them again here.
[root@k8s-master1 prometheus]# kubectl get pod,svc -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
pod/coredns-59fb8d54d6-7rmx2          1/1     Running   0          25h
pod/grafana-0                         1/1     Running   0          18m
pod/kube-flannel-ds-amd64-4jjmm       1/1     Running   0          25h
pod/kube-flannel-ds-amd64-9f9vq       1/1     Running   0          25h
pod/kube-flannel-ds-amd64-gcf9s       1/1     Running   0          25h
pod/metrics-server-64499fd8c6-xkc6c   1/1     Running   0          24h
pod/prometheus-0                      2/2     Running   0          23m

NAME                     TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
service/grafana          NodePort    10.0.0.233   <none>        80:30007/TCP     18m
service/kube-dns         ClusterIP   10.0.0.2     <none>        53/UDP,53/TCP    25h
service/metrics-server   ClusterIP   10.0.0.67    <none>        443/TCP          24h
service/prometheus       NodePort    10.0.0.115   <none>        9090:30090/TCP   23m
Visit my Prometheus on port 30090. I did not deploy the node_exporter component on the nodes here, so those metrics are not collected; they are not needed for this exercise, and my earlier deployment article describes them in detail, so I will not do that here.
3. Practice based on a QPS metric
Deploy an application:
[root@k8s-master1 hpa]# cat hpa-qps.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: metrics-app
  name: metrics-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: metrics-app
  template:
    metadata:
      labels:
        app: metrics-app
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
    spec:
      containers:
      - image: zhaocheng172/metrics-app
        name: metrics-app
        ports:
        - name: web
          containerPort: 80
        resources:
          requests:
            cpu: 200m
            memory: 256Mi
        readinessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 5
---
apiVersion: v1
kind: Service
metadata:
  name: metrics-app
  labels:
    app: metrics-app
spec:
  ports:
  - name: web
    port: 80
    targetPort: 80
  selector:
    app: metrics-app
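Assuming the file name shown above, applying it is plain kubectl; the get that follows is just a quick sanity check:

kubectl apply -f hpa-qps.yaml
kubectl get pod,svc | grep metrics-app   # expect three metrics-app Pods and the metrics-app Service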
The metrics-app exposes a Prometheus metrics interface, which can be seen by visiting service:
[root@k8s-master1 hpa]# curl 10.0.0.107/metrics
# HELP http_requests_total The amount of requests in total
# TYPE http_requests_total counter
http_requests_total 31
# HELP http_requests_per_second The amount of requests per second the latest ten seconds
# TYPE http_requests_per_second gauge
http_requests_per_second 0.5

http_requests_total is the cumulative total of requests this Pod has served. http_requests_per_second is the throughput over the last ten seconds, here 0.5 requests per second. Throughput and QPS are different concepts, QPS counts queries within a given window, but both quantify the current load.
Now check whether Prometheus is scraping data from those three Pods.
Querying http_requests_total, you can see the series carrying the app: metrics-app label that we defined in the YAML.
For a Pod to be collected by Prometheus, two things are needed: the Pod must expose the metric, and Prometheus must be told to scrape it. For all the Pods this is driven by the annotations in the YAML:

app: metrics-app
annotations:
prometheus.io/scrape: "true"   — enables collection
prometheus.io/port: "80"       — the port of the URL to scrape
prometheus.io/path: "/metrics" — the path; /metrics is the default

All three annotations begin with prometheus.io, so Prometheus picks this target up and monitors it. Prometheus scans the Pods in Kubernetes for exposed metrics and automatically adds any it finds to the monitored targets; this is Prometheus's automatic discovery of Kubernetes at work.
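Under the hood this discovery works through relabeling in the Prometheus scrape configuration. A minimal sketch of the conventional rules, assuming a standard kubernetes_sd pod-role job rather than this cluster's exact config:

- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # keep only Pods annotated prometheus.io/scrape: "true"
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  # honor the prometheus.io/path annotation as the metrics path
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)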
Now that the metric is being collected, deploy the custom metrics adapter.
Deploy Custom Metrics Adapter
However, the metrics Prometheus collects cannot be used by Kubernetes directly, because the two data formats are incompatible. Another component, k8s-prometheus-adapter, converts Prometheus's metric format into one the Kubernetes API can recognize. And since this is a custom API, it also has to be registered into the main APIServer through the Kubernetes aggregator so that it can be reached directly under /apis/.
Its main functions: first, register itself with api-server; second, convert the data into a form the API can recognize.
https://github.com/DirectXMan12/k8s-prometheus-adapter
The Prometheus Adapter has a stable Helm chart, which we use directly.
Prepare the Helm environment first:

[root@k8s-master1 helm]# wget https://get.helm.sh/helm-v3.0.0-linux-amd64.tar.gz
[root@k8s-master1 helm]# tar xf helm-v3.0.0-linux-amd64.tar.gz
[root@k8s-master1 helm]# mv linux-amd64/helm /usr/bin
Now helm is usable. Next, configure a chart repository, the warehouse that stores the adapter chart; Microsoft's Azure mirror is recommended:

[root@k8s-master1 helm]# helm repo add stable http://mirror.azure.cn/kubernetes/charts
"stable" has been added to your repositories
[root@k8s-master1 helm]# helm repo ls
NAME     URL
stable   http://mirror.azure.cn/kubernetes/charts
With that, we can install the adapter via helm install.
The adapter's chart needs the address of the Prometheus it should connect to, so the chart defaults cannot be used as-is; override them with --set, specifying the Prometheus address and port directly.
Deploy prometheus-adapter, specifying the Prometheus address:

[root@k8s-master1 helm]# helm install prometheus-adapter stable/prometheus-adapter --namespace kube-system --set prometheus.url=http://prometheus.kube-system,prometheus.port=9090
NAME: prometheus-adapter
LAST DEPLOYED: Fri Dec 13 15:22:42 2019
NAMESPACE: kube-system
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
prometheus-adapter has been deployed.
In a few minutes you should be able to list metrics using the following command(s):
  kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1

[root@k8s-master1 helm]# helm list -n kube-system
NAME                 NAMESPACE     REVISION   UPDATED                                   STATUS     CHART                      APP VERSION
prometheus-adapter   kube-system   1          2019-12-13 15:22:42.043441232 +0800 CST   deployed   prometheus-adapter-1.4.0   v0.5.0
Check that the Pod has been deployed successfully:

[root@k8s-master1 helm]# kubectl get pod -n kube-system
NAME                                  READY   STATUS    RESTARTS   AGE
coredns-59fb8d54d6-7rmx2              1/1     Running   0          28h
grafana-0                             1/1     Running   0          3h40m
kube-flannel-ds-amd64-4jjmm           1/1     Running   0          28h
kube-flannel-ds-amd64-9f9vq           1/1     Running   0          28h
kube-flannel-ds-amd64-gcf9s           1/1     Running   0          28h
metrics-server-64499fd8c6-xkc6c       1/1     Running   0          27h
prometheus-0                          2/2     Running   0          3h45m
prometheus-adapter-77b7b4dd8b-9rv26   1/1     Running   0          2m36s
Check that the Pod works properly and has registered to the aggregation layer:

[root@k8s-master1 helm]# kubectl get apiservice
v1beta1.custom.metrics.k8s.io   kube-system/prometheus-adapter   True   13m
Then test whether the interface is usable with a raw URL:

[root@k8s-master1 helm]# kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1" | jq
Create an HPA policy:

[root@k8s-master1 hpa]# cat hpa-v5.yaml
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: metrics-app-hpa
  namespace: default
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: metrics-app
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: 800m
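Note the quantity notation: averageValue: 800m means 800 milli-requests per second, i.e. an average of 0.8 QPS per Pod. Applying and checking the policy is plain kubectl:

kubectl apply -f hpa-v5.yaml
kubectl get hpa metrics-app-hpa   # TARGETS shows <unknown>/800m until the adapter serves the metric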
Check the newly created policy. At this point it has not obtained a value: the adapter does not yet know which metric you want (http_requests_per_second), so HPA cannot fetch the metric the Pods provide.

[root@k8s-master1 hpa]# kubectl get hpa
NAME              REFERENCE                TARGETS          MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   <unknown>/800m   1         10        3          16m
nginx             Deployment/nginx         0%/60%           3         6         3          24h

To fix that, edit the prometheus-adapter ConfigMap (installed into kube-system above) and add a new entry at the top of its rules: section.
Add a new rule to collect the QPS value we want:

[root@k8s-master1 hpa]# kubectl edit cm prometheus-adapter -n kube-system
Put this block under rules. The middle of it is really PromQL: you can run it yourself and the result matches our earlier output. The {...!=""} selector requires the fields to be non-empty, making the match more precise; seriesQuery selects a series of data, and the resources section maps the labels onto the namespace and pod objects, so everything stays in correspondence.

The series is the same one as before, except that http_requests_total is a cumulative counter and we want a rate: the QPS value is the increase collected over the window divided by its length, i.e. per second. matches: "^(.*)_total" captures the prefix of the counter name, and as: "${1}_per_second" renames it, so http_requests_per_second is the value the HTTP interface provides.

rules:
- seriesQuery: 'http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}'
  resources:
    overrides:
      kubernetes_namespace: {resource: "namespace"}
      kubernetes_pod_name: {resource: "pod"}
  name:
    matches: "^(.*)_total"
    as: "${1}_per_second"
  metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
The query takes an average: the number of requests over the last 2 minutes divided by the window, e.g. 0.42 HTTP requests per second:

rate(http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}[2m])

Because there are multiple Pods, we sum the rates to provide one metric:

sum(rate(http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}[2m]))

And with by we group by a label name for easy per-Pod querying:

sum(rate(http_requests_total{kubernetes_namespace!="",kubernetes_pod_name!=""}[2m])) by (kubernetes_pod_name)
Test the API:

kubectl get --raw "/apis/custom.metrics.k8s.io/v1beta1/namespaces/default/pods/*/http_requests_per_second"
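When the rule is in effect, the response is a MetricValueList from the custom metrics API; its rough shape is sketched below, with illustrative values and minor fields (timestamps, selfLinks) trimmed, not output captured from this cluster:

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "items": [
    {
      "describedObject": {
        "kind": "Pod",
        "namespace": "default",
        "name": "metrics-app-b96659c9c-5jxsg"
      },
      "metricName": "http_requests_per_second",
      "value": "416m"
    }
  ]
}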
The value is now coming through:

NAME              REFERENCE                TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   416m/800m   1         10        3          2m13s
nginx             Deployment/nginx         0%/60%      3         6         3          25h
Load test:

kubectl get svc
metrics-app   ClusterIP   10.0.0.107   <none>   80/TCP   3h25m

ab -n 100000 -c 100 http://10.0.0.107/metrics
Check the scaling status:

[root@k8s-master1 hpa]# kubectl get hpa
NAME              REFERENCE                TARGETS     MINPODS   MAXPODS   REPLICAS   AGE
metrics-app-hpa   Deployment/metrics-app   414m/200m   1         10        10         8m36s
nginx             Deployment/nginx         0%/60%      3         6         3          26h

[root@k8s-master1 hpa]# kubectl get pod
NAME                                      READY   STATUS    RESTARTS   AGE
metrics-app-b96659c9c-5jxsg               1/1     Running   0          3m53s
metrics-app-b96659c9c-5lqpb               1/1     Running   0          5m24s
metrics-app-b96659c9c-6qx2p               1/1     Running   0          2m21s
metrics-app-b96659c9c-bqkbk               1/1     Running   0          3m53s
metrics-app-b96659c9c-f5vcf               1/1     Running   0          2m21s
metrics-app-b96659c9c-j24xn               1/1     Running   1          3h22m
metrics-app-b96659c9c-vpl4t               1/1     Running   0          3h22m
metrics-app-b96659c9c-wxp7z               1/1     Running   0          3m52s
metrics-app-b96659c9c-xztqz               1/1     Running   0          3m53s
metrics-app-b96659c9c-zhq5r               1/1     Running   0          5m24s
nfs-client-provisioner-6f54fc894d-dbvmk   1/1     Running   0          5h50m
nginx-969bfd4c9-g4zkc                     1/1     Running   0          25h
nginx-969bfd4c9-mn2rd                     1/1     Running   0          25h
nginx-969bfd4c9-rk752                     1/1     Running   0          25h
Wait a while, and the replicas scale back in after about 5 minutes.
Summary:
Each Pod exposes the http_requests_total metric via /metrics; Prometheus scrapes and aggregates the collected series; the prometheus-adapter registered behind APIServer answers queries for request_per_second data by querying Prometheus; HPA periodically queries APIServer to determine whether its configured autoscaler rule is met; if it is, HPA modifies the Deployment's ReplicaSet replica count to scale.