Prometheus Operator, a more elegant monitoring tool for Kubernetes
[TOC]
1. Kubernetes Operator introduction
Kubernetes makes it easier to manage and scale web applications, mobile application backends, and API services, because these applications are generally stateless: basic Kubernetes API objects such as Deployment can scale and recover them without additional operational work.
Managing stateful applications such as databases, caches, or monitoring systems is a bigger challenge. These systems require application-domain knowledge to scale and upgrade correctly and to reconfigure effectively when data is lost or unavailable. Ideally, that application-specific operational knowledge would be encoded into software, so that Kubernetes's own capabilities can be used to run and manage complex applications correctly.
An Operator is software that uses the TPR mechanism (third-party resources, since replaced by CRDs) to extend the Kubernetes API and encode application-specific knowledge into it, letting users create, configure, and manage applications declaratively. Like Kubernetes's built-in resources, an Operator operates not on a single application instance but on multiple instances cluster-wide.
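To make the mechanism concrete, here is a minimal sketch of registering a custom resource type with a CRD, which is typically the first thing an Operator does. The "Backup" type and group name are hypothetical, chosen purely for illustration; the apiextensions.k8s.io/v1beta1 API shown matches the Kubernetes 1.12 era this article targets.

```yaml
# Hypothetical CRD: registers a namespaced "Backup" resource type so that an
# operator can watch Backup objects and act on them.
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # name must be <plural>.<group>
  name: backups.example.com
spec:
  group: example.com
  version: v1alpha1
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
```

Once the CRD is registered, `kubectl get backups` works like any built-in resource, and the Operator's control loop reconciles each object toward its declared state.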
2. Prometheus Operator introduction
The Prometheus Operator provides easy monitoring definitions for Kubernetes services, along with deployment and management of Prometheus instances.
After installation, Prometheus Operator provides the following features:
- Create/destroy: easily launch a Prometheus instance in a Kubernetes namespace for a specific application or team using the Operator.
- Simple configuration: configure the fundamentals of Prometheus, such as versions, persistence, retention policies, and replicas, from native Kubernetes resources (see the sketch below).
- Target services via labels: automatically generate monitoring target configurations based on familiar Kubernetes label queries; there is no need to learn a Prometheus-specific configuration language.
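As an illustration of the "simple configuration" point, a minimal sketch of a Prometheus custom resource follows. The field names come from the monitoring.coreos.com/v1 API; the name, namespace, and values are illustrative assumptions, not taken from this deployment.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: example          # illustrative name
  namespace: monitoring
spec:
  version: v2.4.3        # which Prometheus version to run
  replicas: 2            # how many Prometheus pods
  retention: 15d         # how long to keep metrics
  storage:               # persistence via a PVC template
    volumeClaimTemplate:
      spec:
        resources:
          requests:
            storage: 50Gi
```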
The components of the Prometheus Operator architecture run in the Kubernetes cluster as different resources, and each plays a different role:
- Operator: deploys and manages Prometheus Server according to custom resources (Custom Resource Definitions / CRDs), and watches events on those custom resources, handling changes accordingly; it is the control center of the whole system.
- Prometheus: the Prometheus resource is a declarative description of the desired state of a Prometheus deployment.
- Prometheus Server: the Prometheus Server cluster deployed by the Operator according to the Prometheus custom resource; it can be thought of as the StatefulSet resources that manage the Prometheus Server cluster.
- ServiceMonitor: also a custom resource; it describes the list of targets monitored by Prometheus. The resource selects the corresponding Service endpoints through labels, and Prometheus Server obtains metrics through the selected Services.
- Service: mainly used to expose the Pods in the Kubernetes cluster that serve metrics, giving the ServiceMonitor something to select so that Prometheus Server can scrape them. Put simply, these are the objects Prometheus monitors, such as a Node Exporter Service or a MySQL Exporter Service.
- Alertmanager: also a custom resource type; the Operator deploys an Alertmanager cluster based on the resource description (a minimal example follows this list).
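As a concrete taste of the last item, here is a minimal sketch of an Alertmanager custom resource; the name and replica count are illustrative assumptions. The Operator expands this short description into a highly available Alertmanager cluster.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Alertmanager
metadata:
  name: main            # illustrative name
  namespace: monitoring
spec:
  replicas: 3           # the Operator runs a 3-node Alertmanager cluster
```

By convention, the Operator reads the actual Alertmanager configuration from a Secret named alertmanager-&lt;name&gt; in the same namespace.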
3. Prometheus Operator deployment
Environment:
- Kubernetes: v1.12, installed with kubeadm
- Helm: v2.11.0
We install it with Helm, using a prometheus-operator chart modified according to actual needs. It integrates Grafana and the exporters used to monitor Kubernetes. Note that I configured Grafana to store its data in MySQL, as explained in another article, "Using Helm to deploy Prometheus and Grafana to monitor Kubernetes".
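For reference, a hypothetical sketch of that Grafana-to-MySQL wiring in values.yaml is below. The [database] keys are standard grafana.ini settings, and the host matches the monitoring-mysql-mysql Service that appears in the resource listing later; the exact values nesting depends on the author's modified chart, so treat it as an assumption.

```yaml
grafana:
  grafana.ini:
    database:
      type: mysql
      host: monitoring-mysql-mysql:3306   # MySQL service in the monitoring namespace
      name: grafana                       # database name (assumed)
      user: grafana                       # credentials are placeholders
      password: xxxxxx
```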
Then install the chart:

```bash
cd helm/prometheus-operator/
helm install --name prometheus-operator --namespace monitoring -f values.yaml ./
```
To use the Prometheus Operator more flexibly, it is necessary to add custom monitoring targets. Here we take ceph-exporter as an example. The following section of values.yaml uses a ServiceMonitor to add the monitoring target:
```yaml
serviceMonitor:
  # enable monitoring
  enabled: true
  # the port on which the ceph exporter exposes metrics
  exporterPort: 9128
  # for apps deployed outside of the cluster, list their addresses here
  endpoints: []
  # are we talking http or https?
  scheme: http
  # service selector label key used to target ceph exporter pods
  serviceSelectorLabelKey: app
  # default rules are in templates/ceph-exporter.rules.yaml
  prometheusRules: {}
  # custom labels to be added to the ServiceMonitor;
  # after testing, the ServiceMonitor is only picked up correctly when it
  # carries the release label of the prometheus-operator deployment
  additionalServiceMonitorLabels:
    release: prometheus-operator
  # custom labels to be added to the PrometheusRule CRD
  additionalRulesLabels: {}
```
The most important parameter is additionalServiceMonitorLabels. Testing shows that a ServiceMonitor must carry the existing labels of the prometheus-operator deployment before its targets are successfully added to monitoring.
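The reason is that the Prometheus custom resource only picks up ServiceMonitors matched by its serviceMonitorSelector field. A sketch of the relevant fragment is below; serviceMonitorSelector is a genuine field of the Prometheus resource, but the exact selector in your cluster is whatever the chart renders, so verify it with `kubectl get prometheus -n monitoring -o yaml`.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
spec:
  # Only ServiceMonitors carrying this label are scraped, which is why the
  # ceph-exporter ServiceMonitor must add release: prometheus-operator.
  serviceMonitorSelector:
    matchLabels:
      release: prometheus-operator
```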
```
[root@lab1 prometheus-operator]# kubectl get servicemonitor -n monitoring ceph-exporter -o yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  creationTimestamp: 2018-10-30T06:51:12Z
  generation: 1
  labels:
    app: ceph-exporter
    chart: ceph-exporter-0.1.0
    heritage: Tiller
    prometheus: ceph-exporter
    release: prometheus-operator
  name: ceph-exporter
  namespace: monitoring
  resourceVersion: "13937459"
  selfLink: /apis/monitoring.coreos.com/v1/namespaces/monitoring/servicemonitors/ceph-exporter
  uid: 30569173-dc10-11e8-bcf3-000c293d66a5
spec:
  endpoints:
  - interval: 30s
    port: http
  namespaceSelector:
    matchNames:
    - monitoring
  selector:
    matchLabels:
      app: ceph-exporter
      release: ceph-exporter
[root@lab1 prometheus-operator]# kubectl get pod -n monitoring prometheus-operator-operator-7459848949-8dddt -o yaml | more
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: 2018-10-30T00:39:37Z
  generateName: prometheus-operator-operator-7459848949-
  labels:
    app: prometheus-operator-operator
    chart: prometheus-operator-0.1.6
    heritage: Tiller
    pod-template-hash: "745984894"
    release: prometheus-operator
```
Main points:
- The labels on the ServiceMonitor must at least match the labels on the prometheus-operator Pod (in particular the release label).
- The Service referenced by the ServiceMonitor's spec must be reachable by Prometheus, and all of its endpoints must be normal.
- If you run into problems, you can enable the debug logs of prometheus-operator and Prometheus (see the sketch below). Although the logs carry little other information, the operator's debug log lists the ServiceMonitors it currently monitors, which confirms whether the installed ServiceMonitor has been matched.
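A hedged sketch of turning those debug logs on: logLevel is a genuine field of the Prometheus custom resource, and the prometheus-operator binary accepts a --log-level flag, but how you surface either of them (directly or via chart values) depends on your chart version, so treat the placement below as an assumption.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
spec:
  logLevel: debug   # passed through to Prometheus as --log.level=debug
---
# For the operator itself, add the flag to its container args (placement assumed):
# args:
# - --log-level=debug
```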
After the installation is successful, check the related resources:
```
[root@lab1 prometheus-operator]# kubectl get service,servicemonitor,ep -n monitoring
NAME                                                    TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/alertmanager-operated                           ClusterIP   None             <none>        9093/TCP,6783/TCP   12d
service/ceph-exporter                                   ClusterIP   10.100.57.62     <none>        9128/TCP            46h
service/monitoring-mysql-mysql                          ClusterIP   10.108.93.155    <none>        3306/TCP            42d
service/prometheus-operated                             ClusterIP   None             <none>        9090/TCP            12d
service/prometheus-operator-alertmanager                ClusterIP   10.98.42.209     <none>        9093/TCP            6d19h
service/prometheus-operator-grafana                     ClusterIP   10.103.100.150   <none>        80/TCP              6d19h
service/prometheus-operator-kube-state-metrics          ClusterIP   10.110.76.250    <none>        8080/TCP            6d19h
service/prometheus-operator-operator                    ClusterIP   None             <none>        8080/TCP            6d19h
service/prometheus-operator-prometheus                  ClusterIP   10.111.24.83     <none>        9090/TCP            6d19h
service/prometheus-operator-prometheus-node-exporter    ClusterIP   10.97.126.74     <none>        9100/TCP            6d19h

NAME                                                                                AGE
servicemonitor.monitoring.coreos.com/ceph-exporter                                  1d
servicemonitor.monitoring.coreos.com/prometheus-operator                            8d
servicemonitor.monitoring.coreos.com/prometheus-operator-alertmanager               6d
servicemonitor.monitoring.coreos.com/prometheus-operator-apiserver                  6d
servicemonitor.monitoring.coreos.com/prometheus-operator-coredns                    6d
servicemonitor.monitoring.coreos.com/prometheus-operator-kube-controller-manager    6d
servicemonitor.monitoring.coreos.com/prometheus-operator-kube-etcd                  6d
servicemonitor.monitoring.coreos.com/prometheus-operator-kube-scheduler             6d
servicemonitor.monitoring.coreos.com/prometheus-operator-kube-state-metrics         6d
servicemonitor.monitoring.coreos.com/prometheus-operator-kubelet                    6d
servicemonitor.monitoring.coreos.com/prometheus-operator-node-exporter              6d
servicemonitor.monitoring.coreos.com/prometheus-operator-operator                   6d
servicemonitor.monitoring.coreos.com/prometheus-operator-prometheus                 6d

NAME                                                     ENDPOINTS                                                                AGE
endpoints/alertmanager-operated                          10.244.6.174:9093,10.244.6.174:6783                                      12d
endpoints/ceph-exporter                                  10.244.2.59:9128                                                         46h
endpoints/monitoring-mysql-mysql                         10.244.6.171:3306                                                        42d
endpoints/prometheus-operated                            10.244.2.60:9090,10.244.6.175:9090                                       12d
endpoints/prometheus-operator-alertmanager               10.244.6.174:9093                                                        6d19h
endpoints/prometheus-operator-grafana                    10.244.6.106:3000                                                        6d19h
endpoints/prometheus-operator-kube-state-metrics         10.244.2.163:8080                                                        6d19h
endpoints/prometheus-operator-operator                   10.244.6.113:8080                                                        6d19h
endpoints/prometheus-operator-prometheus                 10.244.2.60:9090,10.244.6.175:9090                                       6d19h
endpoints/prometheus-operator-prometheus-node-exporter   192.168.105.92:9100,192.168.105.93:9100,192.168.105.94:9100 + more...    6d19h
```

4. Adding Grafana dashboards
The _dashboards directory in the prometheus-operator chart above contains my modified dashboards, which are fairly comprehensive. They can be imported manually in the Grafana UI and then modified at will, which is very convenient in daily use. If you instead put the dashboard JSON files in the chart's dashboards directory and install them with Helm, the installed dashboards cannot be modified directly in Grafana, which is troublesome in practice.
5. Adding Alertmanager alerts
To add an alert, create a PrometheusRule resource. Here is an example that fires when no ceph-exporter target is up:
```
[root@lab1 ceph-exporter]# kubectl get prometheusrule -n monitoring ceph-exporter -o yaml
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  creationTimestamp: 2018-10-30T06:51:12Z
  generation: 1
  labels:
    app: prometheus
    chart: ceph-exporter-0.1.0
    heritage: Tiller
    prometheus: ceph-exporter
    release: ceph-exporter
  name: ceph-exporter
  namespace: monitoring
  resourceVersion: "13965150"
  selfLink: /apis/monitoring.coreos.com/v1/namespaces/monitoring/prometheusrules/ceph-exporter
  uid: 30543ec9-dc10-11e8-bcf3-000c293d66a5
spec:
  groups:
  - name: ceph-exporter.rules
    rules:
    - alert: Ceph
      annotations:
        description: There is no running ceph exporter.
        summary: Ceph exporter is down
      expr: absent(up{job="ceph-exporter"} == 1)
      for: 5m
      labels:
        severity: critical
```
The default rules for monitoring Kubernetes are already very comprehensive, and you can adjust prometheus-operator/templates/all-prometheus-rules.yaml on your own.
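For orientation, the stock rules in that file follow the usual kubernetes-mixin shape. The snippet below is a representative example reconstructed from the upstream mixin, not copied from this deployment, so exact thresholds and annotation wording may differ from your chart version (and inside the Helm template the {{ }} braces would be escaped):

```yaml
- alert: KubePodCrashLooping
  # fires when a container has restarted within the last 15 minutes, sustained for 1h
  expr: rate(kube_pod_container_status_restarts_total{job="kube-state-metrics"}[15m]) * 60 * 5 > 0
  for: 1h
  labels:
    severity: critical
  annotations:
    message: Pod {{ $labels.namespace }}/{{ $labels.pod }} ({{ $labels.container }}) is restarting {{ printf "%.2f" $value }} times / 5 minutes.
```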
Alert routing and notification settings can be modified in values.yaml, in the following section under the alertmanager: key:
```yaml
config:
  global:
    resolve_timeout: 5m
    # The smarthost and SMTP sender used for mail notifications.
    smtp_smarthost: 'smtp.163.com:25'
    smtp_from: 'xxxxxx@163.com'
    smtp_auth_username: 'xxxxxx@163.com'
    smtp_auth_password: 'xxxxxx'
    # The API URL to use for Slack notifications.
    slack_api_url: 'https://hooks.slack.com/services/some/api/token'
  route:
    group_by: ["job", "alertname"]
    group_wait: 30s
    group_interval: 5m
    repeat_interval: 12h
    receiver: 'noemail'
    routes:
    - match:
        severity: critical
      receiver: critical_email_alert
    - match_re:
        alertname: "^KubeJob.*"
      receiver: default_email
  receivers:
  - name: 'default_email'
    email_configs:
    - to: 'xxxxxx@163.com'
      send_resolved: true
  - name: 'critical_email_alert'
    email_configs:
    - to: 'xxxxxx@163.com'
      send_resolved: true
  - name: 'noemail'
    email_configs:
    - to: 'null@null.cn'
      send_resolved: false

## Alertmanager template files to format alerts
## ref: https://prometheus.io/docs/alerting/notifications/
##      https://prometheus.io/docs/alerting/notification_examples/
templateFiles:
  template_1.tmpl: |-
    {{ define "cluster" }}{{ .ExternalURL | reReplaceAll ".*alertmanager\\.(.*)" "$1" }}{{ end }}
    {{ define "slack.k8s.text" }}
    {{- $root := . -}}
    {{ range .Alerts }}
      *Alert:* {{ .Annotations.summary }} - `{{ .Labels.severity }}`
      *Cluster:* {{ template "cluster" $root }}
      *Description:* {{ .Annotations.description }}
      *Graph:* <{{ .GeneratorURL }}|:chart_with_upwards_trend:>
      *Runbook:* <{{ .Annotations.runbook }}|:spiral_note_pad:>
      *Details:*
      {{ range .Labels.SortedPairs }} • *{{ .Name }}:* `{{ .Value }}`
      {{ end }}
    {{ end }}
    {{ end }}
```

6. Summary
By defining ServiceMonitor and PrometheusRule resources, the Prometheus Operator can dynamically adjust the configuration of Prometheus and Alertmanager. This is much more in line with Kubernetes's way of operating, and it makes Kubernetes monitoring more elegant.