Prometheus basic concept usage record

2025-03-30 Update From: SLTechnology News&Howtos

Prometheus Basic Concepts

Prometheus is an open-source monitoring and alerting framework.

All monitoring data collected by Prometheus is stored in its built-in time series database (TSDB) as metrics: streams of timestamped values that share the same metric name and label set. Besides stored time series, Prometheus can also produce temporary, derived time series as the result of a query.

Features:

* Powerful multi-dimensional data model
* Flexible query language
* Easy to manage
* Efficient
* Uses pull mode to collect time series data
* Multiple visualization options and graphical interfaces
* Easy to scale

Prometheus composition and architecture:

* Prometheus server: mainly responsible for collecting and storing data, and provides PromQL query language support. It is a time series database that stores the collected monitoring data as time series on local disk.
* Pushgateway: an intermediate gateway that lets short-lived jobs actively push their metrics.
* PromDash: a dashboard developed with Rails for visualizing metric data.
* Exporters: an exporter is an HTTP endpoint that reports machine health or exposes information about a monitored component.
  * Direct collection: the target has built-in Prometheus support and exposes a metrics endpoint directly to Prometheus.
  * Indirect collection: the target has no native Prometheus support, so a collection program is written for it using the Prometheus client libraries.
* Alertmanager: after receiving alerts from the Prometheus server, it deduplicates them, groups them, routes them to the configured receiver, and sends the notification. Common receivers are email, PagerDuty, OpsGenie, webhook, etc.
* Web UI: provides a graphical interface on port 9090.
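To make the exporter role above concrete, here is a minimal sketch of an HTTP endpoint that serves one gauge in the Prometheus text exposition format. It uses only the Python standard library rather than an official client library; the metric name `demo_load1`, the port, and the helper names are assumptions made for illustration, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(load1):
    """Render one gauge in the Prometheus text exposition format."""
    lines = [
        "# HELP demo_load1 1-minute load average (hypothetical metric).",
        "# TYPE demo_load1 gauge",
        f"demo_load1 {load1}",
    ]
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics, the path Prometheus scrapes by default."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(os.getloadavg()[0]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving metrics on the given (assumed) port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()


print(render_metrics(0.5), end="")
```

Calling `serve()` starts the endpoint; a Prometheus server would then need a scrape target pointing at that host and port. A real exporter would normally use a Prometheus client library instead of formatting the text by hand.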

Basic working principle

1. The Prometheus server periodically pulls metrics from configured jobs or exporters, receives metrics pushed through the Pushgateway, or pulls metrics from other Prometheus servers.
2. The Prometheus server stores the collected metrics locally and runs the defined rules (alert.rules) to record new time series or push alerts to Alertmanager.
3. Alertmanager processes the received alerts according to its configuration file and sends out notifications.
4. The collected data is visualized in the graphical interface.

Basic concepts:

Data model: Prometheus stores time series, each uniquely identified by its metric name and a set of labels (key-value pairs). A different label set identifies a different time series.

Samples: the actual data points of a time series, each consisting of a float64 value and a millisecond-precision timestamp (metric + timestamp + sample value).

Metric name: carries the meaning of what is measured. For example, http_requests_total indicates the total number of HTTP requests. A metric name consists of ASCII letters, digits, underscores, and colons, and must match the regular expression [a-zA-Z_:][a-zA-Z0-9_:]*.

Label: gives a time series its different dimensions. For example, http_requests_total{method="GET"} represents the GET requests among all HTTP requests. With method="POST" it is a new time series. A label name consists of ASCII letters, digits, and underscores, and must match the regular expression [a-zA-Z_][a-zA-Z0-9_]*.

Format: <metric name>{<label name>=<label value>, ...}, e.g. http_requests_total{method="POST",endpoint="/api/tracks"}.
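The naming rules and the series notation can be checked with a few lines of Python. This is only a sketch: the regular expressions are the ones given in the Prometheus documentation (metric names may contain colons, label names may not), while the helper name `series_id` is an assumption made for this example.

```python
import re

# Metric names may contain colons; label names may not.
METRIC_NAME_RE = re.compile(r"[a-zA-Z_:][a-zA-Z0-9_:]*")
LABEL_NAME_RE = re.compile(r"[a-zA-Z_][a-zA-Z0-9_]*")


def series_id(metric, labels):
    """Return the text form of a series, validating every name first."""
    if not METRIC_NAME_RE.fullmatch(metric):
        raise ValueError(f"invalid metric name: {metric!r}")
    for name in labels:
        if not LABEL_NAME_RE.fullmatch(name):
            raise ValueError(f"invalid label name: {name!r}")
    # Sort labels so the same label set always yields the same identifier.
    body = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{metric}{{{body}}}" if labels else metric


get_series = series_id("http_requests_total", {"method": "GET"})
post_series = series_id("http_requests_total", {"method": "POST"})
print(get_series)                 # http_requests_total{method="GET"}
print(get_series != post_series)  # True: a different label value is a different series
```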

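Metric types

Counter: a cumulative metric whose value only increases (or resets to zero on restart).

Gauge: a metric whose value can go up and down.

Histogram: samples observations into configurable buckets and also provides the count and sum of the observations.

Summary: samples observations and provides their count and sum along with quantiles calculated on the client side.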
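The behavior of the first three types can be sketched in plain Python. The classes below are illustrative stand-ins, not the API of any Prometheus client library, and the bucket bounds are made-up values. A summary is left out because its client-side quantile estimation is more involved.

```python
import bisect


class Counter:
    """Cumulative value: it only goes up."""

    def __init__(self):
        self.value = 0.0

    def inc(self, amount=1.0):
        if amount < 0:
            raise ValueError("a counter cannot decrease")
        self.value += amount


class Gauge:
    """Point-in-time value that may go up or down."""

    def __init__(self):
        self.value = 0.0

    def set(self, value):
        self.value = value


class Histogram:
    """Counts observations per bucket and tracks their sum."""

    def __init__(self, upper_bounds):
        self.upper_bounds = sorted(upper_bounds) + [float("inf")]
        self.counts = [0] * len(self.upper_bounds)
        self.total = 0.0

    def observe(self, value):
        # Index of the first bucket whose upper bound is >= value.
        index = bisect.bisect_left(self.upper_bounds, value)
        self.counts[index] += 1
        self.total += value

    def cumulative(self):
        """Return (upper bound, cumulative count) pairs, as buckets are exposed."""
        running, out = 0, []
        for bound, count in zip(self.upper_bounds, self.counts):
            running += count
            out.append((bound, running))
        return out


h = Histogram([0.1, 0.5, 1.0])
for seconds in (0.05, 0.3, 0.3, 2.0):
    h.observe(seconds)
print(h.cumulative())  # [(0.1, 1), (0.5, 3), (1.0, 3), (inf, 4)]
```

Each bucket count includes all smaller buckets, which is why the last (infinite) bucket always equals the total number of observations.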

PromQL queries

Data types

* Instant vector: a set of time series, each containing a single sample, all at the same timestamp.
* Range vector: a set of time series, each containing the samples over a time range.
* Scalar: a single floating-point value.
* String: a simple string value.

Time series selectors

Instant vector selector: e.g. http_requests_total selects every series with that metric name; a set of label matchers inside {} filters the series further.

Label matching operators:

* = : select labels that are exactly equal to the provided string.
* != : select labels that are not equal to the provided string.
* =~ : select labels that match the provided regular expression. The match is anchored, so the expression must match the whole label value.
* !~ : select labels that do not match the provided regular expression.

Range vector selector: e.g. http_requests_total{job="prometheus"}[5m]; the duration in [] specifies how far back in time samples are fetched.

Time units: s (seconds), m (minutes), h (hours), d (days), w (weeks), y (years).

Offset modifier: instant and range vector expressions use the current time as the reference by default; offset shifts that reference into the past, e.g. http_requests_total offset 5m. The offset keyword must immediately follow the selector ({}).
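The four label matchers can be modeled in a few lines. The sketch below represents each series as a plain dict of labels; the function names are hypothetical. Python's `re` module stands in for the RE2 syntax Prometheus actually uses, which is close enough for simple patterns, and `re.fullmatch` mirrors the anchored matching.

```python
import re


def matches(labels, name, op, value):
    """Evaluate one label matcher against a label set."""
    actual = labels.get(name, "")  # a missing label behaves like an empty value
    if op == "=":
        return actual == value
    if op == "!=":
        return actual != value
    if op == "=~":
        return re.fullmatch(value, actual) is not None
    if op == "!~":
        return re.fullmatch(value, actual) is None
    raise ValueError(f"unknown operator: {op}")


def select(series, matchers):
    """Keep the series that satisfy every (name, op, value) matcher."""
    return [s for s in series if all(matches(s, *m) for m in matchers)]


series = [
    {"__name__": "http_requests_total", "job": "prometheus", "method": "GET"},
    {"__name__": "http_requests_total", "job": "prometheus", "method": "POST"},
    {"__name__": "http_requests_total", "job": "node", "method": "GET"},
]

selected = select(series, [("job", "=", "prometheus"), ("method", "=~", "GET|PUT")])
print(len(selected))  # 1
# The regex is anchored: "GE" alone does not match the value "GET".
print(len(select(series, [("method", "=~", "GE")])))  # 0
```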

Operators

Arithmetic binary operators, e.g. + (addition), - (subtraction), * (multiplication), / (division)

Comparison operators, e.g. ==, !=, >, <, >=, <=

Set operators: and, or, unless

Matching patterns: one-to-one, and many-to-one / one-to-many

Aggregation operations

Syntax: <aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]

Only count_values, quantile, topk, and bottomk take a parameter.

* sum (sum)
* min (minimum)
* max (maximum)
* avg (mean)
* stddev (standard deviation)
* stdvar (standard variance)
* count (number of elements)
* count_values (number of elements with the same value)
* bottomk (the k elements with the smallest sample values)
* topk (the k elements with the largest sample values)
* quantile (quantile of the distribution)

without removes the listed labels from the result while keeping all others. by does the opposite: it keeps only the listed labels in the result vector and drops the rest. Both let you aggregate samples along chosen label dimensions.
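The difference between by and without is easiest to see on a small instant vector. The following sketch implements only sum, over a vector modeled as a list of (labels, value) pairs; the function name `sum_agg` and the sample data are assumptions made for illustration. The metric name is left out of the label sets because aggregation drops it.

```python
from collections import defaultdict


def sum_agg(vector, by=None, without=None):
    """Sum an instant vector, grouping by the labels that survive."""
    if (by is None) == (without is None):
        raise ValueError("pass exactly one of by= or without=")
    groups = defaultdict(float)
    for labels, value in vector:
        if by is not None:
            kept = {k: v for k, v in labels.items() if k in by}
        else:
            kept = {k: v for k, v in labels.items() if k not in without}
        # A sorted tuple of label pairs is a hashable key for the group.
        groups[tuple(sorted(kept.items()))] += value
    return dict(groups)


vector = [
    ({"job": "api", "instance": "a", "method": "GET"}, 10.0),
    ({"job": "api", "instance": "b", "method": "GET"}, 5.0),
    ({"job": "api", "instance": "a", "method": "POST"}, 2.0),
]

# Roughly: sum by (method) (...)
print(sum_agg(vector, by=["method"]))
# Roughly: sum without (instance) (...)
print(sum_agg(vector, without=["instance"]))
```

With by, only `method` remains on the result; with without, both `job` and `method` remain and only `instance` is collapsed.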

Jobs and instances

To collect different monitoring metrics, we run the corresponding collection program (exporter) and tell the Prometheus server the address of each exporter instance. Each HTTP endpoint that exposes samples for scraping is called an instance; a running node exporter, for example, is one instance.

A set of instances with the same collection purpose, or multiple replicas of the same collection process, is grouped into a job.

* job: node
  * instance 2: 1.2.3.4:9100
  * instance 4: 5.6.7.8:9100

HTTP API response format

Instant query, e.g. 'http://localhost:9090/api/v1/query?query=up&time=2015-07-01T20:10:51.781Z'

URL request parameters:

* query=: PromQL expression.
* time=: the timestamp at which the PromQL expression is evaluated. Optional; defaults to the current system time.
* timeout=: evaluation timeout. Optional; defaults to the global --query.timeout setting.

Range query, e.g. 'http://localhost:9090/api/v1/query_range?query=up&start=2015-07-01T20:10:30.781Z&end=2015-07-01T20:11:00.781Z&step=15s'

URL request parameters:

* query=: PromQL expression.
* start=: start time.
* end=: end time.
* step=: query step size.
* timeout=: evaluation timeout. Optional; defaults to the global --query.timeout setting.
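Building these URLs by hand is error-prone because the parameter values must be URL-encoded. The sketch below assembles both request URLs with the standard library; the helper names are assumptions made for this example, and it only constructs the URL without sending any request.

```python
from urllib.parse import urlencode

BASE = "http://localhost:9090"  # address used in the examples above


def instant_query_url(query, time=None, timeout=None, base=BASE):
    """Build an /api/v1/query URL; time and timeout are optional."""
    params = {"query": query}
    if time is not None:
        params["time"] = time
    if timeout is not None:
        params["timeout"] = timeout
    return f"{base}/api/v1/query?{urlencode(params)}"


def range_query_url(query, start, end, step, timeout=None, base=BASE):
    """Build an /api/v1/query_range URL; only timeout is optional."""
    params = {"query": query, "start": start, "end": end, "step": step}
    if timeout is not None:
        params["timeout"] = timeout
    return f"{base}/api/v1/query_range?{urlencode(params)}"


print(instant_query_url("up", time="2015-07-01T20:10:51.781Z"))
print(range_query_url("up", "2015-07-01T20:10:30.781Z",
                      "2015-07-01T20:11:00.781Z", "15s"))
```

The colons in the timestamps come out percent-encoded (`%3A`), which the server decodes back to the original value.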

Prometheus alert

Alarm rule definition

Alarm Name: Custom Name.

Alarm rules: define the alarm trigger conditions as PromQL expressions. They are defined in a configuration file:

groups:
- name: example
  rules:
  - alert: HighErrorRate
    expr: job:request_latency_seconds:mean5m{job="myjob"} > 0.5
    for: 10m
    labels:
      severity: page
    annotations:
      summary: High request latency
      description: description info

# groups: defines a set of related rules
# alert: alert rule name
# expr: trigger condition, a PromQL expression
# for: how long the condition must hold before the alert fires
# labels: custom labels attached to the alert
# annotations: a set of additional information, such as summary and description
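The effect of the for clause can be sketched as a small state machine: while the expression is true but has not yet held for the configured duration, the alert is pending, and it starts firing once it has held long enough. The function below is a simplified model under that assumption, not Prometheus's actual implementation; the function name and the shortened durations (a 60-second evaluation interval and a 120-second hold instead of 10m) are chosen only for the example.

```python
def alert_states(condition_by_eval, eval_interval_s, for_s):
    """Return 'inactive', 'pending' or 'firing' for each rule evaluation."""
    states = []
    active_since = None  # time at which the condition first became true
    for i, condition in enumerate(condition_by_eval):
        now = i * eval_interval_s
        if not condition:
            # Any false evaluation resets the hold timer.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held = now - active_since
        states.append("firing" if held >= for_s else "pending")
    return states


print(alert_states([True, True, True, False, True], eval_interval_s=60, for_s=120))
# ['pending', 'pending', 'firing', 'inactive', 'pending']
```

A single false evaluation resets the timer, so a flapping condition never reaches the firing state.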

Alertmanager features

* Grouping: alerts of a similar nature can be combined into a single notification.
* Inhibition: while one alert is firing, notifications for other alerts caused by it can be suppressed.
* Silences: mute matching alerts for a given period of time.

Install and start Alertmanager

wget https://github.com/prometheus/alertmanager/releases/download/v0.15.3/alertmanager-0.15.3.linux-amd64.tar.gz
tar -zxvf alertmanager-0.15.3.linux-amd64.tar.gz
cd alertmanager-0.15.3.linux-amd64/
./alertmanager

Introduction to the alertmanager.yml configuration file

global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'web.hook'
receivers:
- name: 'web.hook'
  webhook_configs:
  - url: 'http://127.0.0.1:5001/'
inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'warning'
  equal: ['alertname', 'dev', 'instance']

The configuration centers on routes and receivers. All alerts enter the routing tree at the top-level route in the configuration and are sent to the corresponding receivers according to the routing rules.

* global: defines global common parameters, such as the global SMTP configuration, Slack configuration, etc.
* templates: define the templates used for alert notifications, such as HTML templates, mail templates, etc.
* route: determines how an alert is handled, based on label matching.
* receivers: a receiver is an abstract concept; it can be a mailbox, WeChat, Slack, a webhook, etc. Receivers are generally used together with routes.
* inhibit_rules: well-chosen inhibition rules reduce the number of noisy alerts.
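How the inhibit rule above plays out can be sketched as follows: a firing alert matching source_match mutes any alert matching target_match, provided both carry the same values for every label listed in equal. This is a simplified model rather than Alertmanager's code; the function names and sample alerts are assumptions, and a label missing on both sides is treated as equal, following the documented behavior for empty values.

```python
# Mirrors the inhibit rule from the configuration shown above.
RULE = {
    "source_match": {"severity": "critical"},
    "target_match": {"severity": "warning"},
    "equal": ["alertname", "dev", "instance"],
}


def _matches(alert, wanted):
    """True if the alert carries every wanted label with the wanted value."""
    return all(alert.get(name) == value for name, value in wanted.items())


def is_inhibited(target, firing, rule=RULE):
    """Return True if some firing source alert mutes `target`."""
    if not _matches(target, rule["target_match"]):
        return False
    for source in firing:
        if source is target or not _matches(source, rule["source_match"]):
            continue
        # Missing labels compare as empty strings, so they count as equal.
        if all(source.get(l, "") == target.get(l, "") for l in rule["equal"]):
            return True
    return False


critical = {"alertname": "NodeDown", "instance": "1.2.3.4:9100", "severity": "critical"}
warning_same = {"alertname": "NodeDown", "instance": "1.2.3.4:9100", "severity": "warning"}
warning_other = {"alertname": "NodeDown", "instance": "5.6.7.8:9100", "severity": "warning"}
firing = [critical, warning_same, warning_other]

print(is_inhibited(warning_same, firing))   # True: same alertname and instance
print(is_inhibited(warning_other, firing))  # False: different instance
```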

Restart prometheus

killall -9 prometheus
nohup prometheus &

Prometheus installation

Install Prometheus Server

wget https://github.com/prometheus/prometheus/releases/download/v2.6.0/prometheus-2.6.0.linux-amd64.tar.gz
tar -zxvf prometheus-2.6.0.linux-amd64.tar.gz
cd prometheus-2.6.0.linux-amd64
./prometheus &
ln -s /root/prometheus/prometheus-2.6.0.linux-amd64/prometheus /usr/local/bin/prometheus

Set up start at boot

cat >> /usr/lib/systemd/system/multi-user.target.wants/prometheus.service
