Prometheus Alertmanager WeChat alert configuration

Updated: 2025-01-20. Source: SLTechnology News&Howtos, Shulou (Shulou.com), 06/02.

Preparation:

Collect the following from the Enterprise WeChat (WeChat Work) admin console:

The Enterprise WeChat application secret (api_secret)

The enterprise ID (corp_id)

wechat_api_url: the Enterprise WeChat API endpoint, https://qyapi.weixin.qq.com/cgi-bin/

wechat_api_secret: the application secret ("Enterprise Application" --> "Custom Application" [Prometheus] --> "Secret"). Prometheus is the name of the self-created application.

wechat_api_corp_id: the enterprise ID ("My Enterprise" --> "CorpID", at the bottom of the page)

to_party: the ID of the recipient department (group), 1 in this example. You can customize the recipients or groups for alarm messages through the link (https://work.weixin.qq.com/ap...).

agent_id: the application's AgentId ("Enterprise Application" --> "Custom Application" [Prometheus] --> "AgentId"). Prometheus is the name of the self-created application.
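Before wiring these values into Alertmanager, it can help to confirm that the corp ID and secret are accepted. The following is a minimal sketch in Python, assuming the standard gettoken endpoint under the API URL listed above; the function names token_url and check_credentials and the placeholder credentials are illustrative, not part of the original setup.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://qyapi.weixin.qq.com/cgi-bin/"  # wechat_api_url from the list above


def token_url(corp_id: str, api_secret: str, api_url: str = API_URL) -> str:
    """Build the gettoken URL that exchanges corp_id + secret for an access token."""
    query = urllib.parse.urlencode({"corpid": corp_id, "corpsecret": api_secret})
    return urllib.parse.urljoin(api_url, "gettoken") + "?" + query


def check_credentials(corp_id: str, api_secret: str) -> bool:
    """Return True if the API answers with errcode 0. Requires network access."""
    with urllib.request.urlopen(token_url(corp_id, api_secret), timeout=10) as resp:
        body = json.load(resp)
    return body.get("errcode") == 0


# Placeholder values (hypothetical); substitute your own corp_id and secret.
print(token_url("my-corp-id", "my-secret"))
```

Calling check_credentials with real values performs the network request; the print line above only shows the URL that would be used.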

If Prometheus and Alertmanager use separate configuration files (not a Helm installation)

Alertmanager configuration in Prometheus:

The alerting block sits at the same level as global.

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      - localhost:9093

Add the rules file to the Prometheus configuration file:

rule_files:
  - "/usr/local/prometheus/rules.yml"

Prometheus rules configuration

Create the rules.yml file and add alert rules as needed:

groups:
- name: prometheus_go_goroutines
  rules:
  - alert: go_goroutines_numbers
    expr: go_goroutines > 45
    for: 15s
    annotations:
      summary: "Prometheus goroutine count exceeds 45!"
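To illustrate what the for clause does in the rule above, here is a small Python sketch of a simplified model: an alert is pending while the expression has been true for less than the hold time, and firing once it has stayed true for at least that long. The 5-second evaluation interval and the sample values are assumptions made for the example, and alert_states is a hypothetical helper, not a Prometheus API.

```python
from typing import Iterable, List, Tuple


def alert_states(samples: Iterable[Tuple[float, float]],
                 threshold: float = 45,
                 hold_seconds: float = 15) -> List[Tuple[float, str]]:
    """Classify each (timestamp, value) evaluation as inactive, pending or firing."""
    states = []
    active_since = None  # time at which the expression first became true
    for ts, value in samples:
        if value > threshold:
            if active_since is None:
                active_since = ts
            state = "firing" if ts - active_since >= hold_seconds else "pending"
        else:
            active_since = None  # condition cleared, so the timer resets
            state = "inactive"
        states.append((ts, state))
    return states


# go_goroutines sampled every 5 seconds (assumed evaluation interval)
samples = [(0, 40), (5, 50), (10, 50), (15, 50), (20, 50), (25, 30), (30, 50)]
for ts, state in alert_states(samples):
    print(ts, state)
```

In this model the alert turns firing at t=20, which is 15 seconds after the value first exceeded 45, and drops back to inactive as soon as the value falls below the threshold.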

Prometheus Alertmanager configuration

In the Alertmanager configuration file, add the WeChat settings:

global:
  resolve_timeout: 2m
  wechat_api_url: 'https://qyapi.weixin.qq.com/cgi-bin/'
  wechat_api_secret: 'xxx'
  wechat_api_corp_id: 'xxx'

route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 10s
  repeat_interval: 1h
  receiver: 'wechat'

receivers:
- name: 'wechat'
  wechat_configs:
  - send_resolved: true
    to_party: '1'
    agent_id: '1000002'
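One way to check the receiver without waiting for a real rule to fire is to push a hand-made alert into Alertmanager. The sketch below, in Python, builds a payload in the shape accepted by the Alertmanager v2 alerts endpoint and posts it to the localhost:9093 target used earlier. The alert name wechat_test, the severity label and the helper names are assumptions for illustration.

```python
import json
import urllib.request
from datetime import datetime, timezone

ALERTMANAGER = "http://localhost:9093"  # same target as in the Prometheus alerting block


def build_test_alert(name: str = "wechat_test", summary: str = "test alert") -> list:
    """Return a one-element alert list in the shape the v2 alerts endpoint accepts."""
    return [{
        "labels": {"alertname": name, "severity": "warning"},
        "annotations": {"summary": summary},
        "startsAt": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }]


def post_alerts(alerts: list, base_url: str = ALERTMANAGER) -> int:
    """POST the alerts to Alertmanager and return the HTTP status code."""
    req = urllib.request.Request(
        base_url + "/api/v2/alerts",
        data=json.dumps(alerts).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return resp.status


print(json.dumps(build_test_alert(), indent=2))
```

Calling post_alerts(build_test_alert()) against a running Alertmanager should hand the alert to the wechat receiver, since it is the default route; the first message is delayed by group_wait.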

If you installed with Helm, the Prometheus and Alertmanager configuration lives in a single values file.

vim prometheus-operator-custom.yaml  # edit the configuration

Under the alertmanager key, set the following:

alertmanager:
  config:
    global:
      # an alert is declared resolved if it has not been updated for this long
      resolve_timeout: 3m
    templates:
    - '/etc/alertmanager/config/template_wechat.tmpl'
    route:
      # alerts that share these labels are grouped into one notification
      group_by: ['wechat_alert']
      # how long to wait before sending the first notification for a new group,
      # so that more alerts can be batched into it
      group_wait: 15s
      # how long to wait before notifying about new alerts added to a group
      # that has already been notified
      group_interval: 15s
      # how long to wait before re-sending a notification that was already sent successfully
      repeat_interval: 3m
      receiver: 'wechat'
      routes:
      - receiver: 'wechat'
        continue: true
    receivers:
    - name: 'wechat'
      wechat_configs:
      # whether to send a notification when the alert is resolved
      - send_resolved: true
        # Enterprise WeChat corp ID
        corp_id: 'XXX'
        # Enterprise WeChat application secret
        api_secret: 'XXX'
        # several usernames can be given
        # to_user: '@all'
        # department ID; it is shown in the pop-up at the lower right after clicking the department
        to_party: '92'
        agent_id: '1000010'

  # template format; times are UTC by default, so the template adds 8 hours (28800e9 ns) to show UTC+8
  templateFiles:
    template_wechat.tmpl: |-
      {{ define "wechat.default.message" }}
      {{ range .Alerts }}
      =====start=====
      Alarm program: k8s_prometheus_alert
      Alarm level: {{ .Labels.alarm }}
      Alarm type: {{ .Labels.alertname }}
      Failed host: {{ .Labels.name }}
      Alarm threshold: {{ .Annotations.value }}
      Alarm topic: {{ .Annotations.summary }}
      Trigger time: {{ (.StartsAt.Add 28800e9).Format "2006-01-02 15:04:05" }}
      =====end=====
      {{ end }}
      {{ end }}
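The 28800e9 in the template is a duration in nanoseconds, which works out to 8 hours. As a quick check of that arithmetic, the Python sketch below applies the same shift to a UTC timestamp and formats it the way the Go layout "2006-01-02 15:04:05" does; the sample timestamp and the helper name trigger_time are made up for the example.

```python
from datetime import datetime, timedelta, timezone

OFFSET_NS = 28800e9  # nanoseconds, the value passed to .StartsAt.Add in the template


def trigger_time(starts_at_utc: datetime, offset_ns: float = OFFSET_NS) -> str:
    """Shift a UTC timestamp by offset_ns and format it as YYYY-MM-DD HH:MM:SS."""
    shifted = starts_at_utc + timedelta(seconds=offset_ns / 1e9)
    return shifted.strftime("%Y-%m-%d %H:%M:%S")


starts_at = datetime(2025, 1, 20, 18, 30, 0, tzinfo=timezone.utc)
print(OFFSET_NS / 1e9 / 3600)   # 8.0 (hours)
print(trigger_time(starts_at))  # 2025-01-21 02:30:00
```

For a time zone other than UTC+8, the constant in the template would need to change accordingly.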

Alert rule file:

A Helm installation merges values files, so the alert rules can live in a file of their own that is passed as one additional file when loading.

vim rules-custom.yaml  # edit the rules file

additionalPrometheusRules:
- name: cpu1
  groups:
  - name: cpu load
    rules:
    - alert: pod cpu is more than 1%
      expr: (sum by (name) (rate(container_cpu_usage_seconds_total{image!=""}[5m])) * 100) > 30
      for: 1m
      labels:
        severity: critical
      annotations:
        value: "{{ $value }}"
        description: "The configuration of the instances of the Alertmanager cluster `{{ $labels.service }}` are out of sync."
        summary: "this is the first test OK for the first group"
    # - alert: pod memcache is more than 1%
    #   expr: (sum by (name) (rate(container_cpu_usage_seconds_total{image!=""}[5m])) * 100) > 5
    #   for: 5m
    #   labels:
    #     severity: critical
    #   annotations:
    #     description: An unexpected number of Alertmanagers are scraped or Alertmanagers disappeared from discovery.
    #     summary: "this is the second test for the first group"
- name: cpu2
  groups:
  - name: node load
    rules:
    - alert: the other group pod exceeds 1%
      expr: (sum by (name) (rate(container_cpu_usage_seconds_total{image!=""}[5m])) * 100) > 30
      for: 1m
      labels:
        severity: critical
      annotations:
        value: "{{ $value }}"
        summary: "this is the test ok for the second group"
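The expressions in these rules multiply a per-second rate of container_cpu_usage_seconds_total by 100, which gives CPU usage as a percentage of one core. The Python sketch below works through that arithmetic with two counter samples taken 5 minutes apart. It is a simplification: the real rate() function also handles counter resets and extrapolation, and the sample values and helper names are invented for illustration.

```python
def simple_rate(first: float, last: float, seconds: float) -> float:
    """Average per-second increase of a counter between two samples (no extrapolation)."""
    return (last - first) / seconds


def cpu_percent(first: float, last: float, seconds: float = 300) -> float:
    """CPU seconds consumed per second, expressed as a percentage of one core."""
    return simple_rate(first, last, seconds) * 100


# Counter grew from 1000 to 1150 CPU-seconds over the 5-minute window
busy = cpu_percent(1000.0, 1150.0)
print(busy, busy > 30)   # 50.0 True

# Counter grew from 1000 to 1075 CPU-seconds over the same window
quiet = cpu_percent(1000.0, 1075.0)
print(quiet, quiet > 30)  # 25.0 False
```

Under this reading, the threshold of 30 in the rules corresponds to a container averaging 0.3 of a core over the window.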

Finally, when upgrading the release, pass the rules file as one more values file. Both files are loaded in the same command:

helm upgrade monitoring stable/prometheus-operator --version=5.0.3 --namespace=monitoring -f prometheus-operator-custom.yaml -f rules-custom.yaml
