Grafana alerting

Tools used: prometheus, grafana, prometheus_client_php

Expose the monitoring metrics through the prometheus_client_php client. The output below reports a length of 90 for the order_notify queue.

# HELP payment_queue_length it sets
# TYPE payment_queue_length gauge
payment_queue_length{name="order_notify"} 90
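
To make the text format above concrete, here is a minimal stdlib-only Python sketch that renders the same three lines. It is not how prometheus_client_php is implemented; the function names `render_gauge` and `escape_label_value` are hypothetical, and only the metric name, help text, label and value come from the output above.

```python
from typing import Mapping


def escape_label_value(value: str) -> str:
    # The text exposition format escapes backslash, double quote and newline
    # inside label values.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples: Mapping[str, float]) -> str:
    """Render one gauge family; `samples` maps the `name` label to its value."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for label, value in samples.items():
        lines.append(f'{name}{{name="{escape_label_value(label)}"}} {value:g}')
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints the same three lines as the sample output above.
    print(render_gauge("payment_queue_length", "it sets", {"order_notify": 90}), end="")
```

Serving this string from an HTTP endpoint with a plain-text content type is what lets Prometheus scrape it.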

Prometheus scrapes the metric.

Configure the email alerting rule (a notification is sent once every 5 minutes).

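Configure the alert policy and link it to the email alerting rule. The rule is evaluated every 30 seconds, and once the condition is met the notification is delayed by 1 minute. Make sure the For parameter and the time range of the query in Conditions are configured sensibly; they are usually set to the same value. Also set the alert state for the no data case, so that an alert is not raised just because no sample was collected at the current point in time.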
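
The interplay between the evaluation interval, the For delay and the no data setting can be sketched as a small state machine. This is a simplified model written for illustration, not Grafana's actual implementation: the function `alert_states`, its parameters and the state names are assumptions, and the rule is assumed to move from pending to alerting once the condition has held for at least the For duration.

```python
from typing import List, Optional


def alert_states(samples: List[Optional[float]], threshold: float,
                 interval_s: int, for_s: int,
                 no_data_state: str = "no_data") -> List[str]:
    """Return the rule state after each evaluation.

    samples: one value per evaluation; None means nothing was collected.
    no_data_state: state to report on None, or "keep_last_state".
    """
    states: List[str] = []
    breaching = 0  # consecutive evaluations above the threshold
    for value in samples:
        if value is None:
            if no_data_state == "keep_last_state":
                states.append(states[-1] if states else "ok")
            else:
                breaching = 0
                states.append(no_data_state)
            continue
        if value > threshold:
            breaching += 1
            held_for = (breaching - 1) * interval_s  # time since first breach
            states.append("alerting" if held_for >= for_s else "pending")
        else:
            breaching = 0
            states.append("ok")
    return states


if __name__ == "__main__":
    # Threshold 100, evaluated every 30 s, For = 60 s.
    print(alert_states([90, 120, 120, 120, None, 90], 100, 30, 60))
```

In this model a missing sample never raises an alert unless `no_data_state` is set to `"alerting"`, which is the situation the no data setting is meant to control.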

Display the metric in Grafana (the panel in the screenshot has an alert rule that fires when the value exceeds 100).

Manually change the queue length to 120 to trigger the alert.

The alert email arrives.

The alert clears.

Email alerting configuration

[smtp]
enabled = true
host = smtp.exmail.qq.com:465
user = system@exmail.com
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
password = ***********
;cert_file =
;key_file =
skip_verify = false
from_address = system@exmail.com
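
Before relying on Grafana to deliver mail, it can help to check the same SMTP account independently. The following is a minimal sketch using Python's standard library; the helper names, the subject line and the recipient are hypothetical, and the use of `smtplib.SMTP_SSL` assumes that port 465 on this server expects implicit TLS.

```python
import smtplib
import ssl
from email.message import EmailMessage


def build_test_message(from_address: str, to_address: str) -> EmailMessage:
    """Build a plain-text test mail; the subject and body are illustrative."""
    msg = EmailMessage()
    msg["From"] = from_address
    msg["To"] = to_address
    msg["Subject"] = "Grafana SMTP test"
    msg.set_content("This is a test message for the SMTP settings.")
    return msg


def send_test_message(host: str, port: int, user: str, password: str,
                      msg: EmailMessage) -> None:
    # Certificates are verified, matching skip_verify = false in the config.
    context = ssl.create_default_context()
    with smtplib.SMTP_SSL(host, port, context=context, timeout=10) as server:
        server.login(user, password)
        server.send_message(msg)


if __name__ == "__main__":
    message = build_test_message("system@exmail.com", "ops@example.com")
    print(message["Subject"])
    # Uncomment and fill in the real password to actually send:
    # send_test_message("smtp.exmail.qq.com", 465, "system@exmail.com", "...", message)
```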

DingTalk alerting API

curl 'https://oapi.dingtalk.com/robot/send?access_token=762627b8d3fdfe3951dc***733e9e59ff59***7515c3' \
  -H 'Content-Type: application/json' \
  -d '{"msgtype": "text",
       "text": {
         "content": "业务报警测试"
       }
      }'
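
The same request can be sent from a script. Below is a minimal Python sketch of the call using only the standard library; the function names `build_request` and `send` are hypothetical, the access token must be supplied by the caller, and the shape of the response is assumed to be JSON without being inspected further.

```python
import json
import urllib.request

DINGTALK_URL = "https://oapi.dingtalk.com/robot/send"


def build_request(access_token: str, content: str) -> urllib.request.Request:
    """Build the POST request for a text message, mirroring the curl call."""
    payload = {"msgtype": "text", "text": {"content": content}}
    return urllib.request.Request(
        f"{DINGTALK_URL}?access_token={access_token}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(access_token: str, content: str) -> dict:
    # Performs the network call; the response body is assumed to be JSON.
    with urllib.request.urlopen(build_request(access_token, content), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    req = build_request("YOUR_ACCESS_TOKEN", "业务报警测试")
    print(req.get_method(), req.full_url)
```

Keeping request construction separate from sending makes it possible to inspect the payload without contacting DingTalk.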

LDAP configuration

config: |-
  [[servers]]
  host = "10.0.0.1"
  port = 389
  use_ssl = false
  start_tls = false
  ssl_skip_verify = false
  bind_dn = "uid=auth,ou=users,dc=apple,dc=com"
  bind_password = "******"
  search_filter = "(uid=%s)"
  group_search_filter = "(&(objectClass=inetOrgPerson)(uid=%s))"
  search_base_dns = ["ou=users,dc=apple,dc=com"]
  [servers.attributes]
  name = "givenName"
  surname = "sn"
  username = "uid"
  email = "mail"