Links: Git repository, DingTalk, reference 1, reference 2, email
https://blog.csdn.net/weixin_44953658/article/details/112520122
https://www.cnblogs.com/xuwujing/p/14065740.html

Working example: email

Test environment: Windows, alertmanager-0.21.0.windows-amd64.tar.gz
Address: 127.0.0.1:9093

  • Download, extract, and configure Alertmanager
    • Alertmanager.yaml

```yaml
global:
  resolve_timeout: 5m
  smtp_from: '1445763190@qq.com'
  smtp_smarthost: 'smtp.qq.com:465'
  smtp_auth_username: '1445763190@qq.com'
  smtp_auth_password: '2342342342342'
  smtp_require_tls: false
route:
  group_by: ['instance']
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 3h
  receiver: email
  routes:
    - match:
        severity: critical
      receiver: pager
    - match_re:
        severity: ^(warning|critical)$
      receiver: support_team
receivers:
  - name: 'email'
    email_configs:
      - to: '1445763190@qq.com'
  - name: 'support_team'
    email_configs:
      - to: '1445763190@qq.com'
  - name: 'pager'
    email_configs:
      - to: '1445763190@qq.com'
```
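As a reading aid for the route tree in this config, the sketch below mirrors how an alert's `severity` label would select a receiver: child routes are tried in order, the first match wins, and anything unmatched falls back to the top-level `receiver`. The function name `receiver_for_severity` is hypothetical, and this is only a simplified mental model (it ignores grouping, timing, and `continue`), not Alertmanager's own logic.

```shell
# Simplified model of the route tree above; $1 is the alert's severity label.
receiver_for_severity() {
  case "$1" in
    critical) echo "pager" ;;         # first child route: match severity=critical
    warning)  echo "support_team" ;;  # second child route: match_re ^(warning|critical)$
    *)        echo "email" ;;         # no child route matches: top-level receiver
  esac
}
```

Under this model, an alert that carries no `severity` label at all ends up at the default `email` receiver.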
  • Configure Prometheus
    • prometheus.yaml

```yaml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            # Alertmanager address
            - '192.168.0.51:9093'

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'appointment'
    scrape_interval: 10s
    metrics_path: '/appointment/actuator/prometheus'
    static_configs:
      - targets: ['localhost:9002']
        labels:
          instance: smartAppointment
```
  • first_rules.yml

```yaml
groups:
  - name: node
    rules:
      - alert: server_status
        expr: up{job="appointment"} == 0
        for: 15s
        annotations:
          summary: "Instance {{ $labels.instance }} is down"
          description: "Alert reported. Please check immediately!"
```
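To check the email path end to end without stopping the `appointment` service, one option is to push a hand-made alert into Alertmanager. The sketch below only builds the JSON body. The function name `build_test_alert`, the alert name `TestAlert`, and the use of `curl` against the `/api/v2/alerts` endpoint are assumptions for illustration and were not part of the original setup.

```shell
# Build a minimal alert payload for Alertmanager's v2 API.
# $1 = alertname (hypothetical), $2 = severity label, $3 = instance label.
build_test_alert() {
  bta_name="${1:-TestAlert}"
  bta_severity="${2:-critical}"
  bta_instance="${3:-smartAppointment}"
  printf '[{"labels":{"alertname":"%s","severity":"%s","instance":"%s"},"annotations":{"summary":"manual test alert"}}]\n' \
    "$bta_name" "$bta_severity" "$bta_instance"
}

# Example (not executed here): send the payload to the Alertmanager used in this note.
# build_test_alert TestAlert critical smartAppointment | \
#   curl -s -XPOST -H 'Content-Type: application/json' -d @- http://127.0.0.1:9093/api/v2/alerts
```

If the alert shows up in the Alertmanager UI but no mail arrives, the SMTP settings are the more likely culprit than the Prometheus rule.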

In testing: DingTalk

Test environment: alertmanager-0.21.0.linux-amd64.tar.gz on CentOS Linux release 7.6.1810 (Core)

  • Architecture: x86_64
  • Intel(R) Xeon(R)
  • Download, extract, and configure Alertmanager
  • Alertmanager.yaml for DingTalk

```yaml
global:
  resolve_timeout: 5m
route:
  receiver: webhook
  group_wait: 30s
  group_interval: 5m
  repeat_interval: 4h
  group_by: [alertname]
  routes:
    - receiver: webhook
      group_wait: 10s
      match:
        team: node
receivers:
  - name: webhook
    webhook_configs:
```
  • Configure Prometheus the same way as in the email example above
  • Enable start on boot ([reference](https://www.cnblogs.com/Wshile/p/12938377.html))

```shell
# Create the systemd unit file
cat > /usr/lib/systemd/system/alertmanager.service <<'EOF'
[Unit]
Description=alertmanager
Documentation=https://prometheus.io/
After=network.target

[Service]
Type=simple
User=prometheus
# Paths to the alertmanager binary and its config file
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

chown -R prometheus:prometheus /usr/lib/systemd/system/alertmanager.service

# Reload systemd and enable start on boot
systemctl daemon-reload
systemctl enable alertmanager

# Start
systemctl start alertmanager
# Restart
systemctl restart alertmanager
# Check that port 9093 is listening
netstat -anpt | grep 9093
```
  • Check Prometheus
  • DingTalk configuration

    • Command log

```shell
# Download
wget https://github.com/timonwong/prometheus-webhook-dingtalk/releases/download/v1.4.0/prometheus-webhook-dingtalk-1.4.0.linux-amd64.tar.gz
# Extract
tar -xvf prometheus-webhook-dingtalk-1.4.0.linux-amd64.tar.gz
# Rename
mv prometheus-webhook-dingtalk-1.4.0.linux-amd64 prometheus-webhook-dingtalk
```
    • Configuration file

```yaml
## Request timeout
timeout: 5s

## Customizable templates path
templates:
  - /config/example.tmpl

targets:
  webhook1: &target_base
    url: https://oapi.dingtalk.com/robot/send?access_token=xxxx
    message:
      title: '{{ template "example.title" . }}'
      text: '{{ template "example.content" . }}'
  webhook2:
    <<: *target_base
    mention:
      # The mobile numbers to mention must be declared here...
      mobiles: ["my phone number"]
    message:
      text: |
        @my phone number
        {{ template "example.content" . }}
```
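Each entry under `targets:` is exposed by prometheus-webhook-dingtalk as its own HTTP endpoint, and that endpoint is what an Alertmanager `webhook_configs` entry would point at. The helper below is a small sketch of that mapping. The function name `dingtalk_webhook_url` is hypothetical, and the default address `localhost:8060` assumes the tool runs on the same host with its default listen port, which this note does not confirm.

```shell
# Print the URL Alertmanager would call for a given prometheus-webhook-dingtalk target.
# $1 = target name from the `targets:` section (for example webhook1)
# $2 = host:port of prometheus-webhook-dingtalk (assumed default: localhost:8060)
dingtalk_webhook_url() {
  dwu_target="$1"
  dwu_addr="${2:-localhost:8060}"
  printf 'http://%s/dingtalk/%s/send\n' "$dwu_addr" "$dwu_target"
}
```

For example, `dingtalk_webhook_url webhook1` prints the address for the `webhook1` target; pass a second argument such as `192.168.0.51:8060` if the service runs on another host.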

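Notes

  • When configuring the alertmanagers targets in prometheus.yaml, it is best to write them as IP + port (for example '192.168.0.51:9093').
  • For the email settings, refer to the alert email configuration.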