Prometheus

Download: https://github.com/prometheus/prometheus/releases

    wget https://github.com/prometheus/prometheus/releases/download/v2.32.1/prometheus-2.32.1.linux-amd64.tar.gz
    tar zxvf prometheus-*.linux-amd64.tar.gz -C /usr/local
    mv /usr/local/prometheus-*.linux-amd64 /usr/local/prometheus
    cat >/etc/systemd/system/prometheus.service << EOF
    [Unit]
    Description=Prometheus
    Documentation=https://prometheus.io/
    After=network.target
    [Service]
    # Setting Type=notify makes the service restart continuously
    Type=simple
    User=prometheus
    # --storage.tsdb.path is optional; by default data is stored in ./data under the working directory
    ExecStart=/usr/local/prometheus/prometheus --config.file=/usr/local/prometheus/prometheus.yml \
        --storage.tsdb.path=/home/data/prometheus
    Restart=on-failure
    [Install]
    WantedBy=multi-user.target
    EOF
    groupadd prometheus
    useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus
    mkdir -p /home/data/prometheus  # data disk
    chown -R prometheus:prometheus /usr/local/prometheus /home/data/prometheus
    systemctl daemon-reload
    systemctl start prometheus && systemctl enable prometheus
    systemctl status prometheus
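The version number appears in both the download URL and the archive name, so it can help to derive the URL from a single value. A minimal sketch, assuming the release asset naming shown above stays the same; `prometheus_release_url` is a hypothetical helper name, not something Prometheus ships.

```shell
# Hypothetical helper: build the release tarball URL for a given version.
# Assumes the asset naming used above: prometheus-<version>.<arch>.tar.gz
prometheus_release_url() {
    local version="$1" arch="${2:-linux-amd64}"
    printf 'https://github.com/prometheus/prometheus/releases/download/v%s/prometheus-%s.%s.tar.gz\n' \
        "$version" "$version" "$arch"
}
```

Usage would look like `wget "$(prometheus_release_url 2.32.1)"`, which makes a later version bump a one-argument change.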

Basic configuration

    # Global settings
    global:
      scrape_interval: 15s      # how often targets are scraped; default is 1m
      evaluation_interval: 15s  # how often rules are evaluated; default is 1m
      scrape_timeout: 10s       # per-scrape timeout; default is 10s

    scrape_configs:
      - job_name: 'linux'
        static_configs:
          - targets: ['192.168.0.119:9100']
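Because YAML indentation errors are easy to introduce, it may be worth validating the file before restarting the service. The sketch below writes the minimal configuration from these notes to a path of your choice and checks it with `promtool check config`. The function names `write_minimal_config` and `check_config` are hypothetical, and the sketch assumes `promtool` (shipped in the same release tarball) is on `PATH`; if it is not, validation is skipped.

```shell
# Hypothetical helper: write a minimal prometheus.yml to the given path.
# The values are the ones used in these notes.
write_minimal_config() {
    cat > "$1" << 'EOF'
global:
  scrape_interval: 15s
  evaluation_interval: 15s
scrape_configs:
  - job_name: 'linux'
    static_configs:
      - targets: ['192.168.0.119:9100']
EOF
}

# Hypothetical helper: validate a config with promtool when it is available.
check_config() {
    if command -v promtool >/dev/null 2>&1; then
        promtool check config "$1"
    else
        echo "promtool not found; skipping validation of $1" >&2
    fi
}
```

For example: `write_minimal_config /tmp/prometheus.yml && check_config /tmp/prometheus.yml`.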

Reloading the configuration

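    # Option 1: send SIGHUP to the Prometheus process
    kill -HUP "$(pgrep -x prometheus)"
    # Option 2: call the reload endpoint; this requires adding
    # --web.enable-lifecycle to ExecStart in prometheus.service
    curl -X POST http://127.0.0.1:9090/-/reload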
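The two reload methods can be combined so that a script works whether or not the lifecycle endpoint is enabled. This is only a sketch: `reload_prometheus` is a hypothetical name, the default URL is the one from these notes, and the fallback assumes the process is named exactly `prometheus` and that the caller is allowed to signal it.

```shell
# Hypothetical helper: ask Prometheus to reload its configuration.
# Tries the HTTP endpoint first (needs --web.enable-lifecycle),
# then falls back to sending SIGHUP to the process.
reload_prometheus() {
    local base_url="${1:-http://127.0.0.1:9090}" pid
    if curl -fsS -X POST --max-time 5 "$base_url/-/reload" >/dev/null 2>&1; then
        echo "reloaded via HTTP"
        return 0
    fi
    pid="$(pgrep -x prometheus 2>/dev/null | head -n 1)"
    if [ -n "$pid" ] && kill -HUP "$pid" 2>/dev/null; then
        echo "reloaded via SIGHUP (pid $pid)"
        return 0
    fi
    echo "could not reload Prometheus" >&2
    return 1
}
```

Trying HTTP first is a design choice: the endpoint returns an error when the new configuration is invalid, whereas a signal gives no feedback to the caller.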

node-exporter

Download: https://github.com/prometheus/node_exporter/releases

Collects host-level system metrics for monitoring server CPU, memory, disk, I/O, and similar data.

Binary installation

    wget https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz
    tar zxvf node_exporter-*.linux-amd64.tar.gz
    mv node_exporter-*.linux-amd64/node_exporter /usr/local/bin/
    cat > /etc/systemd/system/node_exporter.service << EOF
    [Unit]
    Description=node_exporter
    Documentation=https://prometheus.io/
    After=network.target
    [Service]
    Type=simple
    User=prometheus
    ExecStart=/usr/local/bin/node_exporter
    Restart=on-failure
    [Install]
    WantedBy=multi-user.target
    EOF
    # Skip the next two commands if the prometheus user already exists on this host
    groupadd prometheus
    useradd -g prometheus -m -d /var/lib/prometheus -s /sbin/nologin prometheus
    systemctl daemon-reload
    systemctl start node_exporter && systemctl enable node_exporter

Running with Docker

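    # With --net=host the exporter listens on host port 9100 directly, so -p is not needed.
    # The --path.* flags point the exporter at the mounted host filesystems.
    docker run -d \
      --net="host" \
      -v "/proc:/host/proc:ro" \
      -v "/sys:/host/sys:ro" \
      -v "/:/rootfs:ro" \
      prom/node-exporter \
      --path.procfs=/host/proc \
      --path.sysfs=/host/sys \
      --path.rootfs=/rootfs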

Metrics endpoint

By default, Node Exporter exposes its metrics at http://IP:9100/metrics
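The endpoint returns plain text, one sample per line, with `# HELP` and `# TYPE` comment lines in between, so a quick check from the shell is possible. The filter below is a small sketch for pulling out one metric by name; `metric_samples` is a hypothetical helper, and `node_load1` in the usage line is just one example of a metric the exporter reports.

```shell
# Hypothetical helper: print every sample of one metric from
# exposition-format text on stdin. Comment lines are skipped;
# labels, if present, are kept in the output.
metric_samples() {
    awk -v name="$1" '
        /^#/ { next }
        {
            metric = $1
            sub(/[{].*/, "", metric)   # drop the {label="..."} part before comparing
            if (metric == name) print
        }'
}
```

Typical use: `curl -s http://IP:9100/metrics | metric_samples node_load1`.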

Grafana dashboards

https://grafana.com/grafana/dashboards/

Alerting rules

https://awesome-prometheus-alerts.grep.to

https://www.cnblogs.com/heian99/p/15257897.html

Consul service discovery

    docker run --name consul -d -p 8500:8500 consul

Registering an instance with Consul

    curl -X PUT -d '{
      "id": "host-121",
      "name": "node-exporter",
      "address": "192.168.0.120",
      "port": 9100,
      "tags": ["linux"],
      "meta": {
        "group": "kong",
        "environment": "Pro",
        "project": "API_Platform"
      },
      "checks": [
        {
          "http": "http://192.168.0.120:9100/metrics",
          "interval": "5s"
        }
      ]
    }' http://127.0.0.1:8500/v1/agent/service/register
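When several hosts need registering, generating the payload from a few arguments avoids hand-editing JSON. The following is a sketch only: `consul_service_json` is a hypothetical helper, it emits just the fields needed for discovery and the health check (no `tags` or `meta`), and it does no JSON escaping, so it suits plain identifiers and IP addresses but not arbitrary strings.

```shell
# Hypothetical helper: print a Consul service-registration payload
# with an HTTP health check against the exporter's /metrics endpoint.
# Arguments: service id, service name, address, port.
# Values are inserted as-is (no JSON escaping).
consul_service_json() {
    local id="$1" name="$2" address="$3" port="$4"
    cat << EOF
{
  "id": "$id",
  "name": "$name",
  "address": "$address",
  "port": $port,
  "checks": [
    { "http": "http://$address:$port/metrics", "interval": "5s" }
  ]
}
EOF
}
```

The output can be piped straight to the registration endpoint: `consul_service_json host-121 node-exporter 192.168.0.120 9100 | curl -X PUT -d @- http://127.0.0.1:8500/v1/agent/service/register`.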

Deregistering from Consul

    # Replace <service-id> with the id used at registration, e.g. host-121
    curl -X PUT http://127.0.0.1:8500/v1/agent/service/deregister/<service-id>

Prometheus configuration

    scrape_configs:
      # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
      - job_name: 'consul-prometheus'
        consul_sd_configs:                 # Consul-based service discovery
          - server: 192.168.0.119:8500     # Consul address
            #token: 8dc1eb67-1f5f-4e10-ad9d-5e58b047647c  # custom token
            refresh_interval: 10s          # refresh interval
            services: ['node-exporter']
        relabel_configs:                   # map the discovered metadata onto custom labels
          - source_labels: [__meta_consul_service_address]
            target_label: 'ipaddress'
          - source_labels: [__meta_consul_service_id]
            target_label: 'instance'
          - source_labels: [__meta_consul_service_metadata_group]
            target_label: 'group'
          - source_labels: [__meta_consul_service_metadata_environment]
            target_label: 'environment'
          - source_labels: [__meta_consul_service_metadata_project]
            target_label: 'project'
          - source_labels: [__meta_consul_service]
            target_label: 'service'
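The `__meta_consul_service_metadata_*` source labels above come from the keys of the `meta` object sent at registration. A small sketch of that naming rule follows; `consul_meta_label` is a hypothetical helper, and the replacement of unsupported characters with underscores is my understanding of how Prometheus sanitizes label names, so treat it as an assumption and confirm against the targets page.

```shell
# Hypothetical helper: print the Prometheus label under which a Consul
# "meta" key is exposed during service discovery.
# Assumption: characters outside [a-zA-Z0-9_] become underscores.
consul_meta_label() {
    printf '__meta_consul_service_metadata_%s\n' \
        "$(printf '%s' "$1" | tr -c 'a-zA-Z0-9_' '_')"
}
```

For the registration shown earlier, `consul_meta_label group` gives the label used as the source for the `group` target label.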

Result

Grafana template:
Node Exporter Full -自定义群组-1642944683091.json