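Background:

Prometheus is an open-source monitoring and alerting system with a built-in time series database (TSDB), originally developed at SoundCloud.
Official Prometheus download page: https://prometheus.io/download/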

Installation

  1. tar xf prometheus-2.28.1.linux-amd64.tar.gz
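
The tar command above expects the release tarball to be in the current directory. A minimal sketch of fetching it first, assuming `wget` is available and that the release is hosted under the GitHub releases URL pattern shown below (this pattern is an assumption not stated in this post; the download page linked above lists the authoritative links). The helper names are hypothetical.

```shell
# Version and platform used throughout this post.
PROM_VERSION="2.28.1"
PROM_PLATFORM="linux-amd64"

# Print the tarball file name for a version ($1) and platform ($2).
prom_tarball() {
    echo "prometheus-${1}.${2}.tar.gz"
}

# Print the (assumed) GitHub release URL for a version and platform.
prom_url() {
    echo "https://github.com/prometheus/prometheus/releases/download/v${1}/$(prom_tarball "$1" "$2")"
}

# Download and unpack the release; nothing runs until this is called.
fetch_prometheus() {
    wget "$(prom_url "$PROM_VERSION" "$PROM_PLATFORM")" &&
        tar xf "$(prom_tarball "$PROM_VERSION" "$PROM_PLATFORM")"
}
```

Calling `fetch_prometheus` leaves the unpacked prometheus-2.28.1.linux-amd64 directory in the current working directory.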

Running

cd prometheus-2.28.1.linux-amd64
./prometheus  --config.file=prometheus.yml

Then open http://<server-IP>:9090 in a browser to verify that Prometheus was installed successfully; the Prometheus web UI should load (make sure port 9090 is open).
Select a metric from the drop-down list and click "Execute" to see Prometheus's own performance metrics.
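
The same check can be done from a terminal without a browser. A minimal sketch, assuming `curl` is installed and Prometheus listens on port 9090; it uses the server's `/-/healthy` management endpoint, and the helper name `prometheus_healthy` is hypothetical.

```shell
# Return success if Prometheus answers its health endpoint.
#   $1 = base URL (default http://localhost:9090)
prometheus_healthy() {
    base=${1:-http://localhost:9090}
    # -f makes curl fail on HTTP errors; --max-time avoids hanging
    # when the port is filtered by a firewall.
    curl -fsS --max-time 5 "${base%/}/-/healthy" >/dev/null
}

# Example:
#   prometheus_healthy "http://<server-IP>:9090" && echo "Prometheus is up"
```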

Set up Prometheus as a system service and enable it at boot

# Create the user the service runs as
groupadd prometheus
useradd -g prometheus -m -d /opt/prometheus/ -s /sbin/nologin prometheus
touch /usr/lib/systemd/system/prometheus.service
chown prometheus:prometheus /usr/lib/systemd/system/prometheus.service
vim /usr/lib/systemd/system/prometheus.service

Write the following configuration into prometheus.service:

[Unit]
Description=Prometheus
Wants=network-online.target
After=network-online.target

[Service]
User=prometheus
Group=prometheus
Type=simple
ExecStart=/opt/prometheus/prometheus \
--config.file=/opt/prometheus/prometheus.yml \
--storage.tsdb.path=/opt/prometheus/data

[Install]
WantedBy=multi-user.target
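
The unit file above starts /opt/prometheus/prometheus, while the earlier steps only unpack the tarball into prometheus-2.28.1.linux-amd64. This post does not show how the files get to /opt/prometheus, so the following is only a sketch of one way to do it. The helper name `install_release` and its optional owner argument are hypothetical.

```shell
# Copy an unpacked release into the directory the unit file points at.
#   $1 = unpacked release directory
#   $2 = install directory
#   $3 = optional owner (user:group); ownership is left alone if omitted
install_release() {
    src=$1
    dest=$2
    owner=${3:-}
    # Create the install directory together with the TSDB data directory.
    mkdir -p "$dest/data" || return 1
    # "src/." copies the directory contents rather than the directory itself.
    cp -r "$src"/. "$dest"/ || return 1
    if [ -n "$owner" ]; then
        chown -R "$owner" "$dest" || return 1
    fi
}

# Example (run as root):
#   install_release prometheus-2.28.1.linux-amd64 /opt/prometheus prometheus:prometheus
```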

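Prometheus startup flags

  • --config.file: path to the Prometheus configuration file
  • --web.enable-lifecycle: enables the HTTP endpoints that let a running server reload its configuration
  • --storage.tsdb.path: directory where the monitoring data is stored
  • --storage.tsdb.retention: how long data is kept (deprecated in newer 2.x releases in favor of --storage.tsdb.retention.time)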
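
To see how these flags fit together, here is a sketch that prints a complete command line. The helper `prometheus_cmd` is hypothetical, the 15d retention is only an illustrative value (it happens to match the Prometheus default), and the sketch uses --storage.tsdb.retention.time, the newer spelling of the retention flag.

```shell
# Print a full Prometheus command line.
#   $1 = install directory (default /opt/prometheus)
#   $2 = retention period  (default 15d, illustrative)
prometheus_cmd() {
    dir=${1:-/opt/prometheus}
    retention=${2:-15d}
    printf '%s' "$dir/prometheus"
    # printf repeats the format once per argument, so each flag is
    # emitted with a leading space.
    printf ' %s' \
        "--config.file=$dir/prometheus.yml" \
        "--storage.tsdb.path=$dir/data" \
        "--storage.tsdb.retention.time=$retention" \
        "--web.enable-lifecycle"
    printf '\n'
}

# Example:
#   prometheus_cmd /opt/prometheus 30d
```

The printed line can be pasted after `ExecStart=` in a unit file or run directly in a terminal.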

    Enable at boot

    systemctl daemon-reload
    systemctl enable prometheus.service
    systemctl restart prometheus.service
    systemctl status prometheus.service

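    Note: since 2.0, Prometheus no longer enables the HTTP reload endpoint by default, so a configuration change only takes effect after the server is restarted (or sent a SIGHUP). Restarting the monitoring server for every change is not acceptable in production, so the hot-reload feature should be turned on with --web.enable-lifecycle.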
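
    With the lifecycle flag enabled, a reload is triggered by an HTTP POST to the server's `/-/reload` endpoint. A minimal sketch, assuming `curl` is installed and Prometheus listens on localhost:9090; the helper name `reload_prometheus` is hypothetical.

```shell
# Ask a running Prometheus to reload its configuration over HTTP.
#   $1 = base URL (default http://localhost:9090)
# The server must have been started with --web.enable-lifecycle,
# otherwise the request is rejected and this function returns non-zero.
reload_prometheus() {
    base=${1:-http://localhost:9090}
    curl -fsS --max-time 5 -X POST "${base%/}/-/reload"
}

# Example:
#   reload_prometheus && echo "configuration reloaded"
```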

prometheus.yml

# my global config
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
  - static_configs:
    - targets:
      # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
    - targets: ['localhost:9090']

  - job_name: 'Linux'
    static_configs:
    - targets: ['localhost:9100']
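
A syntax error in prometheus.yml will stop the server from starting, so it can help to validate the file first. The release tarball ships a `promtool` binary whose `check config` subcommand does this. The wrapper below is a hedged sketch; the helper name `check_config` and the default path `./promtool` are assumptions.

```shell
# Validate a Prometheus config file with promtool.
#   $1 = config file
#   $2 = promtool binary (default ./promtool)
# Returns 2 if the config file does not exist, otherwise promtool's
# own exit status.
check_config() {
    cfg=$1
    promtool=${2:-./promtool}
    if [ ! -f "$cfg" ]; then
        echo "config not found: $cfg" >&2
        return 2
    fi
    "$promtool" check config "$cfg"
}

# Example:
#   check_config /opt/prometheus/prometheus.yml /opt/prometheus/promtool
```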

node_exporter

Installation

tar xf node_exporter-1.1.2.linux-amd64.tar.gz

Running

cd node_exporter-1.1.2.linux-amd64
./node_exporter


Verify that node_exporter was installed successfully

curl 127.0.0.1:9100


curl 127.0.0.1:9100/metrics

This returns a long list of performance metrics.
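
Because the output is long, it is often easier to pull out a single metric. A small sketch that reads the text format on standard input and prints the value of one label-less metric; the helper name `metric_value` is hypothetical, and `node_load1` (the 1-minute load average) is used only as an example.

```shell
# Read Prometheus text-format metrics on stdin and print the value of
# the first sample whose name equals $1 (samples without labels only).
metric_value() {
    # Comment lines start with "#", so their first field never matches.
    awk -v name="$1" '$1 == name { print $2; exit }'
}

# Example against a running node_exporter:
#   curl -s 127.0.0.1:9100/metrics | metric_value node_load1
```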

Set up the system service

# Create the user the service runs as
groupadd node_exporter
useradd -g node_exporter -m -d /opt/node_exporter/ -s /sbin/nologin node_exporter

# Configure the node_exporter system service
touch /usr/lib/systemd/system/node_exporter.service
chown node_exporter:node_exporter /usr/lib/systemd/system/node_exporter.service
chown -R node_exporter:node_exporter /opt/node_exporter*
vim /usr/lib/systemd/system/node_exporter.service

Add the following to node_exporter.service:

[Unit]
Description=node_exporter
Wants=network-online.target
After=network-online.target

[Service]
User=node_exporter
Group=node_exporter
Type=simple
ExecStart=/opt/node_exporter/node_exporter
Restart=on-failure

[Install]
WantedBy=multi-user.target

Start the node_exporter service and enable it at boot

systemctl daemon-reload
systemctl enable node_exporter.service
systemctl start node_exporter.service
systemctl status node_exporter.service

# Other management commands, for reference
systemctl restart node_exporter.service
systemctl stop node_exporter.service
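
Once both services are running, Prometheus should report the node_exporter target as up. A hedged sketch of checking this through the Prometheus HTTP query API, assuming `curl` is installed, Prometheus listens on localhost:9090, and the job is named 'Linux' as in the prometheus.yml shown earlier; the helper name `query_up` is hypothetical.

```shell
# Query the "up" metric for one scrape job via the Prometheus HTTP API.
#   $1 = base URL (default http://localhost:9090)
#   $2 = job name (default Linux)
query_up() {
    base=${1:-http://localhost:9090}
    job=${2:-Linux}
    # -G sends the urlencoded data as a query string on a GET request.
    curl -fsS --max-time 5 -G "${base%/}/api/v1/query" \
        --data-urlencode "query=up{job=\"$job\"}"
}

# Example:
#   query_up http://localhost:9090 Linux
```

In the JSON response, a sample value of "1" means the last scrape of that target succeeded and "0" means it failed.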

References:
https://www.cnblogs.com/miaocbin/p/12009974.html
https://blog.51cto.com/youerning/2050543