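Installation

Notes:

  • Set the admin account and password through environment variables
    • GF_SECURITY_ADMIN_USER
    • GF_SECURITY_ADMIN_PASSWORD
  • Run the grafana process as root by setting securityContext
  • Mount the Grafana data directory on a persistent volume
  • Configure an ingress to expose the access entry point

Prerequisite:
The steps below assume that a storage-class backed by NFS is already available in the cluster.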

$ cat grafana-all.yaml
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: grafana
  namespace: monitor
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: course-nfs-storage
  resources:
    requests:
      storage: 200Gi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitor
spec:
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: grafana
      securityContext:
        runAsUser: 0
      containers:
        - name: grafana
          image: grafana/grafana:7.5.9
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 3000
              name: grafana
          env:
            - name: GF_SECURITY_ADMIN_USER
              value: admin
            - name: GF_SECURITY_ADMIN_PASSWORD
              value: admin
          readinessProbe:
            failureThreshold: 10
            httpGet:
              path: /api/health
              port: 3000
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 30
          livenessProbe:
            failureThreshold: 3
            httpGet:
              path: /api/health
              port: 3000
              scheme: HTTP
            periodSeconds: 10
            successThreshold: 1
            timeoutSeconds: 1
          resources:
            limits:
              cpu: 150m
              memory: 512Mi
            requests:
              cpu: 150m
              memory: 512Mi
          volumeMounts:
            - mountPath: /var/lib/grafana
              name: storage
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: monitor
spec:
  type: ClusterIP
  ports:
    - port: 3000
  selector:
    app: grafana
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: grafana
  namespace: monitor
spec:
  rules:
    - host: grafana.crab.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: grafana
                port:
                  number: 3000

$ kubectl apply -f grafana-all.yaml
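
After applying the manifest, it can help to confirm that the Deployment actually became ready and that the PVC, Service and Ingress exist. The following is a minimal sketch, not part of the original procedure: the function name check_grafana and the 180-second timeout are assumptions chosen for illustration, while the namespace and resource names come from the manifest above.

```shell
# Hypothetical helper: wait for the grafana Deployment to finish rolling out,
# then list the other objects created by grafana-all.yaml.
check_grafana() {
  # Blocks until the rollout completes or the (assumed) timeout expires.
  kubectl -n monitor rollout status deployment/grafana --timeout=180s &&
  # Shows the PVC, Service and Ingress in the monitor namespace.
  kubectl -n monitor get pvc,svc,ingress
}
```

Run check_grafana once after kubectl apply. If the rollout does not complete, the readiness probe settings (60-second initial delay) are a reasonable first thing to look at.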

Add a hosts entry for the domain (the IP can be that of any node), then open http://grafana.crab.com and log in with admin/admin.
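
The hosts entry and a quick reachability check can be sketched as below. This is an illustration under assumptions: the function names are hypothetical, 192.0.2.10 in the usage note is a placeholder address, and the health check assumes the ingress controller answers plain HTTP on port 80 of the node. The /api/health path is the same one used by the probes in the manifest.

```shell
# Print an /etc/hosts line that maps the ingress host to the given node IP.
hosts_entry() {
  printf '%s grafana.crab.com\n' "$1"
}

# Query Grafana's health endpoint through the ingress without touching
# /etc/hosts, by sending the Host header explicitly.
# Assumption: the ingress controller listens on port 80 of the node.
grafana_health() {
  curl -s -H 'Host: grafana.crab.com' "http://$1/api/health"
}
```

For example, hosts_entry 192.0.2.10 | sudo tee -a /etc/hosts appends the entry, and grafana_health 192.0.2.10 should return a small JSON document once Grafana is up.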

Configuration

Adding a data source

In the Grafana UI, click Configuration > Data Sources > Add data source > Prometheus.

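For the certificate settings, use the cluster's certificate information from ~/.kube/config: certificate-authority-data, client-certificate-data and client-key-data correspond to the CA certificate, the client certificate and the client private key, respectively.

The values in the config file are base64-encoded, so they must be base64-decoded before being filled in.

Online decoder: https://base64.supfree.net/

Decoding on the command line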

[root@master ~]# cat /root/.kube/config |grep 'certificate-authority-data'
certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EUXdNVEEzTkRJek9Gb1hEVE13TURNek1EQTNOREl6T0Zvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTG80CkxTVTVVN2FscUUxdDJBbzkzTDcxWTZxcFNHQ3RGOTV4MmxxYnRwejlPd1A3ZHVGa3h6QmpmSGhvSlNOd3NMNlEKd2Y5NzJjUVVSUFpNUDgwakNFblhiTkZqVndPMHVQMDZjQlllQ2NwQlQ4QnVyOUFRSW1LYk9SWEJZSldTRjFSRApHVzVid3A5N2JEUHdyNnh3U0FucjB0bEpuSjdlSWdaNkZRZmF4UE5tL25rQjJTSHl3WVVkaDIrNVNEaGZLQ1dVCllRNWxHbVZneCtLLzlodkl3OEFyQVIzSzBRNUg5ZGF0MjA4YVNZN3hCMkJJYzlDVjkrSDRyb0YvOTBWU3dZYkMKZ3hibVNxRm56Z0c1ZVBtWjVKT2pGbllTR24ySHpEZGxPVFdnY290Zkx0cnNWSUlLY3FFdzZ1Wkc0NFJ6RXBaSgpiVGR0WDUvdlcvQkpVejhodmxzQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFKQWc5bUswQzgrTHo0MHZzRHBPN21oRWRkODIKMHcxODRVL2Y1Z3pYNEtIVzFhVXV2MXhTL1dsNnpCY2JOY0RZQ3Z0eHlHY1ZZcnhRQitoT3NOVTFhbzEyYThzQwovN1VadjROUnlKN3I1czdqek43ODNST21NYzJzbUVDTCtja0k4ei9aMWw4eFJCSk5LTExPa1Jta3cwWVYzNStaClppQTRRTE5acU1hOTNTUFhzNXorZld6UHNxNE5XR2tvUzYyVDNRTlVqcitsdEpKM2Nmd0M3R2JnZ2JkVCttWE8KclVQVHhCL2ZHd0REVXdJb1BXOHZBL1J5T0NSUTBBTFB4MTBaNFFBTWtNaDUydndkMGxod2VVWkREUWJ2UEFkQgovbmt2S0NGcTBnRnF0emVXK1JCTEhHSTM0a2FjTlhjY3c1WlpuVHBKMjN1NHhHUW9VeGl6Vy9TUWs1RT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=

[root@master~]# a=LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUN5RENDQWJDZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01EUXdNVEEzTkRJek9Gb1hEVE13TURNek1EQTNOREl6T0Zvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTG80CkxTVTVVN2FscUUxdDJBbzkzTDcxWTZxcFNHQ3RGOTV4MmxxYnRwejlPd1A3ZHVGa3h6QmpmSGhvSlNOd3NMNlEKd2Y5NzJjUVVSUFpNUDgwakNFblhiTkZqVndPMHVQMDZjQlllQ2NwQlQ4QnVyOUFRSW1LYk9SWEJZSldTRjFSRApHVzVid3A5N2JEUHdyNnh3U0FucjB0bEpuSjdlSWdaNkZRZmF4UE5tL25rQjJTSHl3WVVkaDIrNVNEaGZLQ1dVCllRNWxHbVZneCtLLzlodkl3OEFyQVIzSzBRNUg5ZGF0MjA4YVNZN3hCMkJJYzlDVjkrSDRyb0YvOTBWU3dZYkMKZ3hibVNxRm56Z0c1ZVBtWjVKT2pGbllTR24ySHpEZGxPVFdnY290Zkx0cnNWSUlLY3FFdzZ1Wkc0NFJ6RXBaSgpiVGR0WDUvdlcvQkpVejhodmxzQ0F3RUFBYU1qTUNFd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0RRWUpLb1pJaHZjTkFRRUxCUUFEZ2dFQkFKQWc5bUswQzgrTHo0MHZzRHBPN21oRWRkODIKMHcxODRVL2Y1Z3pYNEtIVzFhVXV2MXhTL1dsNnpCY2JOY0RZQ3Z0eHlHY1ZZcnhRQitoT3NOVTFhbzEyYThzQwovN1VadjROUnlKN3I1czdqek43ODNST21NYzJzbUVDTCtja0k4ei9aMWw4eFJCSk5LTExPa1Jta3cwWVYzNStaClppQTRRTE5acU1hOTNTUFhzNXorZld6UHNxNE5XR2tvUzYyVDNRTlVqcitsdEpKM2Nmd0M3R2JnZ2JkVCttWE8KclVQVHhCL2ZHd0REVXdJb1BXOHZBL1J5T0NSUTBBTFB4MTBaNFFBTWtNaDUydndkMGxod2VVWkREUWJ2UEFkQgovbmt2S0NGcTBnRnF0emVXK1JCTEhHSTM0a2FjTlhjY3c1WlpuVHBKMjN1NHhHUW9VeGl6Vy9TUWs1RT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo=

[root@master~]# echo $a | base64 -d
-----BEGIN CERTIFICATE-----
MIICyDCCAbCgAwIBAgIBADANBgkqhkiG9w0BAQsFADAVMRMwEQYDVQQDEwprdWJl
cm5ldGVzMB4XDTIwMDQwMTA3NDIzOFoXDTMwMDMzMDA3NDIzOFowFTETMBEGA1UE
AxMKa3ViZXJuZXRlczCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBALo4
LSU5U7alqE1t2Ao93L71Y6qpSGCtF95x2lqbtpz9OwP7duFkxzBjfHhoJSNwsL6Q
wf972cQURPZMP80jCEnXbNFjVwO0uP06cBYeCcpBT8Bur9AQImKbORXBYJWSF1RD
GW5bwp97bDPwr6xwSAnr0tlJnJ7eIgZ6FQfaxPNm/nkB2SHywYUdh2+5SDhfKCWU
YQ5lGmVgx+K/9hvIw8ArAR3K0Q5H9dat208aSY7xB2BIc9CV9+H4roF/90VSwYbC
gxbmSqFnzgG5ePmZ5JOjFnYSGn2HzDdlOTWgcotfLtrsVIIKcqEw6uZG44RzEpZJ
bTdtX5/vW/BJUz8hvlsCAwEAAaMjMCEwDgYDVR0PAQH/BAQDAgKkMA8GA1UdEwEB
/wQFMAMBAf8wDQYJKoZIhvcNAQELBQADggEBAJAg9mK0C8+Lz40vsDpO7mhEdd82
0w184U/f5gzX4KHW1aUuv1xS/Wl6zBcbNcDYCvtxyGcVYrxQB+hOsNU1ao12a8sC
/7UZv4NRyJ7r5s7jzN783ROmMc2smECL+ckI8z/Z1l8xRBJNKLLOkRmkw0YV35+Z
ZiA4QLNZqMa93SPXs5z+fWzPsq4NWGkoS62T3QNUjr+ltJJ3cfwC7GbggbdT+mXO
rUPTxB/fGwDDUwIoPW8vA/RyOCRQ0ALPx10Z4QAMkMh52vwd0lhweUZDDQbvPAdB
/nkvKCFq0gFqtzeW+RBLHGI34kacNXccw5ZZnTpJ23u4xGQoUxizW/SQk5E=
-----END CERTIFICATE-----
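
The grep, assign and decode steps above can be folded into one small function that works for all three fields. This is a sketch under assumptions: the name decode_kubeconfig_field is hypothetical, each field is assumed to sit on a single line in the form "key: value", and only the first matching entry in the file is used.

```shell
# Hypothetical helper: print the decoded value of one kubeconfig field.
#   $1  path to the kubeconfig file
#   $2  field name, e.g. certificate-authority-data
decode_kubeconfig_field() {
  # awk prints the value of the first line whose key matches,
  # then base64 -d turns it back into PEM text.
  awk -v key="$2:" '$1 == key { print $2; exit }' "$1" | base64 -d
}
```

For example, decode_kubeconfig_field ~/.kube/config client-certificate-data prints the client certificate, and the same call with client-key-data prints the client private key.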

Click Save & Test to test the connection.

Importing dashboards

Node Exporter: https://grafana.com/grafana/dashboards/8919
Kubernetes: https://grafana.com/grafana/dashboards/13105

The Node Exporter dashboard is used as the example.
Click Manage > Import, enter 8919, and select the Prometheus data source. The imported dashboard is then displayed.

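Using plugins

Plugins are another way to get monitoring visualizations. Click Configuration -> Plugins to see the plugins that are already installed.

Available plugins are listed in the official plugin list. A Kubernetes-related plugin:

DevOpsProdigy KubeGraf is an upgraded version of Grafana's official Kubernetes plugin. It visualizes the metrics and characteristics of the cluster's main services, and can also be used to inspect application lifecycles and error logs.

# Install from inside the grafana container (an offline package can also be downloaded and installed)
$ kubectl -n monitor exec -ti grafana-594f447d6c-jmjsw -- bash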
bash-5.0# grafana-cli plugins install devopsprodigy-kubegraf-app 1.5.2
installing devopsprodigy-kubegraf-app @ 1.5.2
from: https://grafana.com/api/plugins/devopsprodigy-kubegraf-app/versions/1.4.1/download
into: /var/lib/grafana/plugins

✔ Installed devopsprodigy-kubegraf-app successfully

Restart grafana after installing plugins . <service grafana-server restart>

bash-5.0# grafana-cli plugins install grafana-piechart-panel

# Recreate the pod so that the plugins take effect
$ kubectl -n monitor delete po grafana-594f447d6c-jmjsw
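
The pod name above changes every time the pod is recreated, so the same steps can be written against the app=grafana label instead. The following is a sketch, not the original procedure: the function names are hypothetical, and it restarts the Deployment with kubectl rollout restart rather than deleting the pod by name.

```shell
# Print the name of the first pod that carries the app=grafana label.
grafana_pod() {
  kubectl -n monitor get pod -l app=grafana \
    -o jsonpath='{.items[0].metadata.name}'
}

# Install a plugin inside that pod, then restart the Deployment so the
# new pod loads it. Arguments: plugin id and, optionally, a version.
install_grafana_plugin() {
  kubectl -n monitor exec "$(grafana_pod)" -- grafana-cli plugins install "$@" &&
  kubectl -n monitor rollout restart deployment/grafana
}
```

For example, install_grafana_plugin devopsprodigy-kubegraf-app 1.5.2 mirrors the first install command above. The plugins survive the restart because /var/lib/grafana is on the persistent volume.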

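Usage

Go to Configuration -> Plugins, click the plugin to open its details page, click the [Enable] button to enable it, then click Set up your first k8s-cluster to create a new Kubernetes cluster:

  • Name: crab-k8s
  • URL: https://kubernetes.default:443
  • Access: keep the default, Server (default)
  • Skip TLS Verify: checked, to skip certificate validity verification
  • Auth: check TLS Client Auth and With CA Cert. The certificate contents come from the ~/.kube/config file and must be base64-decoded once
    • CA Cert: the value of certificate-authority-data in the config file
    • Client Cert: the value of client-certificate-data in the config file
    • Client Key: the value of client-key-data in the config file

Several dashboards are generated automatically once the cluster has been added.

Open any one of them to check it.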

Handling dashboards with no data

Open the dashboard.

Its variables are defined as follows:
● Pod
label_values(kube_pod_info{namespace="$namespace"},pod)
● DaemonSet
label_values(kube_pod_info{namespace="daemonset-."},pod)
● Deployment
label_values(kube_pod_info{namespace="deployment-."},pod)

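# The ● Pod entry is used as the example

Open the DevOpsProdigy KubeGraf Pod's Dashboard, click the settings button in the upper right corner, then click Variables in the left column.
## Click the pod row. If the Query returns data, the Regex filters out the required values (here, the Pod names) and they appear under Preview of values, which means data is available.
## If Preview of values is empty, the query returned no data. Switch the query to the label_values(kube_pod_info{namespace="$namespace"},pod) form shown above, which does return data.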

# Explanation
label_values(kube_pod_info{namespace="$namespace"},pod)
kube_pod_info{namespace="$namespace"} is PromQL; label_values() is Grafana syntax.

The query kube_pod_info{namespace="default"} returns data in Prometheus, and each returned series carries several label fields. label_values(kube_pod_info{namespace="$namespace"},pod) takes the value of the pod label (that is, the pod name) from each series.
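
To make the effect of label_values() concrete, the sketch below imitates it on plain text: it reads series lines in the form Prometheus displays them and prints the distinct values of one label. This is only an illustration and not how Grafana implements the function. The name label_values_sketch and the sample series in the usage note are made up, and label values containing escaped quotes are not handled.

```shell
# Illustration only: print the distinct values of label $1 found in the
# series lines read from standard input.
label_values_sketch() {
  # Match `{label="..."` or `,label="..."`, keep the quoted value,
  # and de-duplicate the result.
  grep -o "[{,]$1=\"[^\"]*\"" | cut -d'"' -f2 | sort -u
}
```

Feeding it two hypothetical series, kube_pod_info{namespace="default",pod="nginx-1",node="node1"} and kube_pod_info{namespace="default",pod="redis-0",node="node2"}, and asking for pod yields nginx-1 and redis-0, which is the list Grafana would offer in the variable's drop-down.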

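Custom dashboard

Goal:
Filter by cluster node name, so that the load graph follows the node that is selected.

Steps:
# Click Create Dashboard > Add Panel, then the settings button in the upper right corner -> Variables -> Add Variable. Add a variable named node and save.
# The node selector that was just added appears at the top of the page.
# Click Panel Title > Edit and change the Metrics expression to node_load1{instance=~"$node"}
# Click Apply and check the result.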