Overview

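The kubelet component of Kubernetes has cadvisor built in. It exposes metrics for the containers on a Node in a format that Prometheus supports, and more useful figures can be derived from these metrics.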

Requesting :10255/metrics/cadvisor on a node returns the metrics in the text format that Prometheus supports.
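To give a feel for how one line of that text format breaks down into a metric name, labels and a value, here is a minimal Python sketch. The sample lines are hand-written for illustration and are not output from a real node, and the parser is deliberately simplified: it assumes label values contain no escaped quotes and it ignores optional timestamps.

```python
import re

# Illustrative sample only: hand-written lines in the Prometheus text
# exposition format, not output captured from a real kubelet.
SAMPLE = '''\
# HELP container_cpu_usage_seconds_total Cumulative cpu time consumed in seconds.
# TYPE container_cpu_usage_seconds_total counter
container_cpu_usage_seconds_total{id="/",cpu="cpu00"} 1234.5
container_memory_working_set_bytes{id="/kubepods/example"} 52428800
'''

# metric_name{label="value",...} value   (the label part is optional)
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # blank line or HELP/TYPE comment
        match = LINE_RE.match(line)
        if match is None:
            continue  # not a sample line this simplified parser understands
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ''))
        samples.append((name, labels, float(value)))
    return samples


for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

In practice Prometheus does this parsing itself when it scrapes the endpoint; the sketch is only meant to make the format easier to read.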
Once the corresponding targets are configured in the Prometheus configuration file, these metrics can be queried from Prometheus:

  ...
  - job_name: kubernetes-apiservers
    kubernetes_sd_configs:
    - role: endpoints
    relabel_configs:
    - action: keep
      regex: default;kubernetes;https
      source_labels:
      - __meta_kubernetes_namespace
      - __meta_kubernetes_service_name
      - __meta_kubernetes_endpoint_port_name
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  - job_name: kubernetes-nodes-kubelet
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  - job_name: kubernetes-nodes-cadvisor
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
    - target_label: __metrics_path__
      replacement: /metrics/cadvisor
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
  - job_name: kubernetes-service-endpoints
    kubernetes_sd_configs:
    - role: endpoints
    relabel_configs:
    - action: keep
      regex: true
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_scrape
    - action: replace
      regex: (https?)
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_scheme
      target_label: __scheme__
    - action: replace
      regex: (.+)
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_path
      target_label: __metrics_path__
    - action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      source_labels:
      - __address__
      - __meta_kubernetes_service_annotation_prometheus_io_port
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - action: replace
      source_labels:
      - __meta_kubernetes_namespace
      target_label: kubernetes_namespace
    - action: replace
      source_labels:
      - __meta_kubernetes_service_name
      target_label: kubernetes_name
  - job_name: kubernetes-services
    kubernetes_sd_configs:
    - role: service
    metrics_path: /probe
    params:
      module:
      - http_2xx
    relabel_configs:
    - action: keep
      regex: true
      source_labels:
      - __meta_kubernetes_service_annotation_prometheus_io_probe
    - source_labels:
      - __address__
      target_label: __param_target
    - replacement: blackbox
      target_label: __address__
    - source_labels:
      - __param_target
      target_label: instance
    - action: labelmap
      regex: __meta_kubernetes_service_label_(.+)
    - source_labels:
      - __meta_kubernetes_namespace
      target_label: kubernetes_namespace
    - source_labels:
      - __meta_kubernetes_service_name
      target_label: kubernetes_name
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    - action: keep
      regex: true
      source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_scrape
    - action: replace
      regex: (.+)
      source_labels:
      - __meta_kubernetes_pod_annotation_prometheus_io_path
      target_label: __metrics_path__
    - action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      source_labels:
      - __address__
      - __meta_kubernetes_pod_annotation_prometheus_io_port
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - action: replace
      source_labels:
      - __meta_kubernetes_namespace
      target_label: kubernetes_namespace
    - action: replace
      source_labels:
      - __meta_kubernetes_pod_name
      target_label: kubernetes_pod_name
  ...
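The least obvious rule in this configuration is the `replace` action that rewrites `__address__` using the regex `([^:]+)(?::\d+)?;(\d+)` and the replacement `$1:$2`. The Python sketch below mimics what that rule does to a target address. It relies on two documented Prometheus behaviours: the values of the source labels are joined with ";" before matching, and the regex is anchored at both ends. The function name and the sample addresses are hypothetical and exist only for illustration.

```python
import re

# Prometheus anchors relabel regexes on both ends, so fullmatch() is the
# closest Python equivalent. Source label values are joined with ";".
ADDRESS_RE = re.compile(r'([^:]+)(?::\d+)?;(\d+)')


def rewrite_address(address, port_annotation):
    """Mimic the `replace` rule that targets __address__.

    Returns the rewritten address, or the original one when the regex does
    not match (a non-matching replace rule leaves the label untouched).
    """
    joined = f'{address};{port_annotation}'
    match = ADDRESS_RE.fullmatch(joined)
    if match is None:
        return address
    # Equivalent of the `$1:$2` replacement: host from group 1, port from
    # the prometheus.io/port annotation in group 2.
    return f'{match.group(1)}:{match.group(2)}'


# Hypothetical values, chosen only for illustration.
print(rewrite_address('10.0.0.5:8080', '9102'))  # 10.0.0.5:9102
print(rewrite_address('10.0.0.5', '9102'))       # 10.0.0.5:9102
print(rewrite_address('10.0.0.5:8080', ''))      # 10.0.0.5:8080
```

In other words, when a service or pod carries a prometheus.io/port annotation, the discovered port is swapped for the annotated one; without the annotation the address is left as discovered.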

Calculating container CPU usage

The top man page (man top) defines CPU usage as follows:

  %CPU -- CPU Usage
  The task's share of the elapsed CPU time since the last screen update, expressed as a percentage of total CPU time.
  In a true SMP environment, if a process is multi-threaded and top is not operating in Threads mode, amounts greater
  than 100% may be reported. You toggle Threads mode with the `H' interactive command.
  Also for multi-processor environments, if Irix mode is Off, top will operate in Solaris mode where a task's cpu usage
  will be divided by the total number of CPUs. You toggle Irix/Solaris modes with the `I' interactive command.

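In other words, CPU usage is the ratio of the CPU time a process consumed during some past interval to the total CPU time available in that interval. With multiple CPUs or cores, the time of every CPU has to be added together.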

One of the metrics collected by the cadvisor inside kubelet is:

  container_cpu_usage_seconds_total    Counter    Cumulative cpu time consumed in seconds

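container_cpu_usage_seconds_total is the cumulative CPU time a container has used. Taking its increase over an interval and dividing by the total CPU time in that interval gives the container's CPU usage. The rate() of this counter is the number of CPU seconds consumed per second, that is, the number of CPU cores in use on average: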

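  # Per-second rate of increase, averaged over the last minute
  # ($Deployment, $Statefulset, $Daemonset and $Node are Grafana template variables)
  sum (rate (container_cpu_usage_seconds_total{image!="",name=~"^k8s_.*",io_kubernetes_container_name!="POD",pod_name=~"^$Deployment$Statefulset$Daemonset.*$",kubernetes_io_hostname=~"^$Node$"}[1m])) by (pod_name,kubernetes_io_hostname)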
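To make the arithmetic behind this query concrete, the following Python sketch computes a per-second rate from a handful of counter samples and then turns it into a percentage of the available CPUs, in the spirit of top's Solaris mode. It is a simplification of what PromQL's rate() does: it accounts for counter resets but does not extrapolate to the window boundaries. The sample values and the `cpu_count` parameter are assumptions made for illustration; in a real setup the CPU count might come from a metric such as machine_cpu_cores or from the container's CPU limit.

```python
def per_second_rate(samples):
    """Average per-second increase of a counter over a window.

    `samples` is a list of (timestamp_seconds, counter_value) pairs in time
    order. Simplified compared with PromQL's rate(): counter resets are
    handled, but there is no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # A drop means the counter was reset (e.g. the container restarted),
        # so the whole current value counts as increase.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


def cpu_usage_percent(samples, cpu_count):
    """CPU seconds used per second, as a percentage of `cpu_count` CPUs."""
    rate = per_second_rate(samples)
    if rate is None:
        return None
    return rate / cpu_count * 100


# Made-up samples of container_cpu_usage_seconds_total, 15 s apart.
samples = [(0, 100.0), (15, 107.5), (30, 115.0), (45, 122.5), (60, 130.0)]
print(per_second_rate(samples))       # 0.5 (half a CPU core on average)
print(cpu_usage_percent(samples, 4))  # 12.5 (% of a 4-CPU node)
```

The container used 30 CPU seconds in 60 seconds of wall-clock time, so the rate is 0.5 cores; on a node with 4 CPUs that amounts to 12.5% of the total CPU time.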

Memory usage

Memory usage can be inspected with container_memory_working_set_bytes:

  sum (container_memory_working_set_bytes{id!="/",pod_name=~"^$Deployment$Statefulset$Daemonset.*$",kubernetes_io_hostname=~"^$Node$"}) by (pod_name,kubernetes_io_hostname)
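This query returns the working set in bytes rather than a ratio. Expressing it as a percentage needs a denominator, which this article does not specify. The Python sketch below assumes the denominator is a per-container memory limit (for example the value of cadvisor's container_spec_memory_limit_bytes metric) and that a limit of 0 means no limit is configured. The numbers are made up for illustration.

```python
def memory_usage_percent(working_set_bytes, limit_bytes):
    """Working set as a percentage of the limit, or None without a limit."""
    if limit_bytes <= 0:
        return None  # assumed convention: 0 means "no limit configured"
    return working_set_bytes / limit_bytes * 100


MIB = 1024 * 1024
# Made-up numbers: a 192 MiB working set against a 256 MiB limit.
print(memory_usage_percent(192 * MIB, 256 * MIB))  # 75.0
print(memory_usage_percent(192 * MIB, 0))          # None
```

Returning None for containers without a limit avoids a division by zero and keeps such containers from showing up with a misleading percentage.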

Official Grafana dashboard for k8s

  • 3131 Kubernetes All Nodes


Further reading

  1. Using Prometheus queries to calculate container CPU usage, memory usage and other metrics in a Kubernetes cluster