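I. HPA (Horizontal Pod Autoscaling)

1. HPA essentials:

  • HPA monitors and analyzes the load of all Pods managed by a controller (for example a Deployment) to decide whether the number of Pod replicas needs to be adjusted.
  • After an HPA object is created, the HPA controller polls periodically (every 15s by default in current Kubernetes releases; older releases used 30s), queries the metrics, and compares the load with the configured target to scale automatically.
  • The polling interval and the length of the scale-down stabilization window can be set through the kube-controller-manager flags --horizontal-pod-autoscaler-sync-period and --horizontal-pod-autoscaler-downscale-stabilization.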

Horizontal Pod Autoscaling | Kubernetes
Kubernetes HPA Usage in Detail - 阳明的博客
The Guide To Kubernetes HPA by Example
k8s Monitoring (3): prometheus-adapter - 掘金 (includes an explanation of the HPA rules)

2. Custom metrics essentials:

  • Aggregator: the Kubernetes aggregation layer, which extends the API
  • APIService resources:

    $ kubectl get apiservice | grep metrics
    v1beta1.custom.metrics.k8s.io   monitoring/prometheus-adapter   True
    v1beta1.metrics.k8s.io          kube-system/metrics-server      True
  • Request path: apiserver -> prometheus-adapter -> prometheus

  • HPA rules

II. HPA based on CPU/memory

0. Deploy metrics-server

Enable the Aggregator on kube-apiserver, then deploy metrics-server.

1. CPU:

Deploy the application:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hpa-demo
    spec:
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
            resources:
              requests: # (required) resource requests; utilization targets are computed against them
                memory: 50Mi
                cpu: 50m

Create the HPA object:

    kubectl autoscale deployment hpa-demo --cpu-percent=10 --min=1 --max=10
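To make the effect of --cpu-percent=10 concrete, here is a minimal Python sketch of the replica calculation described in the Kubernetes HPA documentation: desired = ceil(current * currentMetric / targetMetric), skipped when the ratio is within a tolerance (0.1 by default). The function name and the sample utilization figures are assumptions made for illustration; the real controller also handles missing metrics, not-yet-ready Pods and the min/max bounds, which are left out here.

```python
import math


def desired_replicas(current_replicas, current_utilization, target_utilization,
                     tolerance=0.1):
    """Simplified HPA calculation for a utilization target (illustrative only)."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the controller leaves the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


# Hypothetical readings against the 10% CPU target set above:
print(desired_replicas(1, 35, 10))    # 4  (1 * 3.5 rounded up)
print(desired_replicas(2, 10.5, 10))  # 2  (ratio 1.05 is inside the tolerance)
print(desired_replicas(4, 2, 10))     # 1  (4 * 0.2 rounded up)
```

With a 50m CPU request, a 10% target corresponds to an average of about 5m CPU per Pod, so even light traffic is enough to trigger a scale-out in this demo.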

2. Memory:

Deploy the application, using a ConfigMap to mount a script that increases the container's memory load:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: increase-mem-config
    data:
      increase-mem.sh: |
        #!/bin/bash
        # Fill a 40M tmpfs (memory-backed) for 60s, then release it.
        mkdir /tmp/memory
        mount -t tmpfs -o size=40M tmpfs /tmp/memory
        # dd writes until the tmpfs is full
        dd if=/dev/zero of=/tmp/memory/block
        sleep 60
        rm /tmp/memory/block
        umount /tmp/memory
        rmdir /tmp/memory
    ---
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hpa-mem-demo
    spec:
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
        spec:
          volumes:
          - name: increase-mem-script
            configMap:
              name: increase-mem-config
          containers:
          - name: nginx
            image: nginx
            ports:
            - containerPort: 80
            volumeMounts:
            - name: increase-mem-script
              mountPath: /etc/script
            resources:
              requests:
                memory: 50Mi
                cpu: 50m
            securityContext: # required because the script uses the mount command
              privileged: true

Create the HPA object:

    # Note: autoscaling/v2beta1 was removed in Kubernetes 1.25; newer clusters
    # need autoscaling/v2, where the target is nested under `target:`.
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: nginx-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-mem-demo
      minReplicas: 1
      maxReplicas: 5
      metrics:
      - type: Resource
        resource:
          name: memory
          targetAverageUtilization: 60
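The 60% target above is measured against the Pods' memory requests (50Mi each), not against node capacity. The following Python sketch shows that arithmetic under simplifying assumptions: the function names are made up for this note, the usage figures are hypothetical, and utilization is taken as total usage divided by total requests across the Pods.

```python
def average_utilization(usages_mi, request_mi):
    """Utilization in percent of the request, across all Pods (illustrative)."""
    total_usage = sum(usages_mi)
    total_request = request_mi * len(usages_mi)
    return 100.0 * total_usage / total_request


def target_usage_mi(request_mi, target_percent):
    """Average per-Pod usage (in Mi) that corresponds to the utilization target."""
    return request_mi * target_percent / 100.0


# With a 50Mi request and a 60% target, scaling starts above ~30Mi per Pod.
print(target_usage_mi(50, 60))            # 30.0
# Hypothetical: one Pod at 45Mi after the script fills its tmpfs.
print(average_utilization([45], 50))      # 90.0
# Hypothetical: a second Pod at 15Mi brings the average back to the target.
print(average_utilization([45, 15], 50))  # 60.0
```

This is why the demo script's 40M tmpfs is enough to push a single nginx Pod past the threshold.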

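III. HPA based on custom metrics

1. Application HTTP request rate / TCP connection count:

Note:
A separate cAdvisor was deployed in the k8s cluster with collection of TCP metrics switched on; only then is the container_network_tcp_usage_total metric used below available. (Because of the high load involved, the cAdvisor built into the kubelet has collection of some metrics turned off. This setup is for testing only; in practice there is no need to deploy a separate cAdvisor.)

(1) Deploy the test application:
Deploy nginx-vts, which exposes http_request-related metrics: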

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hpa-nginx
      namespace: default
    spec:
      selector:
        matchLabels:
          app: nginx-server
      template:
        metadata:
          labels:
            app: nginx-server
        spec:
          containers:
          - name: nginx-vts
            image: cnych/nginx-vts:v1.0
            resources:
              limits:
                cpu: 50m
              requests:
                cpu: 50m
            ports:
            - containerPort: 80
              name: http
    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: hpa-nginx
      namespace: default
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/status/format/prometheus"
      labels:
        app: nginx-server
    spec:
      ports:
      - port: 80
        targetPort: 80
        name: http
      selector:
        app: nginx-server
      type: NodePort
    ---
    apiVersion: monitoring.coreos.com/v1
    kind: ServiceMonitor
    metadata:
      name: ngx-vts-ends
      labels:
        release: prom
    spec:
      namespaceSelector:
        matchNames:
        - default
      selector:
        matchLabels:
          app: nginx-server
      endpoints:
      - port: http
        path: "/status/format/prometheus"

(2) Debug the queries in the Prometheus console:

    #####
    # Note:
    # This is only a quick test; the queries are not guaranteed to be fully correct.
    #####
    # http request:
    sum(rate(nginx_vts_server_requests_total{code="total"}[1m])) by (namespace, pod)
    # tcp connection:
    container_network_tcp_usage_total{container_label_io_kubernetes_pod_name="hpa-ngx-bbb6c65bb-lzdkw",tcp_state!~"clos.*",tcp_state!~".*wait.*"}

(3) Create the HPA rules (prometheus-adapter configuration):

    rules:
      custom:
      - seriesQuery: 'container_network_tcp_usage_total'
        resources:
          overrides:
            container_label_io_kubernetes_pod_namespace:
              resource: namespace
            container_label_io_kubernetes_pod_name:
              resource: pod
            # tcp_state is not a Kubernetes resource type, so this override may
            # not resolve; kept as in the original
            tcp_state:
              resource: tcp_state
        name:
          matches: "^(.*)_total"
          as: "${1}"
        metricsQuery: '<<.Series>>{<<.LabelMatchers>>}'
      - seriesQuery: 'nginx_vts_server_requests_total'
        resources:
          overrides:
            namespace:
              resource: namespace
            pod:
              resource: pod
        name:
          matches: "^(.*)_total"
          as: "${1}_per_second"
        metricsQuery: '(sum(rate(<<.Series>>{<<.LabelMatchers>>}[1m])) by (<<.GroupBy>>))'
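The name section of each rule decides which metric name the HPA will see. As a rough illustration of that renaming, the Python sketch below applies the same matches/as pairs with a regular expression. It assumes the adapter performs an ordinary regex replacement, and it translates Go's ${1} placeholder into Python's \g<1> syntax; rename_metric is a hypothetical helper, not part of prometheus-adapter.

```python
import re


def rename_metric(series_name, matches, as_template):
    """Apply a matches/as pair to a Prometheus series name (illustrative)."""
    # Convert Go-style "${1}" group references to Python's "\g<1>" form.
    py_template = re.sub(r"\$\{(\d+)\}", r"\\g<\1>", as_template)
    return re.sub(matches, py_template, series_name)


print(rename_metric("nginx_vts_server_requests_total",
                    "^(.*)_total", "${1}_per_second"))
# nginx_vts_server_requests_per_second
print(rename_metric("container_network_tcp_usage_total",
                    "^(.*)_total", "${1}"))
# container_network_tcp_usage
```

These resulting names are the ones to reference in the HPA resource, not the original Prometheus series names.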

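Confirm that the rules have taken effect (the new metrics should be listed under /apis/custom.metrics.k8s.io/v1beta1):
[screenshot: image.png]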
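One way to run this check without a screenshot is to query the custom metrics API directly. The small Python helper below only builds the request path for a per-Pod metric; the function name is an assumption made for this note, and the path layout follows the custom.metrics.k8s.io/v1beta1 API registered by the APIService shown earlier.

```python
def custom_metric_path(namespace, metric, pod="*", version="v1beta1"):
    """Build the custom metrics API path for a Pod metric (illustrative)."""
    return (f"/apis/custom.metrics.k8s.io/{version}"
            f"/namespaces/{namespace}/pods/{pod}/{metric}")


path = custom_metric_path("default", "nginx_vts_server_requests_per_second")
print(path)
```

The printed path can then be passed to kubectl get --raw "<path>"; a JSON list with one entry per Pod suggests the rule is working.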

(4) Create the HPA resource:

    # Note: autoscaling/v2beta1 was removed in Kubernetes 1.25; newer clusters
    # need autoscaling/v2, where the metric name and target are nested fields.
    apiVersion: autoscaling/v2beta1
    kind: HorizontalPodAutoscaler
    metadata:
      name: custom-hpa-nginx
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: hpa-nginx
      minReplicas: 1
      maxReplicas: 3
      metrics:
      - type: Pods
        pods:
          metricName: nginx_vts_server_requests_per_second
          targetAverageValue: 500
      ## Not actually tested
      #- type: Pods
      #  pods:
      #    metricName: container_network_tcp_usage
      #    targetAverageValue: 100
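For a Pods-type metric, the controller compares the per-Pod average with targetAverageValue and then keeps the result inside minReplicas/maxReplicas. The Python sketch below illustrates that step with made-up request rates; the function name is hypothetical, and the tolerance band and metric-readiness handling of the real controller are deliberately omitted.

```python
import math


def replicas_for_average_value(pod_values, target_average,
                               min_replicas, max_replicas):
    """Replica count for a Pods metric, bounded by min/max (illustrative)."""
    current_replicas = len(pod_values)
    average = sum(pod_values) / current_replicas
    desired = math.ceil(current_replicas * average / target_average)
    # The HPA never goes below minReplicas or above maxReplicas.
    return max(min_replicas, min(max_replicas, desired))


# Hypothetical: one Pod serving 1800 req/s against the 500 req/s target.
print(replicas_for_average_value([1800], 500, 1, 3))           # 3 (4 capped at max)
# Hypothetical: three Pods at 100 req/s each once the load test stops.
print(replicas_for_average_value([100, 100, 100], 500, 1, 3))  # 1
```

With maxReplicas set to 3, a sufficiently heavy load test should therefore settle at three Pods regardless of how far the request rate exceeds the target.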

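(5) Test:
Send requests with wrk (adjust the concurrency parameters as needed):
[screenshot: image.png]
Scale-out in progress (this shows the value of the metric nginx_vts_server_requests_per_second):
[screenshot: image.png]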