metrics server简介

早期版本从1.10-之前的都是通都支持heapster来提供监控指标,从1.8版本开始以支持metrics API的方式提供标准化接口,并且从1.10之后将heapster替换为metrics server,在k8s新的监控体系中,metrics server用于提供核心指标(node,pod的cpu和内存使用指标),对于其它自定义指标(custom metrics)的监控由prometheus 等组件来完成

代码托管于github仓库 https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/metrics-server

metrics server部署

创建一个apiserver

v1beta1.metrics.k8s.io

  1. cat > 1.metrics-apiservice.yaml << EOF
  2. apiVersion: apiregistration.k8s.io/v1beta1
  3. kind: APIService
  4. metadata:
  5. name: v1beta1.metrics.k8s.io
  6. spec:
  7. service:
  8. name: metrics-server
  9. namespace: kube-system
  10. group: metrics.k8s.io
  11. version: v1beta1
  12. insecureSkipTLSVerify: true
  13. groupPriorityMinimum: 100
  14. versionPriority: 100
  15. EOF

image.png

创建deployment资源,ServiceAccount资源,nanny的 ConfigMap资源

cat > 2.metrics-server-deployment.yaml << EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: metrics-server-config
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: EnsureExists
data:
  NannyConfiguration: |-
    apiVersion: nannyconfig/v1alpha1
    kind: NannyConfiguration
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: metrics-server-v0.3.6
  namespace: kube-system
  labels:
    k8s-app: metrics-server
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    version: v0.3.6
spec:
  selector:
    matchLabels:
      k8s-app: metrics-server
      version: v0.3.6
  template:
    metadata:
      name: metrics-server
      labels:
        k8s-app: metrics-server
        version: v0.3.6
      annotations:
        seccomp.security.alpha.kubernetes.io/pod: 'docker/default'
    spec:
      priorityClassName: system-cluster-critical
      serviceAccountName: metrics-server
      containers:
      - name: metrics-server
        image: 10.0.0.70:5000/metrics-server-amd64:v0.3.6
        command:
        - /metrics-server
        - --kubelet-insecure-tls
        - --metric-resolution=30s
        # These are needed for GKE, which doesn't support secure communication yet.
        # Remove these lines for non-GKE clusters, and when GKE supports token-based auth.
       # - --kubelet-port=10255
       # - --deprecated-kubelet-completely-insecure=true
        - --kubelet-preferred-address-types=InternalIP,Hostname,InternalDNS,ExternalDNS,ExternalIP
        ports:
        - containerPort: 443
          name: https
          protocol: TCP
      - name: metrics-server-nanny
        image: 10.0.0.70:5000/addon-resizer:1.8.6
        resources:
          limits:
            cpu: 100m
            memory: 300Mi
          requests:
            cpu: 5m
            memory: 50Mi
        env:
          - name: MY_POD_NAME
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: MY_POD_NAMESPACE
            valueFrom:
              fieldRef:
                fieldPath: metadata.namespace
        volumeMounts:
        - name: metrics-server-config-volume
          mountPath: /etc/config
        command:
          - /pod_nanny
          - --config-dir=/etc/config
          - --cpu=80m
         # - --cpu={{ base_metrics_server_cpu }}
          - --extra-cpu=0.5m
         # - --memory={{ base_metrics_server_memory }}
          - --memory=80Mi
         # - --extra-memory={{ metrics_server_memory_per_node }}Mi
          - --extra-memory=8Mi
          - --threshold=5
          - --deployment=metrics-server-v0.3.6
          - --container=metrics-server
          - --poll-period=300000
          - --estimator=exponential
          # Specifies the smallest cluster (defined in number of nodes)
          # resources will be scaled to.
         # - --minClusterSize={{ metrics_server_min_cluster_size }}#关闭这个
      volumes:
        - name: metrics-server-config-volume
          configMap:
            name: metrics-server-config
      tolerations:
        - key: "CriticalAddonsOnly"
          operator: "Exists"
EOF

创建一个svc资源给metrics-server

cat > 3.metrics-server-service.yaml <<EOF 
apiVersion: v1
kind: Service
metadata:
  name: metrics-server
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/cluster-service: "true"
    kubernetes.io/name: "Metrics-server"
spec:
  selector:
    k8s-app: metrics-server
  ports:
  - port: 443
    protocol: TCP
    targetPort: https
EOF

创建一个rolebindig

让metrics-server拥有extension-apiserver-authentication-reader的权限

cat > 4.auth-reader.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: metrics-server-auth-reader
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
EOF

创建一个ClusterRoleBinding

让metrics-server拥有system:auth-delegator的权限

cat > 5.auth-delegator.yaml << EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: metrics-server:system:auth-delegator
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
EOF

创建一个ClusterRoleBinding

让metrics-server拥有system:metrics-server的权限

cat > 6.resource-reader.yaml << EOF 
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:metrics-server
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - nodes
  - nodes/stats
  - namespaces
  verbs:
  - get
  - list
  - watch
- apiGroups:
  - "apps"
  resources:
  - deployments
  verbs:
  - get
  - list
  - update
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:metrics-server
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:metrics-server
subjects:
- kind: ServiceAccount
  name: metrics-server
  namespace: kube-system
EOF

检查资源是否创建成功

image.png

检查是否能收集到数据

image.png