1. Problem description

While installing a k8s cluster with kubeadm, the initialization step fails with the following error:

    $ sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.18.8
    [kubelet-check] It seems like the kubelet isn't running or healthy.
    [kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
    Unfortunately, an error has occurred:
    timed out waiting for the condition
    This error is likely caused by:
    - The kubelet is not running
    - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
    If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
    - 'systemctl status kubelet'
    - 'journalctl -xeu kubelet'
    Additionally, a control plane component may have crashed or exited when started by the container runtime.
    To troubleshoot, list all containers using your preferred container runtimes CLI.
    Here is one example how you may list all Kubernetes containers running in docker:
    - 'docker ps -a | grep kube | grep -v pause'
    Once you have found the failing container, you can inspect its logs with:
    - 'docker logs CONTAINERID'
    error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

Checking the kubelet with systemctl status kubelet shows the actual error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
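
To confirm the mismatch before changing anything, the two driver names can be compared directly. This is a minimal sketch: the function name compare_cgroup_drivers is hypothetical, and the usage lines in the comment assume that Docker is the container runtime and that /var/lib/kubelet/config.yaml exists and contains a cgroupDriver key.

```shell
# Compare two cgroup driver names and report whether they agree.
# (compare_cgroup_drivers is a hypothetical helper, not part of kubeadm or Docker.)
compare_cgroup_drivers() {
  docker_driver="$1"
  kubelet_driver="$2"
  if [ "$docker_driver" = "$kubelet_driver" ]; then
    echo "match: $docker_driver"
  else
    echo "mismatch: docker=$docker_driver kubelet=$kubelet_driver"
    return 1
  fi
}

# Possible usage on the node (assumes Docker and an existing kubelet config file):
#   compare_cgroup_drivers \
#     "$(docker info --format '{{.CgroupDriver}}')" \
#     "$(awk '/^cgroupDriver:/ {print $2}' /var/lib/kubelet/config.yaml)"
```

A non-zero exit status signals a mismatch, so the helper can also be used as a guard in a setup script before running kubeadm init.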

2. Solution

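The error shows that the kubelet is using systemd while Docker is still on cgroupfs. The kubelet's default cgroup driver is cgroupfs, but k8s recommends systemd, so the fix is to configure both Docker and the kubelet to use the systemd cgroup driver (the officially recommended setup).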

  • Reset the partially initialized kubeadm state: sudo kubeadm reset
  • Configure Docker: add "exec-opts": ["native.cgroupdriver=systemd"] to /etc/docker/daemon.json;
  • Configure the kubelet

    # Set the kubelet cgroup driver to systemd (run as root; overwrites the file)
    cat > /var/lib/kubelet/config.yaml <<EOF
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    cgroupDriver: systemd
    EOF
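
For the Docker step, a fresh host may not have /etc/docker/daemon.json at all. The sketch below writes a minimal file containing only the exec-opts key from this article. The function name write_docker_cgroup_config is hypothetical, and refusing to touch an existing file is a design assumption made here so that other Docker settings are not overwritten.

```shell
# Write a minimal daemon.json that selects the systemd cgroup driver.
# Refuses to overwrite an existing file so other Docker settings are not lost.
# (write_docker_cgroup_config is a hypothetical helper name.)
write_docker_cgroup_config() {
  target="${1:-/etc/docker/daemon.json}"
  if [ -e "$target" ]; then
    echo "exists: $target (add the exec-opts key by hand)" >&2
    return 1
  fi
  cat > "$target" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF
}

# Possible usage (as root): write_docker_cgroup_config /etc/docker/daemon.json
```

If the file already exists, the key has to be merged into the existing JSON object manually; Docker picks up the change only after it is restarted.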
    
  • Restart Docker and the kubelet

    # Reload systemd units, then restart Docker and the kubelet
    systemctl daemon-reload
    systemctl restart docker
    systemctl restart kubelet
    
  • Check that docker info | grep "Cgroup Driver" prints Cgroup Driver: systemd

  • Check systemctl status kubelet; the kubelet still exits right after starting

    $ systemctl status kubelet
    ● kubelet.service - kubelet: The Kubernetes Node Agent
     Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
    Drop-In: /etc/systemd/system/kubelet.service.d
             └─10-kubeadm.conf
     Active: activating (auto-restart) (Result: exit-code) since Mon 2020-09-21 16:46:49 CST; 4s ago
       Docs: https://kubernetes.io/docs/home/
    Process: 26550 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
    Main PID: 26550 (code=exited, status=255)
    
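    • The status output points to the drop-in file /etc/systemd/system/kubelet.service.d/10-kubeadm.conf; a setting in that file is what keeps the kubelet from starting;
    • Remove the Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=cgroupfs" line from 10-kubeadm.conf (it is a leftover from an earlier deployment step), then run systemctl daemon-reload and systemctl restart kubelet again;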
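
Removing the leftover line can also be scripted. The following is a sketch under the assumption that the line appears in the drop-in exactly as quoted above; the function name remove_cgroupfs_override is hypothetical. It keeps a .bak copy of the file next to the original.

```shell
# Delete the leftover cgroupfs override from a kubelet drop-in file.
# A backup of the original is kept as <file>.bak.
# (remove_cgroupfs_override is a hypothetical helper name.)
remove_cgroupfs_override() {
  conf="${1:-/etc/systemd/system/kubelet.service.d/10-kubeadm.conf}"
  if [ ! -f "$conf" ]; then
    echo "not found: $conf" >&2
    return 1
  fi
  sed -i.bak \
    '/^Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=cgroupfs"[[:space:]]*$/d' \
    "$conf"
}

# Possible usage (as root), followed by the reload and restart from this article:
#   remove_cgroupfs_override
#   systemctl daemon-reload
#   systemctl restart kubelet
```

Only the exact override line is deleted; any other Environment entries in the drop-in are left untouched.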