1. Problem description

While installing a k8s cluster with kubeadm, the initialization step fails:

sudo kubeadm init --pod-network-cidr=10.244.0.0/16 --kubernetes-version=v1.18.8
[kubelet-check] It seems like the kubelet isn't running or healthy.
[kubelet-check] The HTTP call equal to 'curl -sSL http://localhost:10248/healthz' failed with error: Get http://localhost:10248/healthz: dial tcp 127.0.0.1:10248: connect: connection refused.
        Unfortunately, an error has occurred:
                timed out waiting for the condition
        This error is likely caused by:
                - The kubelet is not running
                - The kubelet is unhealthy due to a misconfiguration of the node in some way (required cgroups disabled)
        If you are on a systemd-powered system, you can try to troubleshoot the error with the following commands:
                - 'systemctl status kubelet'
                - 'journalctl -xeu kubelet'
        Additionally, a control plane component may have crashed or exited when started by the container runtime.
        To troubleshoot, list all containers using your preferred container runtimes CLI.
        Here is one example how you may list all Kubernetes containers running in docker:
                - 'docker ps -a | grep kube | grep -v pause'
                Once you have found the failing container, you can inspect its logs with:
                - 'docker logs CONTAINERID'
error execution phase wait-control-plane: couldn't initialize a Kubernetes cluster

Checking the kubelet status with systemctl status kubelet shows: error: failed to run Kubelet: failed to create kubelet: misconfiguration: kubelet cgroup driver: "systemd" is different from docker cgroup driver: "cgroupfs"
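
The same mismatch message can be pulled out of the kubelet journal. This is a minimal sketch, assuming a systemd host where kubelet logs to journald and the message has the form quoted above; the helper name extract_driver_mismatch is hypothetical and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: reads log text on stdin and prints only the
# cgroup-driver mismatch fragment, if one is present.
extract_driver_mismatch() {
  grep -o 'kubelet cgroup driver: "[a-z]*" is different from docker cgroup driver: "[a-z]*"'
}

# Only query the journal when journalctl is available on this host.
# -n 200 limits the search to the most recent 200 kubelet log lines.
if command -v journalctl >/dev/null 2>&1; then
  journalctl -u kubelet --no-pager -n 200 2>/dev/null | extract_driver_mismatch | tail -n 1
fi
```

If nothing is printed, the kubelet may be failing for a different reason, and the full journalctl -xeu kubelet output is the place to look.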

2. Solution

The kubelet's default cgroup driver is cgroupfs, but Kubernetes recommends systemd, so the fix is to configure both docker and the kubelet to use the systemd cgroup driver. [Officially recommended]

Reset the kubeadm state left by the failed initialization: sudo kubeadm reset;
Modify docker: in /etc/docker/daemon.json, add "exec-opts": ["native.cgroupdriver=systemd"];
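
For reference, a minimal daemon.json with only this option could look like the sketch below. It writes to a temporary file by default so it is safe to try; the DAEMON_JSON variable is a hypothetical name introduced for this example. If the real /etc/docker/daemon.json already contains other keys, add "exec-opts" to the existing object instead of overwriting the file.

```shell
#!/bin/sh
# Demo target: a temporary file. On a real host the path from the text
# is /etc/docker/daemon.json (writing there needs root).
DAEMON_JSON="${DAEMON_JSON:-$(mktemp)}"

# Minimal daemon.json containing only the cgroup driver option.
cat > "$DAEMON_JSON" <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"]
}
EOF

# Show what was written.
cat "$DAEMON_JSON"
```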

Modify kubelet

# Modify kubelet (note: this overwrites the existing config.yaml; run as root)
cat > /var/lib/kubelet/config.yaml <<EOF
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cgroupDriver: systemd
EOF
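
To confirm which driver the kubelet config file now declares, something like the following can be used. It is only a sketch: kubelet_cgroup_driver is a hypothetical helper name, and it assumes cgroupDriver appears as a top-level key on its own line, as in the file written above.

```shell
#!/bin/sh
# Hypothetical helper: print the cgroupDriver value from a kubelet
# configuration file passed as the first argument.
kubelet_cgroup_driver() {
  awk '$1 == "cgroupDriver:" {print $2; exit}' "$1"
}

# Path used in the text; only read it if it exists and is readable.
CONFIG=/var/lib/kubelet/config.yaml
if [ -r "$CONFIG" ]; then
  kubelet_cgroup_driver "$CONFIG"
fi
```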

Restart docker and kubelet

# Restart docker and kubelet
systemctl daemon-reload
systemctl restart docker
systemctl restart kubelet

Check that docker info | grep "Cgroup Driver" prints Cgroup Driver: systemd

Check systemctl status kubelet; the service has still exited:
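
The docker check can be scripted so that only the driver name is printed. This is a small sketch; parse_cgroup_driver is a hypothetical helper that assumes docker info prints a line of the form "Cgroup Driver: <name>".

```shell
#!/bin/sh
# Hypothetical helper: reads `docker info` output on stdin and prints
# the value of the "Cgroup Driver" line.
parse_cgroup_driver() {
  awk -F': *' '/Cgroup Driver/ {print $2; exit}'
}

# Only call docker when the CLI is installed.
if command -v docker >/dev/null 2>&1; then
  docker info 2>/dev/null | parse_cgroup_driver
fi
```

After the restart, the printed value should be systemd; cgroupfs means the daemon.json change was not picked up.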

$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
 Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/kubelet.service.d
         └─10-kubeadm.conf
 Active: activating (auto-restart) (Result: exit-code) since Mon 2020-09-21 16:46:49 CST; 4s ago
   Docs: https://kubernetes.io/docs/home/
Process: 26550 ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS (code=exited, status=255)
Main PID: 26550 (code=exited, status=255)

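According to this output, the configuration in /etc/systemd/system/kubelet.service.d/10-kubeadm.conf is what makes kubelet fail to start.
Remove the Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=cgroupfs" line from 10-kubeadm.conf (it is left over from an earlier deployment step), then run systemctl daemon-reload and systemctl restart kubelet so the change takes effect.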
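
Removing the leftover line can be sketched as below. The helper name remove_cgroup_override is hypothetical, and the example runs against a scratch file with sample content rather than the real drop-in. It assumes the whole Environment line exists only to set --cgroup-driver, as in this case; if KUBELET_EXTRA_ARGS carries other flags, edit the line by hand instead.

```shell
#!/bin/sh
# Hypothetical helper: delete lines that set --cgroup-driver through
# KUBELET_EXTRA_ARGS in the given drop-in file, keeping a .bak backup.
remove_cgroup_override() {
  sed -i.bak '/KUBELET_EXTRA_ARGS=.*--cgroup-driver=/d' "$1"
}

# Demo on a scratch file with sample content, not on the real drop-in.
DROPIN="$(mktemp)"
cat > "$DROPIN" <<'EOF'
[Service]
Environment="KUBELET_EXTRA_ARGS=--cgroup-driver=cgroupfs"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
EOF

remove_cgroup_override "$DROPIN"
cat "$DROPIN"
```

On a real host the same function would be run as root against /etc/systemd/system/kubelet.service.d/10-kubeadm.conf, followed by systemctl daemon-reload and systemctl restart kubelet.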

k8s troubleshooting notes: kubeadm init error: failed to run Kubelet: misconfiguration: kubelet cgroup driver: "cgroupfs" is different from docker cgroup
