NFS 共享出现有坏的超级块
yum -y install nfs-utils
systemctl start nfs-utils
systemctl enable nfs-utils
Unable to connect to the server: x509
如果主节点提示:
Unable to connect to the server: x509: certificate signed by unknown authority (possibly because of "crypto/rsa: verification error" while trying to verify candidate authority certificate "kubernetes")
这是由于启动 k8s 后忘记了后续操作, 执行以下命令即可
mkdir -p $HOME/.kube
sudo \cp /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
node(s) had taints
错误提示:
3 node(s) had taints that the pod didn't tolerate,
直译意思是节点有了污点无法容忍,执行 kubectl get no -o yaml | grep taint -A 5
之后发现该节点是不可调度的。这是因为 kubernetes 出于安全考虑默认情况下无法在 master 节点上部署 pod,于是用下面方法解决:
kubectl taint nodes --all node-role.kubernetes.io/master-
node(s) didn’t match node selector
错误提示:
4 node(s) didn't match node selector
如果指定的 label 在所有 node 上都无法匹配,则创建 Pod 失败,会提示无法调度
解决方案: 为 node 打标签
kubectl label nodes <node-name> <label-key>=<label-value>
node “xxx” not found
在使用kubectl命令的时候,未响应,使用 systemctl status kubelet
查看kubelet的状态,发现是running状态,但提示 node "xxx" not found
。
错误详情:
$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/usr/lib/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /usr/lib/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Mon 2020-11-02 18:35:54 PST; 3min 33s ago
Docs: https://kubernetes.io/docs/
Main PID: 976 (kubelet)
Tasks: 15
Memory: 158.0M
CGroup: /system.slice/kubelet.service
└─976 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kube...
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.016623 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.117495 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.217709 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.318944 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.419470 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.521154 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.622236 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.722583 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.824282 976 kubelet.go:2183] node "localhost.localdomain" not found
Nov 02 18:39:27 localhost.localdomain kubelet[976]: E1102 18:39:27.925153 976 kubelet.go:2183] node "localhost.localdomain" not found
使用日志命令跟踪错误:
journalctl -u kubelet | tail -n 100
发现是节点的IP找不到了:
Nov 02 18:41:40 localhost.localdomain kubelet[976]: E1102 18:41:40.688215 976 controller.go:136] failed to ensure node lease exists, will retry in 7s, error: Get "https://192.168.34.128:6443/apis/coordination.k8s.io/v1/namespaces/kube-node-lease/leases/localhost.localdomain?timeout=10s": net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
这是因为我之前改过虚拟机的网段,导致网络地址改变了:
查看虚拟机的ip:
$ ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.5.128 netmask 255.255.255.0 broadcast 192.168.5.255
inet6 fe80::7f7:b42a:f7ec:b0d0 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:4b:e5:fb txqueuelen 1000 (Ethernet)
RX packets 620 bytes 47995 (46.8 KiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1806 bytes 169049 (165.0 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
发现从以前的 192.168.34.128
变为了现在的 192.168.5.128
。
解决方法:
- 进入到
/etc/kubernetes
- 修改所有涉及到网络地址的conf
- 将其修改为当前的网络地址 ```bash [root@localhost ~]# cd /etc/kubernetes [root@localhost kubernetes]# ls admin.conf controller-manager.conf kubelet.conf manifests pki scheduler.conf [root@localhost kubernetes]# cat admin.conf apiVersion: v1 clusters:
- cluster: certificate-authority-data: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUM1ekNDQWMrZ0F3SUJBZ0lCQURBTkJna3Foa2lHOXcwQkFRc0ZBREFWTVJNd0VRWURWUVFERXdwcmRXSmwKY201bGRHVnpNQjRYRFRJd01URXdNakEwTlRJMU5Wb1hEVE13TVRBek1UQTBOVEkxTlZvd0ZURVRNQkVHQTFVRQpBeE1LYTNWaVpYSnVaWFJsY3pDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBTGloCmY2bkdrN3pWQm13S0w4dDd4U2NjV1RWT0hpK0NwVzBqMlp6Qy9xUEJNQkRudGZkV0lzTmlKSGlBTUswejA4QlcKUGxzVkVMTmhpTXJ5cUlGekt2YzI1Y1FHaWtmY1JLUHcrelpyOVZsK0J6WHJhNGNWUmdwdEd6NmlPL0hUelVEMQpIdFZPelpYM3JzUW9SeU5YQ0FqaENzei9xOWxYVWRGYVdsdlpvR2RaYzhsVHdOVUJwdm5MTHlCeTM4cndEVGlnCmJTVGpsWlFpeTFRd2NSaUtHWWJaL2VBVUpweGNGQTg4SlpXb3ArUWxOL3dGVnpIMzgxZ2JIT3NxTjZKNzhLWUIKeW5ZU01qS3BIK0dTa3ZqaFMybmplcmN4WTBveHNpOU5zTzRJaXBKUkVmSHMzV2diQ0ZIY2M4NDdXeGlQb1o1TAo2dXFIRFNOZ0FhQWVicFk1YlE4Q0F3RUFBYU5DTUVBd0RnWURWUjBQQVFIL0JBUURBZ0trTUE4R0ExVWRFd0VCCi93UUZNQU1CQWY4d0hRWURWUjBPQkJZRUZOYjlQR0FhR2x1V1ZkcHExVWRYTXdqZ0hkYUtNQTBHQ1NxR1NJYjMKRFFFQkN3VUFBNElCQVFDZUZzbUc4T2FpTnVqcU5DVjlaNS85dVc2K282RnYxMUw3dDExZmp1MXhMbUZrVzFMRgp1SnZ3RzZrVXc1YW1maXpjdUo2bll6cnJ6YW8wNzlmWXpUZkJhMC8wc0RRU2QxMWpWR2Z6YU1qeWVGSVFNNDVWCjErU3REUWtuTVhGQUtxZGFHQ0EwK1FvcDcxcGxGOFJlK3Y5aDk3MVhzajlWbVJWcGIvanFteHBrRExFWkIrOSsKc2dPdmlRV1VKc3ZEd1hrRm1iRi9mdHdvbkhoZXRER0luZnFDaGlwZWE3UFI4bng3aG5zRkQrRm1TdWtiZFB0eQpBSmp1QnpHcWt1d1BRTEpmVzgveDhpeVRuWHNjVVZZcHVmSFJmRDFISFVlSXYrMnVtMFlMRlJ6a0Q2Q2hBWUpNCmtqOVhCd1M5WnVjUDJrZE5YM3hpTlNqZWdhcnVPTDIvMVVSWgotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== server: https://192.168.34.128:6443 name: kubernetes contexts:
- context:
cluster: kubernetes
user: kubernetes-admin
name: kubernetes-admin@kubernetes
current-context: kubernetes-admin@kubernetes
`` 比如上面的示例,将 https://192.168.34.128:6443 改为 https://192.168.5.128:6443,其他文件,如
controller-manager.conf、
kubelet.conf、
scheduler.conf也得对应修改,
manifests` 文件夹下的配置也得对应修改。
修改举例:
sed 's/192.168.34.128/192.168.5.128/g' admin.conf
如果嫌麻烦,可以将当前文件夹下的所有文件替换掉:
[root@localhost kubernetes]# sed -i 's/192.168.34.128/192.168.5.128/g' `grep -rl 192.168.34.128 ./`
[root@localhost kubernetes]# cd manifests
[root@localhost manifests]# sed -i 's/192.168.34.128/192.168.5.128/g' `grep -rl 192.168.34.128 ./`
修改完后,使用命令重启 kubelet:
systemctl restart kubelet
如果是静态网络地址,/etc/sysconfig/network-scripts/ifcfg-ens33
中的静态地址也得对应修改,如果是DHCP就不用了。