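Problem:
The memory on our test-environment k8s cluster was exhausted and none of the pods could serve requests, so we decided to add another server as a worker node. After it joined the cluster, however, some pods failed to start and showed the status "RunContainerError".
Checking the pod log with "kubectl logs -f roxe-rss21-6cd84ff4c4-jzt6n" showed the following error: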
failed to try resolving symlinks in path "/var/log/pods/default_roxe-rss21-6cd84ff4c4-jzt6n_39338afc-a986-4da3-b10d-613e4b272fc1/roxe-rss21/3.log": lstat /var/log/pods/default_roxe-rss21-6cd84ff4c4-jzt6n_39338afc-a986-4da3-b10d-613e4b272fc1/roxe-rss21/3.log: no such file or directory
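This error only says that the container log file does not exist; it does not reveal the root cause. Running "kubectl describe pod roxe-rss21-6cd84ff4c4-jzt6n" shows the following events: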
Warning FailedCreatePodSandBox 96s kubelet, k8s-worker-node3 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "f114145d33d80456de30398987abc02421936fca498cebf7d4dc4657a70d00f7" network for pod "roxe-rss21-6cd84ff4c4-jzt6n": networkPlugin cni failed to set up pod "roxe-rss21-6cd84ff4c4-jzt6n_default" network: failed to Statfs "/proc/5377/ns/net": no such file or directory
Normal Pulling 95s kubelet, k8s-worker-node3 Pulling image "harbor.123.top/test-images/roxe-rss21:39"
Normal Pulled 93s kubelet, k8s-worker-node3 Successfully pulled image "harbor.123.top/test-images/roxe-rss21:39" in 2.205340566s
Warning Failed 93s kubelet, k8s-worker-node3 Error: failed to start container "roxe-rss21": Error response from daemon: cannot join network of a non running container: f032f1c11f72c9c19fb8fb1b0d93d704496c0e0afa21778f91a6113658537bb8
Warning Failed 92s kubelet, k8s-worker-node3 Error: failed to start container "roxe-rss21": Error response from daemon: cannot join network of a non running container: 2d7d8181fd5f5adc1fdb0f5d23057b869c2661c1da865aef603d0dd3497d1331
Normal Created 92s (x2 over 93s) kubelet, k8s-worker-node3 Created container roxe-rss21
Normal Pulled 92s kubelet, k8s-worker-node3 Container image "harbor.123.top/test-images/roxe-rss21:39" already present on machine
Warning FailedCreatePodSandBox 90s kubelet, k8s-worker-node3 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "6ce1814cd347c6cbd3c07ed07f6694a52782997b877b0705ecb51e064c00f8b5" network for pod "roxe-rss21-6cd84ff4c4-jzt6n": networkPlugin cni failed to set up pod "roxe-rss21-6cd84ff4c4-jzt6n_default" network: failed to open netns "/proc/6021/ns/net": failed to Statfs "/proc/6021/ns/net": no such file or directory
Warning FailedCreatePodSandBox 88s kubelet, k8s-worker-node3 Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "22366750cb6d6d0ea7a2560fbed2b41d45e7197d267408e5a8db19069300a11e" network for pod "roxe-rss21-6cd84ff4c4-jzt6n": networkPlugin cni failed to set up pod "roxe-rss21-6cd84ff4c4-jzt6n_default" network: failed to Statfs "/proc/6312/ns/net": no such file or directory
Warning BackOff 85s (x5 over 91s) kubelet, k8s-worker-node3 Back-off restarting failed container
Normal SandboxChanged 84s (x10 over 95s) kubelet, k8s-worker-node3 Pod sandbox changed, it will be killed and re-created.
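The likely cause is that this server had already been added to the cluster as a worker node several times before. The CNI plugin repeatedly fails to set up the pod sandbox network on k8s-worker-node3, which points to leftover network state on that node. That state needs to be cleaned up before the server is joined to the cluster again.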
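The cleanup could look roughly like the sketch below. It is not the exact procedure used here; it assumes the node was joined with kubeadm, that Docker is the container runtime (suggested by the "Error response from daemon" messages), and that the CNI plugin left behind the interfaces cni0 and flannel.1. The flannel.1 name in particular is an assumption, so substitute whatever interfaces your plugin creates. The helper names run and cleanup_node_network are hypothetical. The script only prints the commands unless APPLY=1 is set, so it can be reviewed before anything is changed on the node.

```shell
#!/bin/sh
# Sketch: clear leftover network state on a worker node before re-joining it.
# Dry-run by default; run as root with APPLY=1 to actually execute the commands.

run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    # Dry-run: only show what would be executed.
    echo "+ $*"
  fi
}

cleanup_node_network() {
  # Undo the changes made by a previous "kubeadm join" on this node.
  run kubeadm reset -f
  # kubeadm reset does not remove the CNI configuration, so delete it by hand.
  run rm -rf /etc/cni/net.d
  # Remove stale virtual interfaces (names are assumptions, adjust to your CNI plugin).
  # A missing interface is not an error for our purposes.
  run ip link delete cni0 || true
  run ip link delete flannel.1 || true
  # Restart the runtime so it starts from a clean network state (assumes Docker).
  run systemctl restart docker
}

cleanup_node_network
```

After reviewing the printed commands, run the script on the affected worker node with APPLY=1, then join the node again with the cluster's own "kubeadm join" command.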