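A liveness probe lets us define our own condition for deciding whether a container is healthy. If the probe fails, K8s kills the container and restarts it, subject to the Pod's restartPolicy.
Let's try it with an example. Prepare the following YAML and save it as liveness.yaml: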
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness
spec:
  restartPolicy: OnFailure
  containers:
  - name: liveness
    image: busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 10   # start probing 10 seconds after the container starts
      periodSeconds: 5          # probe again every 5 seconds
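The container's startup command first creates the file /tmp/healthy and deletes it 30 seconds later. In this setup, the container is considered healthy while /tmp/healthy exists and faulty otherwise.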
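This convention can be tried on any machine with a shell. The sketch below is not how the kubelet is implemented: probe is a hypothetical helper, and a scratch directory from mktemp stands in for the container's /tmp. It only illustrates that an exec-style check is judged by the command's exit status.

```shell
# Hypothetical helper: run a command and report only whether its exit status was zero.
probe() {
  if "$@" >/dev/null 2>&1; then
    echo "success"
  else
    echo "failure"
  fi
}

workdir=$(mktemp -d)            # scratch directory standing in for the container's /tmp
touch "$workdir/healthy"
probe cat "$workdir/healthy"    # prints: success (cat exits with 0)
rm -f "$workdir/healthy"
probe cat "$workdir/healthy"    # prints: failure (cat exits with a non-zero status)
rm -rf "$workdir"
```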
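The livenessProbe section defines how the liveness probe is performed:
The probe method: the cat command checks whether /tmp/healthy exists. If the command succeeds and returns zero, K8s treats this probe as successful; if the command returns a non-zero value, this probe has failed.
initialDelaySeconds: 10 starts liveness probing 10 seconds after the container starts. This value is usually set according to how long the application needs to get ready. For example, if an application normally takes 30 seconds to start, initialDelaySeconds should be greater than 30.
periodSeconds: 5 runs the liveness probe every 5 seconds. If 3 consecutive probes fail (3 is the default failureThreshold), K8s kills the container and restarts it.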
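The timing implied by these settings can be worked out with a little arithmetic. The sketch below is an idealized model, not the kubelet's actual scheduling. It assumes that probes fire exactly at initialDelaySeconds and then every periodSeconds, that a probe running at the very instant the file is deleted still sees it, and that 3 consecutive failures trigger a restart (the Kubernetes default for failureThreshold). restart_time is a hypothetical helper name.

```shell
# Hypothetical helper: print the second at which the failure threshold is reached.
# Arguments: fail_at initial_delay period threshold (all in whole seconds / counts).
restart_time() {
  fail_at=$1; initial_delay=$2; period=$3; threshold=$4
  if [ "$period" -le 0 ]; then
    echo "period must be positive" >&2
    return 1
  fi
  t=$initial_delay      # time of the current probe
  failures=0            # consecutive failed probes so far
  while :; do
    # Assumption: a probe fails only if it runs strictly after the file is removed.
    if [ "$t" -gt "$fail_at" ]; then
      failures=$((failures + 1))
      if [ "$failures" -ge "$threshold" ]; then
        echo "$t"
        return 0
      fi
    fi
    t=$((t + period))
  done
}

# Values from liveness.yaml: file removed at 30s, initial delay 10s, period 5s, threshold 3.
restart_time 30 10 5 3    # prints: 45
```

Under this model the probes at 35, 40, and 45 seconds fail, so the restart is triggered at about 45 seconds. Timings on a real cluster may differ by a few seconds.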
Next, create the Pod:
# kubectl apply -f liveness.yaml
pod/liveness created
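As the configuration shows, during the first 30 seconds /tmp/healthy exists, cat returns 0, and the liveness probe succeeds. During this period, the Events section of kubectl describe pod liveness shows only normal entries: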
# kubectl describe pod liveness
......
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 53s default-scheduler Successfully assigned default/liveness to 10.0.1.203
Normal Pulling 52s kubelet Pulling image "busybox"
Normal Pulled 43s kubelet Successfully pulled image "busybox"
Normal Created 43s kubelet Created container liveness
Normal Started 42s kubelet Started container liveness
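After about 35 seconds, the events show that /tmp/healthy no longer exists and the liveness probe has failed. Roughly 10 seconds later, once 3 consecutive probes have failed, the container is restarted. The output below was captured a few minutes in, after several such cycles: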
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 3m53s default-scheduler Successfully assigned default/liveness to 10.0.1.203
Normal Pulling 73s (x3 over 3m52s) kubelet Pulling image "busybox"
Normal Pulled 62s (x3 over 3m43s) kubelet Successfully pulled image "busybox"
Normal Created 62s (x3 over 3m43s) kubelet Created container liveness
Normal Started 62s (x3 over 3m42s) kubelet Started container liveness
Warning Unhealthy 18s (x9 over 3m8s) kubelet Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
Normal Killing 18s (x3 over 2m58s) kubelet Container liveness failed liveness probe, will be restarted
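The (x3 over ...) counters in the events indicate that the kill-and-restart cycle had already repeated by the time the output was captured. To read the restart count directly, a small wrapper around kubectl can help. This is only a sketch: restart_count is a hypothetical helper name, and it assumes the Pod has a single container, so that index 0 of containerStatuses is the one of interest.

```shell
# Hypothetical helper: print how many times the first container of a Pod has been restarted.
# Assumes kubectl is installed and pointed at the cluster that runs the Pod.
restart_count() {
  pod=$1
  kubectl get pod "$pod" -o jsonpath='{.status.containerStatuses[0].restartCount}'
}
```

Running restart_count liveness against the cluster prints the number of restarts so far. The same number appears in the RESTARTS column of kubectl get pod liveness.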
Besides liveness probes, the Kubernetes health check mechanism also includes readiness probes.