    Let's work through an example. Prepare the following YAML configuration and save it as liveness.yaml:

    apiVersion: v1
    kind: Pod
    metadata:
      labels:
        test: liveness
      name: liveness
    spec:
      restartPolicy: OnFailure
      containers:
      - name: liveness
        image: busybox
        args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -f /tmp/healthy; sleep 600
        livenessProbe:
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 10  # start probing 10 seconds after the container starts
          periodSeconds: 5         # probe again every 5 seconds

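    The container's startup command first creates the file /tmp/healthy and deletes it 30 seconds later. In this setup, the container is considered healthy while /tmp/healthy exists and failed otherwise.
    The livenessProbe section defines how the liveness check is performed:
    The check runs cat against /tmp/healthy to test whether the file exists. If the command succeeds and returns zero, K8s treats this liveness check as successful; if the command returns a non-zero value, the check has failed.
    initialDelaySeconds: 10 starts liveness checks 10 seconds after the container starts. This value is usually set according to how long the application takes to become ready. For example, if an application normally needs 30 seconds to start, initialDelaySeconds should be greater than 30.
    periodSeconds: 5 runs the liveness check every 5 seconds. If 3 consecutive liveness checks fail (3 is the default failureThreshold), K8s kills the container and restarts it.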
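The limit of 3 consecutive failures is not written in liveness.yaml; it comes from the probe's failureThreshold field, which defaults to 3. As a sketch, the same probe with its default-valued fields spelled out might look like the fragment below. The three extra fields are an illustration and are not part of the original file.

```yaml
livenessProbe:
  exec:
    command:
    - cat
    - /tmp/healthy
  initialDelaySeconds: 10
  periodSeconds: 5
  timeoutSeconds: 1      # default: each probe must finish within 1 second
  failureThreshold: 3    # default: restart after 3 consecutive failures
  successThreshold: 1    # default; must be 1 for liveness probes
```

Writing the defaults out changes nothing at runtime, but it makes the restart condition visible to anyone reading the manifest.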

    # kubectl apply -f liveness.yaml
    pod/liveness created

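    As the configuration shows, /tmp/healthy exists for the first 30 seconds, so cat returns 0 and the liveness check succeeds. During this period, the Events section of kubectl describe pod liveness shows only normal entries: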

    # kubectl describe pod liveness
      Type     Reason     Age              From               Message
      ----     ------     ----             ----               -------
      Normal   Scheduled  53s              default-scheduler  Successfully assigned default/liveness to
      Normal   Pulling    52s              kubelet            Pulling image "busybox"
      Normal   Pulled     43s              kubelet            Successfully pulled image "busybox"
      Normal   Created    43s              kubelet            Created container liveness
      Normal   Started    42s              kubelet            Started container liveness

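    After about 35 seconds, the events show that /tmp/healthy no longer exists and the liveness check fails. Once several consecutive checks have failed, the container is restarted: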

      Type     Reason     Age                  From               Message
      ----     ------     ----                 ----               -------
      Normal   Scheduled  3m53s                default-scheduler  Successfully assigned default/liveness to
      Normal   Pulling    73s (x3 over 3m52s)  kubelet            Pulling image "busybox"
      Normal   Pulled     62s (x3 over 3m43s)  kubelet            Successfully pulled image "busybox"
      Normal   Created    62s (x3 over 3m43s)  kubelet            Created container liveness
      Normal   Started    62s (x3 over 3m42s)  kubelet            Started container liveness
      Warning  Unhealthy  18s (x9 over 3m8s)   kubelet            Liveness probe failed: cat: can't open '/tmp/healthy': No such file or directory
      Normal   Killing    18s (x3 over 2m58s)  kubelet            Container liveness failed liveness probe, will be restarted
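The timing described above can be approximated with a small shell sketch. It is an idealized model, not kubelet's actual scheduling: the function name simulate_liveness is hypothetical, probes are assumed to fire at exact multiples of the period, and the probe at exactly 30 seconds is assumed to still find the file.

```shell
#!/bin/sh
# Idealized model of the liveness probe timeline (assumption: no jitter,
# probes fire exactly every <period> seconds after <initial_delay>).
# Arguments: initial_delay period failure_threshold file_removed_at
simulate_liveness() {
  initial_delay=$1
  period=$2
  failure_threshold=$3
  file_removed_at=$4

  t=$initial_delay
  failures=0
  while :; do
    if [ "$t" -le "$file_removed_at" ]; then
      # File still exists: cat succeeds and the failure counter resets.
      failures=0
      echo "t=${t}s probe=ok"
    else
      # File is gone: cat fails and the consecutive-failure counter grows.
      failures=$((failures + 1))
      echo "t=${t}s probe=fail consecutive_failures=${failures}"
    fi
    if [ "$failures" -ge "$failure_threshold" ]; then
      echo "restart triggered at t=${t}s"
      return 0
    fi
    t=$((t + period))
  done
}

# Values from liveness.yaml, plus the default failureThreshold of 3.
simulate_liveness 10 5 3 30
```

With these inputs the model places the three failing probes at 35, 40, and 45 seconds. Real timings will drift from these figures, and the cycle starts over after each restart, which is consistent with the events above showing three Unhealthy entries for every Killing entry (x9 versus x3).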

    Besides the liveness check, the Kubernetes health check mechanism also includes the readiness check.