感知Pod状态变化和探测状态改变的流程
004.kubelet容器状态同步管理
代码路径: /pkg /kubelet /status /status_manager.go
接口定义
PodStatusProvider: 提供获取 PodStatus 的方法
Manager:提供最新 Pod 状态的存储载体,也负责同步更新到 API Server
// PodStatusProvider knows how to provide status for a pod. It's intended to be used by other components
// that need to introspect status.
type PodStatusProvider interface {
// GetPodStatus returns the cached status for the provided pod UID, as well as whether it
// was a cache hit.
GetPodStatus(uid types.UID) (v1.PodStatus, bool)
}
// Manager is the Source of truth for kubelet pod status, and should be kept up-to-date with
// the latest v1.PodStatus. It also syncs updates back to the API server.
type Manager interface {
PodStatusProvider
// Start the API server status sync loop.
Start()
// SetPodStatus caches updates the cached status for the given pod, and triggers a status update.
SetPodStatus(pod *v1.Pod, status v1.PodStatus)
// SetContainerReadiness updates the cached container status with the given readiness, and
// triggers a status update.
SetContainerReadiness(podUID types.UID, containerID kubecontainer.ContainerID, ready bool)
// SetContainerStartup updates the cached container status with the given startup, and
// triggers a status update.
SetContainerStartup(podUID types.UID, containerID kubecontainer.ContainerID, started bool)
// TerminatePod resets the container status for the provided pod to terminated and triggers
// a status update.
TerminatePod(pod *v1.Pod)
// RemoveOrphanedStatuses scans the status cache and removes any entries for pods not included in
// the provided podUIDs.
RemoveOrphanedStatuses(podUIDs map[types.UID]bool)
}
Status Manager 与 Kubelet 交互图
需要注意的是,最后都会流转到 updateStatusInternal 处理,所以我们主要看的核心逻辑不言而喻
结构体
PodStatus:PodStatus 表示有关 pod 状态的信息。状态可能跟踪系统的实际状态,尤其是当承载pod的节点无法联系控制平面时。(注: pod 状态信息字段较多,跳转查看:)
// PodStatus represents information about the status of a pod. Status may trail the actual
// state of a system, especially if the node that hosts the pod cannot contact the control
// plane.
type PodStatus struct {
// The phase of a Pod is a simple, high-level summary of where the Pod is in its lifecycle.
// The conditions array, the reason and message fields, and the individual container status
// arrays contain more detail about the pod's status.
// There are five possible phase values:
//
// Pending: The pod has been accepted by the Kubernetes system, but one or more of the
// container images has not been created. This includes time before being scheduled as
// well as time spent downloading images over the network, which could take a while.
// Running: The pod has been bound to a node, and all of the containers have been created.
// At least one container is still running, or is in the process of starting or restarting.
// Succeeded: All containers in the pod have terminated in success, and will not be restarted.
// Failed: All containers in the pod have terminated, and at least one container has
// terminated in failure. The container either exited with non-zero status or was terminated
// by the system.
// Unknown: For some reason the state of the pod could not be obtained, typically due to an
// error in communicating with the host of the pod.
//
// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-phase
// +optional
Phase PodPhase `json:"phase,omitempty" protobuf:"bytes,1,opt,name=phase,casttype=PodPhase"`
// Current service state of pod.
// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-conditions
// +optional
// +patchMergeKey=type
// +patchStrategy=merge
Conditions []PodCondition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type" protobuf:"bytes,2,rep,name=conditions"`
// A human readable message indicating details about why the pod is in this condition.
// +optional
Message string `json:"message,omitempty" protobuf:"bytes,3,opt,name=message"`
// A brief CamelCase message indicating details about why the pod is in this state.
// e.g. 'Evicted'
// +optional
Reason string `json:"reason,omitempty" protobuf:"bytes,4,opt,name=reason"`
// nominatedNodeName is set only when this pod preempts other pods on the node, but it cannot be
// scheduled right away as preemption victims receive their graceful termination periods.
// This field does not guarantee that the pod will be scheduled on this node. Scheduler may decide
// to place the pod elsewhere if other nodes become available sooner. Scheduler may also decide to
// give the resources on this node to a higher priority pod that is created after preemption.
// As a result, this field may be different than PodSpec.nodeName when the pod is
// scheduled.
// +optional
NominatedNodeName string `json:"nominatedNodeName,omitempty" protobuf:"bytes,11,opt,name=nominatedNodeName"`
// IP address of the host to which the pod is assigned. Empty if not yet scheduled.
// +optional
HostIP string `json:"hostIP,omitempty" protobuf:"bytes,5,opt,name=hostIP"`
// IP address allocated to the pod. Routable at least within the cluster.
// Empty if not yet allocated.
// +optional
PodIP string `json:"podIP,omitempty" protobuf:"bytes,6,opt,name=podIP"`
// podIPs holds the IP addresses allocated to the pod. If this field is specified, the 0th entry must
// match the podIP field. Pods may be allocated at most 1 value for each of IPv4 and IPv6. This list
// is empty if no IPs have been allocated yet.
// +optional
// +patchStrategy=merge
// +patchMergeKey=ip
PodIPs []PodIP `json:"podIPs,omitempty" protobuf:"bytes,12,rep,name=podIPs" patchStrategy:"merge" patchMergeKey:"ip"`
// RFC 3339 date and time at which the object was acknowledged by the Kubelet.
// This is before the Kubelet pulled the container image(s) for the pod.
// +optional
StartTime *metav1.Time `json:"startTime,omitempty" protobuf:"bytes,7,opt,name=startTime"`
// The list has one entry per init container in the manifest. The most recent successful
// init container will have ready = true, the most recently started container will have
// startTime set.
// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-and-container-status
InitContainerStatuses []ContainerStatus `json:"initContainerStatuses,omitempty" protobuf:"bytes,10,rep,name=initContainerStatuses"`
// The list has one entry per container in the manifest.
// More info: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle#pod-and-container-status
// +optional
ContainerStatuses []ContainerStatus `json:"containerStatuses,omitempty" protobuf:"bytes,8,rep,name=containerStatuses"`
// The Quality of Service (QOS) classification assigned to the pod based on resource requirements
// See PodQOSClass type for available QOS classes
// More info: https://git.k8s.io/community/contributors/design-proposals/node/resource-qos.md
// +optional
QOSClass PodQOSClass `json:"qosClass,omitempty" protobuf:"bytes,9,rep,name=qosClass"`
// Status for any ephemeral containers that have run in this pod.
// This field is beta-level and available on clusters that haven't disabled the EphemeralContainers feature gate.
// +optional
EphemeralContainerStatuses []ContainerStatus `json:"ephemeralContainerStatuses,omitempty" protobuf:"bytes,13,rep,name=ephemeralContainerStatuses"`
}
versionedPodStatus:包装了 v1.PodStatus,额外提供了版本信息,保证过时的版本不会发送到 API Server 去
// A wrapper around v1.PodStatus that includes a version to enforce that stale pod statuses are
// not sent to the API server.
type versionedPodStatus struct {
status v1.PodStatus
// Monotonically increasing version number (per pod).
version uint64
// Pod name & namespace, for sending updates to API server.
podName string
podNamespace string
}
podStstusSyncRequest:同步 pod 请求的包装结构体
type podStatusSyncRequest struct {
podUID types.UID
status versionedPodStatus
}
manager:线程安全的实例,会同步 Pod 的状态到 API Server,只在新状态变化的时候执行写操作
// Updates pod statuses in apiserver. Writes only when new status has changed.
// All methods are thread-safe.
type manager struct {
kubeClient clientset.Interface
podManager kubepod.Manager
// Map from pod UID to sync status of the corresponding pod.
podStatuses map[types.UID]versionedPodStatus
podStatusesLock sync.RWMutex
podStatusChannel chan podStatusSyncRequest
// Map from (mirror) pod UID to latest status version successfully sent to the API server.
// apiStatusVersions must only be accessed from the sync thread.
apiStatusVersions map[kubetypes.MirrorPodUID]uint64
podDeletionSafety PodDeletionSafetyProvider
}
核心逻辑
练习
PR Need:needsReconcile 需要简化逻辑
Help Wanted
描述:这里不得不传递 Static Pod 的 uid,因为 pod manager 只支持通过 static pod 获取 mirror pod。
读者可以开始一个 PR 练习
// needsReconcile compares the given status with the status in the pod manager (which
// in fact comes from apiserver), returns whether the status needs to be reconciled with
// the apiserver. Now when pod status is inconsistent between apiserver and kubelet,
// kubelet should forcibly send an update to reconcile the inconsistence, because kubelet
// should be the source of truth of pod status.
// NOTE(random-liu): It's simpler to pass in mirror pod uid and get mirror pod by uid, but
// now the pod manager only supports getting mirror pod by static pod, so we have to pass
// static pod uid here.
// TODO(random-liu): Simplify the logic when mirror pod manager is added.
func (m *manager) needsReconcile(uid types.UID, status v1.PodStatus) bool {
// The pod could be a static pod, so we should translate first.
pod, ok := m.podManager.GetPodByUID(uid)
if !ok {
klog.V(4).InfoS("Pod has been deleted, no need to reconcile", "podUID", string(uid))
return false
}
// If the pod is a static pod, we should check its mirror pod, because only status in mirror pod is meaningful to us.
if kubetypes.IsStaticPod(pod) {
mirrorPod, ok := m.podManager.GetMirrorPodByPod(pod)
if !ok {
klog.V(4).InfoS("Static pod has no corresponding mirror pod, no need to reconcile", "pod", klog.KObj(pod))
return false
}
pod = mirrorPod
}
podStatus := pod.Status.DeepCopy()
normalizeStatus(pod, podStatus)
if isPodStatusByKubeletEqual(podStatus, &status) {
// If the status from the source is the same with the cached status,
// reconcile is not needed. Just return.
return false
}
klog.V(3).InfoS("Pod status is inconsistent with cached status for pod, a reconciliation should be triggered",
"pod", klog.KObj(pod),
"statusDiff", diff.ObjectDiff(podStatus, &status))
return true
}