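When you check a cluster's status (by running ceph -w or ceph -s), Ceph reports the current state of its placement groups (PGs). Each PG has one or more states, and the optimal PG state is active + clean.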
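Because a PG can carry several states at once, status output joins them with a plus sign. The following is a minimal Python sketch of splitting such a combined state and checking for the optimal one. It assumes the state string has already been read from the cluster, and parse_pg_state and is_optimal are hypothetical helper names, not part of any Ceph tool.

```python
def parse_pg_state(state):
    """Split a combined PG state such as 'active+clean' into a set of states."""
    return frozenset(part.strip() for part in state.split("+") if part.strip())


def is_optimal(state):
    """Return True only when the PG is exactly active + clean (an assumption:
    a PG that is also scrubbing would need a looser check)."""
    return parse_pg_state(state) == frozenset({"active", "clean"})


if __name__ == "__main__":
    for sample in ("active+clean", "active+undersized+degraded"):
        print(sample, "->", sorted(parse_pg_state(sample)), is_optimal(sample))
```

The strict equality check is a deliberate simplification; relax it if transient states should also count as healthy.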
Each PG state is explained below:
creating
Ceph is still creating the placement group.
activating
The placement group is peered but not yet active.
active
Ceph will process requests to the placement group.
clean
Ceph has replicated all objects in the placement group the correct
number of times.
down
A replica with necessary data is down, so the placement group is
offline.
scrubbing
Ceph is checking the placement group metadata for inconsistencies.
deep
Ceph is checking the placement group data against stored checksums.
degraded
Ceph has not yet replicated some objects in the placement group the
correct number of times.
inconsistent
Ceph detects inconsistencies in one or more replicas of an
object in the placement group (e.g. objects are the wrong size,
objects are missing from one replica after recovery finished,
etc.).
peering
The placement group is undergoing the peering process.
repair
Ceph is checking the placement group and repairing any
inconsistencies it finds (if possible).
recovering
Ceph is migrating/synchronizing objects and their replicas.
forced_recovery
The user has enforced a high recovery priority for this PG.
recovery_wait
The placement group is waiting in line to start recovery.
recovery_toofull
A recovery operation is waiting because the destination OSD is over
its full ratio.
recovery_unfound
Recovery stopped due to unfound objects.
backfilling
Ceph is scanning and synchronizing the entire contents of a
placement group instead of inferring what contents need to be
synchronized from the logs of recent operations. Backfill is a
special case of recovery.
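The difference between log-based recovery and backfill can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not Ceph's actual algorithm: the log is a plain list of (version, object name) pairs, and objects_to_sync is a hypothetical function name.

```python
def objects_to_sync(log, replica_version, all_objects):
    """Choose between log-based recovery and backfill (toy model).

    log: list of (version, object_name) pairs, oldest first, with
         consecutive versions.
    replica_version: last version the replica is known to have applied.
    all_objects: every object name in the placement group.
    """
    if log and log[0][0] <= replica_version + 1:
        # The log reaches back far enough: sync only the objects that
        # were touched after replica_version.
        touched = {name for version, name in log if version > replica_version}
        return "recovery", sorted(touched)
    # The log does not cover the gap: scan and sync everything.
    return "backfill", sorted(all_objects)


if __name__ == "__main__":
    log = [(5, "a"), (6, "b"), (7, "a")]
    print(objects_to_sync(log, 5, {"a", "b", "c"}))
    print(objects_to_sync(log, 2, {"a", "b", "c"}))
```

In this model a replica that fell too far behind the retained log forces a full scan, which is why backfill typically moves more data than ordinary recovery.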
forced_backfill
The user has enforced a high backfill priority for this PG.
backfill_wait
The placement group is waiting in line to start backfill.
backfill_toofull
A backfill operation is waiting because the destination OSD is over
its full ratio.
backfill_unfound
Backfill stopped due to unfound objects.
incomplete
Ceph detects that a placement group is missing information about
writes that may have occurred, or does not have any healthy copies.
If you see this state, try to start any failed OSDs that may contain
the needed information. In the case of an erasure coded pool,
temporarily reducing min_size may allow recovery.
stale
The placement group is in an unknown state: the monitors have not
received an update for it since the placement group mapping changed.
remapped
The placement group is temporarily mapped to a different set of OSDs
from what CRUSH specified.
undersized
The placement group has fewer copies than the configured pool
replication level.
peered
The placement group has peered, but cannot serve client IO because it
does not have enough copies to reach the pool's configured min_size
parameter. Recovery may occur in this state, so the PG may heal up
to min_size eventually.
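The relationship between undersized, peered, and active can be sketched as a comparison of the available copy count against the pool's size and min_size. This is a simplified model that assumes peering has already succeeded; availability_flags is a hypothetical name and the function ignores every other state.

```python
def availability_flags(copies, size, min_size):
    """Derive availability-related PG states from copy counts (simplified).

    copies: number of copies currently available.
    size: the pool's configured replication level.
    min_size: minimum number of copies required to serve client IO.
    """
    flags = set()
    if copies < size:
        # Fewer copies than the configured replication level.
        flags.add("undersized")
    if copies < min_size:
        # Peered, but not enough copies to serve client IO.
        flags.add("peered")
    else:
        flags.add("active")
    return flags


if __name__ == "__main__":
    for copies in (3, 2, 1):
        print(copies, sorted(availability_flags(copies, size=3, min_size=2)))
```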
snaptrim
Ceph is trimming snapshots.
snaptrim_wait
The placement group is queued to trim snapshots.
snaptrim_error
Snapshot trimming stopped because of an error.
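With many PGs, it can help to tally how often each individual state occurs. The sketch below assumes a list of combined state strings has already been collected; the sample data is made up for illustration and count_states is a hypothetical helper.

```python
from collections import Counter


def count_states(pg_states):
    """Count how many PGs carry each individual state."""
    counts = Counter()
    for combined in pg_states:
        for state in combined.split("+"):
            state = state.strip()
            if state:
                counts[state] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real cluster.
    sample = [
        "active+clean",
        "active+clean",
        "active+undersized+degraded",
        "active+clean+scrubbing+deep",
    ]
    for state, n in count_states(sample).most_common():
        print(state, n)
```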