1、Background

The Ceph cluster originally had 9 OSDs and had recently started running out of space, so 5 more OSDs were added. Once they joined, the cluster entered recovery and backfill, which takes a long time to complete: `ceph -s` showed a recovery rate of only 18 objects/s. The recovery speed therefore needed to be tuned.

```bash
]$ ceph -s
  cluster:
    id:     5adf323c-bef2-42b4-8eff-7a164be1c7fa
    health: HEALTH_WARN
            Degraded data redundancy: 2246888/205599948 objects degraded (1.093%), 10 pgs degraded, 10 pgs undersized

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 11m)
    mgr: ceph-mon1(active, since 6d), standbys: ceph-mon2
    mds: cephfs:1 {0=ceph-mon2=up:active} 1 up:standby
    osd: 14 osds: 14 up (since 33h), 14 in (since 41h); 114 remapped pgs
    rgw: 3 daemons active (ceph-mon1, ceph-mon2, ceph-mon3)

  task status:

  data:
    pools:   8 pools, 352 pgs
    objects: 34.27M objects, 2.9 TiB
    usage:   5.4 TiB used, 7.3 TiB / 13 TiB avail
    pgs:     2246888/205599948 objects degraded (1.093%)
             77284550/205599948 objects misplaced (37.590%)
             238 active+clean
             96  active+remapped+backfill_wait
             10  active+undersized+degraded+remapped+backfilling
             8   active+remapped+backfilling

  io:
    client:   1.1 MiB/s rd, 14 op/s rd, 0 op/s wr
    recovery: 1.3 MiB/s, 18 objects/s

  progress:
    Rebalancing after osd.13 marked in
      [=============================.]
    Rebalancing after osd.10 marked in
      [=======================.......]
    Rebalancing after osd.11 marked in
      [===========================...]
    Rebalancing after osd.12 marked in
      [===========================...]
    Rebalancing after osd.9 marked in
      [==================............]
```
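For reference, the recovery rate can be watched continuously while tuning. A minimal sketch using plain `watch` and `grep` over the human-readable output (the exact line layout may differ slightly between releases):

```bash
# Refresh the client/recovery throughput lines every 10 seconds
watch -n 10 "ceph -s | grep -E 'client|recovery'"
```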

2、Viewing and tuning the parameters

2.1、Viewing the recovery and backfill related parameters

```bash
]$ ceph daemon osd.0 config show |grep -E 'backfill|recovery'
    "bluefs_replay_recovery": "false",
    "bluefs_replay_recovery_disable_compact": "false",
    "mon_osd_backfillfull_ratio": "0.900000",
    "osd_allow_recovery_below_min_size": "true",
    "osd_async_recovery_min_cost": "100",
    "osd_backfill_retry_interval": "30.000000",
    "osd_backfill_scan_max": "512",
    "osd_backfill_scan_min": "64",
    "osd_debug_pretend_recovery_active": "false",
    "osd_debug_reject_backfill_probability": "0.000000",
    "osd_debug_skip_full_check_in_backfill_reservation": "false",
    "osd_debug_skip_full_check_in_recovery": "false",
    "osd_force_recovery_pg_log_entries_factor": "1.300000",
    "osd_kill_backfill_at": "0",
    "osd_max_backfills": "1",
    "osd_min_recovery_priority": "0",
    "osd_recovery_cost": "20971520",
    "osd_recovery_delay_start": "0.000000",
    "osd_recovery_max_active": "3",
    "osd_recovery_max_chunk": "8388608",
    "osd_recovery_max_omap_entries_per_chunk": "8096",
    "osd_recovery_max_single_start": "1",
    "osd_recovery_op_priority": "3",
    "osd_recovery_op_warn_multiple": "16",
    "osd_recovery_priority": "5",
    "osd_recovery_retry_interval": "30.000000",
    "osd_recovery_sleep": "0.000000",
    "osd_recovery_sleep_hdd": "0.100000",
    "osd_recovery_sleep_hybrid": "0.025000",
    "osd_recovery_sleep_ssd": "0.100000",
    "osd_repair_during_recovery": "false",
    "osd_scrub_during_recovery": "false",

2.2、Parameter descriptions

  • osd_max_backfills
    • The maximum number of backfill operations allowed on a single OSD.
    • Default: 1
  • osd_recovery_max_active
    • The maximum number of active recovery requests per OSD.
    • Default: 3
  • osd_recovery_sleep_hdd
    • Introduced after Luminous (earlier releases only had osd_recovery_sleep); the sleep time inserted before each recovery or backfill operation on HDD-backed OSDs (see the rough calculation after this list).
    • Default: 0.1
  • osd_backfill_scan_min
    • The minimum number of objects scanned per backfill scan.
    • Default: 64
  • osd_backfill_scan_max
    • The maximum number of objects scanned per backfill scan.
    • Default: 512
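A back-of-the-envelope illustration of why the defaults produce such a low rate, assuming the sleep is applied before each recovery/backfill operation on an HDD-backed OSD and that osd_max_backfills caps concurrent backfill reservations per OSD:

```bash
# osd_recovery_sleep_hdd = 0.1 s  ->  roughly a 10 ops/s ceiling per OSD
awk 'BEGIN { print 1 / 0.1 }'
# osd_max_backfills = 1           ->  at most one backfill reservation per OSD,
# so only a handful of PGs backfill at the same time, which is consistent
# with the ~18 objects/s observed above
```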

2.3、Parameter tuning

Tuning depends on the specific cluster: test and validate to find the best values. Blindly changing these parameters can make performance worse rather than better, and can even impact normal client read/write requests. The values used here:

```bash
osd max backfills = 10        # 8 or 10; setting it too high will cause slow requests
osd recovery max active = 15
osd_recovery_sleep_hdd = 0
osd_recovery_sleep_ssd = 0
```
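Because aggressive values can starve client I/O, keep an eye out for slow requests while ramping these up; a minimal check:

```bash
# Any slow-ops / blocked-request warnings will show up in health detail
]$ ceph health detail | grep -iE 'slow|blocked'
```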
      

2.4、Commands to apply the changes

```bash
]$ ceph tell osd.\* injectargs '--osd_max_backfills=10'
]$ ceph tell osd.\* injectargs '--osd_recovery_max_active=15'
]$ ceph tell osd.\* injectargs '--osd_recovery_sleep_hdd=0'
]$ ceph tell osd.\* injectargs '--osd_recovery_sleep_ssd=0'
```
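Note that injectargs only changes the running daemons; the values revert when an OSD restarts. If the settings should persist, they can also be written to the monitors' central config database (Nautilus and later); a sketch:

```bash
]$ ceph config set osd osd_max_backfills 10
]$ ceph config set osd osd_recovery_max_active 15
]$ ceph config set osd osd_recovery_sleep_hdd 0
]$ ceph config set osd osd_recovery_sleep_ssd 0
```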
      

3、Parameters and recovery speed after tuning

```bash
]$ ceph daemon osd.0 config show |grep -E 'backfill|recovery'
    "bluefs_replay_recovery": "false",
    "bluefs_replay_recovery_disable_compact": "false",
    "mon_osd_backfillfull_ratio": "0.900000",
    "osd_allow_recovery_below_min_size": "true",
    "osd_async_recovery_min_cost": "100",
    "osd_backfill_retry_interval": "30.000000",
    "osd_backfill_scan_max": "512",
    "osd_backfill_scan_min": "64",
    "osd_debug_pretend_recovery_active": "false",
    "osd_debug_reject_backfill_probability": "0.000000",
    "osd_debug_skip_full_check_in_backfill_reservation": "false",
    "osd_debug_skip_full_check_in_recovery": "false",
    "osd_force_recovery_pg_log_entries_factor": "1.300000",
    "osd_kill_backfill_at": "0",
    "osd_max_backfills": "10",
    "osd_min_recovery_priority": "0",
    "osd_recovery_cost": "20971520",
    "osd_recovery_delay_start": "0.000000",
    "osd_recovery_max_active": "15",
    "osd_recovery_max_chunk": "8388608",
    "osd_recovery_max_omap_entries_per_chunk": "8096",
    "osd_recovery_max_single_start": "1",
    "osd_recovery_op_priority": "3",
    "osd_recovery_op_warn_multiple": "16",
    "osd_recovery_priority": "5",
    "osd_recovery_retry_interval": "30.000000",
    "osd_recovery_sleep": "0.000000",
    "osd_recovery_sleep_hdd": "0.000000",
    "osd_recovery_sleep_hybrid": "0.025000",
    "osd_recovery_sleep_ssd": "0.000000",
    "osd_repair_during_recovery": "false",
    "osd_scrub_during_recovery": "false",
```

```bash
]$ ceph -s
  cluster:
    id:     5adf323c-bef2-42b4-8eff-7a164be1c7fa
    health: HEALTH_WARN
            Degraded data redundancy: 1862088/205599948 objects degraded (0.906%), 10 pgs degraded, 10 pgs undersized

  services:
    mon: 3 daemons, quorum ceph-mon1,ceph-mon2,ceph-mon3 (age 51m)
    mgr: ceph-mon1(active, since 6d), standbys: ceph-mon2
    mds: cephfs:1 {0=ceph-mon2=up:active} 1 up:standby
    osd: 14 osds: 14 up (since 34h), 14 in (since 42h); 114 remapped pgs
    rgw: 3 daemons active (ceph-mon1, ceph-mon2, ceph-mon3)

  task status:

  data:
    pools:   8 pools, 352 pgs
    objects: 34.27M objects, 2.9 TiB
    usage:   5.4 TiB used, 7.3 TiB / 13 TiB avail
    pgs:     1862088/205599948 objects degraded (0.906%)
             75829821/205599948 objects misplaced (36.882%)
             238 active+clean
             96  active+remapped+backfill_wait
             10  active+undersized+degraded+remapped+backfilling
             8   active+remapped+backfilling

  io:
    client:   435 KiB/s rd, 11 op/s rd, 0 op/s wr
    recovery: 41 MiB/s, 460 objects/s

  progress:
    Rebalancing after osd.13 marked in
      [=============================.]
    Rebalancing after osd.10 marked in
      [=======================.......]
    Rebalancing after osd.11 marked in
      [===========================...]
    Rebalancing after osd.12 marked in
      [===========================...]
    Rebalancing after osd.9 marked in
      [==================............]
```

As the output shows, the recovery rate has reached 460 objects/s.
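Once backfill completes and the cluster returns to HEALTH_OK, it is worth rolling the values back so that client I/O is not penalized during the next recovery event; a sketch that restores the defaults shown in section 2.1:

```bash
]$ ceph tell osd.\* injectargs '--osd_max_backfills=1'
]$ ceph tell osd.\* injectargs '--osd_recovery_max_active=3'
]$ ceph tell osd.\* injectargs '--osd_recovery_sleep_hdd=0.1'
]$ ceph tell osd.\* injectargs '--osd_recovery_sleep_ssd=0.1'
```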