Hardware Configuration

Item         Details
Server       Dell PowerEdge R630
CPU          E5-2603 v4 * 2
Memory       128GB
OS Disk      SEAGATE 300G SAS * 2 (RAID1)
SSD Disk     Intel DC S3710 (400GB) * 1
OSD Disk     Dell HDD (SAS) 2.4T 2.5-inch * 5
NIC          10GbE NIC * 2 (public_network + cluster_network) + public IP * 1 + IPMI management port
OS           CentOS Linux release 7.8.2003 (Core)
Kernel       3.10.0-1127.el7.x86_64
Ceph         v14.2.10 (nautilus)
ceph-deploy  2.0.1

System Architecture

Hostname     IP_Address            Services
Ceph-node1   10.0.0.10, 10.0.1.10  admin, osd, mon, mgr, mds
Ceph-node2   10.0.0.20, 10.0.1.20  osd, mon, mgr, mds
Ceph-node3   10.0.0.30, 10.0.1.30  osd, mon, mgr, mds
Ceph-client  10.0.1.40             /

Deployment Procedure

System pre-checks (all nodes)

Configure yum repositories

Add the Aliyun mirrors for CentOS 7
# curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

Configure the Ceph repository
# vim /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
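
After the repo files are in place, rebuilding the yum metadata cache makes sure the new repositories are picked up (optional but harmless):

# yum clean all
# yum makecache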

Install required packages

Install basic system utilities

[root@Ceph1 yum.repos.d]# yum install vim net-tools lrzsz htop sysstat iotop iftop -y

Install ceph-deploy

# yum list ceph-deploy
Loaded plugins: fastestmirror, priorities
Determining fastest mirrors
8 packages excluded due to repository priority protections
Installed Packages
ceph-deploy.noarch                          2.0.1-0                           @Ceph-noarc

# yum install ceph-deploy

# Raise the kernel pid_max limit
echo 4194303 > /proc/sys/kernel/pid_max
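
The echo above only lasts until the next reboot. A minimal sketch of making it persistent via sysctl (the file name 99-ceph.conf is just an example):

# echo "kernel.pid_max = 4194303" > /etc/sysctl.d/99-ceph.conf
# sysctl -p /etc/sysctl.d/99-ceph.conf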

Edit /etc/hosts

# vim /etc/hosts     
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.10 ceph-node1
10.0.0.20 ceph-node2
10.0.0.30 ceph-node3

Firewall

systemctl stop firewalld.service
systemctl disable firewalld.service

SELinux

sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
setenforce 0

NTP time synchronization

# Configure periodic time synchronization via cron

# crontab -e
*/1 * * * * /usr/sbin/ntpdate 1.1.1.1 >/dev/null 2>&1
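
As an alternative to the ntpdate cron job, CentOS 7 ships chrony, which keeps the clock synchronized continuously; a rough sketch (point it at your own NTP servers in /etc/chrony.conf):

# yum install chrony -y
# systemctl enable chronyd && systemctl start chronyd
# chronyc sources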

Configure passwordless SSH login

# ssh-keygen

# ssh-copy-id root@Ceph-node1
# ssh-copy-id root@Ceph-node2
# ssh-copy-id root@Ceph-node3


# vim ~/.ssh/config
Host Ceph-node1
        Hostname Ceph-node1
        User root
Host Ceph-node2
        Hostname Ceph-node2
        User root
Host Ceph-node3
        Hostname Ceph-node3
        User root
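
A quick check from the admin node that key-based login works for all hosts configured above:

# for n in Ceph-node1 Ceph-node2 Ceph-node3; do ssh $n hostname; done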

Ceph and the ceph-deploy admin tool can also be pre-installed ahead of time (all nodes):

yum install ceph -y

# This may install a 1.x version of ceph-deploy
yum install ceph-deploy -y

# Optionally install the 2.x version directly
yum install -y http://mirrors.aliyun.com/ceph/rpm-mimic/el7/noarch/ceph-deploy-2.0.1-0.noarch.rpm

Disk Partitioning

The partition layout below is only a reference for the test environment; always follow your actual production environment.

Sizing reference for block.db and block.wal in a BlueStore deployment: the upstream BlueStore guidance suggests that block.db generally should not be smaller than roughly 4% of the data device, so size it according to your actual disks and workload.


Since the Luminous release, OSDs are managed with ceph-volume, and upstream recommends managing disks with LVM; setting LVM up makes later batch deployment and management much easier, as the commands below will show.

So in a real production environment, managing partitions and volume groups with LVM is strongly recommended!

(That said, pointing BlueStore directly at raw disk partitions also works, if LVM feels like too much trouble or this is purely a test.)

Notes:

The data disk /dev/sdc is partitioned here only to simulate multiple OSDs in a test setup; in real production, OSD data disks do not need to be partitioned.

Where both an ansible command and a plain shell script are shown below, pick just one of the two approaches.
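
For reference, a minimal sketch of how the test partitions could be created with sgdisk, assuming the layout shown in the lsblk output further below (26 x 1 GiB partitions on /dev/sdb for wal/db and 13 x 10 GiB partitions on /dev/sdc for data; this wipes the disks):

# yum install gdisk -y
# sgdisk -o /dev/sdb                               # new empty GPT label (destroys existing data)
# for i in {1..26}; do sgdisk -n 0:0:+1G /dev/sdb; done
# sgdisk -o /dev/sdc
# for i in {1..13}; do sgdisk -n 0:0:+10G /dev/sdc; done
# partprobe /dev/sdb /dev/sdc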

OSD LVM

# VGS
#ansible:
[root@ceph-node1 ~]# ansible ceph -m shell -a 'vgcreate datavg1 /dev/sdb'
......
#shell
[root@ceph1 ~]# vim vgs.sh
#!/bin/bash
for i in {1..13}
do
        vgcreate datavg$i /dev/sdc$i >/dev/null 2>&1
done



# LVS
# ansible:
[root@ceph-node1 ~]# ansible ceph -m shell -a 'lvcreate -n datalv1 -l 100%Free datavg1'
......
#shell
[root@ceph1 ~]# vim lvs.sh
#!/bin/bash
for i in {1..13}
do
        lvcreate -n datalv$i -l 100%Free datavg$i >/dev/null 2>&1
done

wal/db LVM

VGS - wal
# ansible ceph -m shell -a 'vgcreate block_wal_vg1 /dev/sde1'
......
# shell
[root@ceph1 ~]# vim vgs_wal.sh 
#!/bin/bash
for i in {1..13}
do
        vgcreate block_wal_vg$i /dev/sdb$i >/dev/null 2>&1
done


VGS - db
# ansible ceph -m shell -a 'vgcreate block_db_vg1 /dev/sde5'
......
# shell
[root@ceph1 ~]# vim vgs_db.sh 
#!/bin/bash
j=1
for i in {14..26}
do
        vgcreate block_db_vg$j /dev/sdb$i >/dev/null 2>&1
        j=$[j+1]
done



LVS - wal
# ansible ceph -m shell -a 'lvcreate -n wallv1 -l 100%Free block_wal_vg1'
......
# shell
[root@ceph1 ~]# vim lvs_wal.sh 
#!/bin/bash
for i in {1..13}
do
        lvcreate -n wallv$i -l 100%Free block_wal_vg$i >/dev/null 2>&1
done


LVS - db
# ansible ceph -m shell -a 'lvcreate -n dblv1 -l 100%Free block_db_vg1'
......
# shell
[root@ceph1 ~]# vim lvs_db.sh 
#!/bin/bash
for i in {1..13}
do
        lvcreate -n dblv$i -l 100%Free block_db_vg$i >/dev/null 2>&1
done

[root@ceph3 ~]# lsblk 
NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                          8:0    0   30G  0 disk 
├─sda1                       8:1    0    1G  0 part /boot
└─sda2                       8:2    0   29G  0 part 
  ├─centos-root            253:0    0   27G  0 lvm  /
  └─centos-swap            253:1    0    2G  0 lvm  [SWAP]
sdb                          8:16   0   50G  0 disk 
├─sdb1                       8:17   0    1G  0 part 
│ └─block_wal_vg1-wallv1   253:15   0 1020M  0 lvm  
├─sdb2                       8:18   0    1G  0 part 
│ └─block_wal_vg2-wallv2   253:16   0 1020M  0 lvm  
├─sdb3                       8:19   0    1G  0 part 
│ └─block_wal_vg3-wallv3   253:17   0 1020M  0 lvm  
├─sdb4                       8:20   0    1G  0 part 
│ └─block_wal_vg4-wallv4   253:18   0 1020M  0 lvm  
├─sdb5                       8:21   0    1G  0 part 
│ └─block_wal_vg5-wallv5   253:19   0 1020M  0 lvm  
├─sdb6                       8:22   0    1G  0 part 
│ └─block_wal_vg6-wallv6   253:20   0 1020M  0 lvm  
├─sdb7                       8:23   0    1G  0 part 
│ └─block_wal_vg7-wallv7   253:21   0 1020M  0 lvm  
├─sdb8                       8:24   0    1G  0 part 
│ └─block_wal_vg8-wallv8   253:22   0 1020M  0 lvm  
├─sdb9                       8:25   0    1G  0 part 
│ └─block_wal_vg9-wallv9   253:23   0 1020M  0 lvm  
├─sdb10                      8:26   0    1G  0 part 
│ └─block_wal_vg10-wallv10 253:24   0 1020M  0 lvm  
├─sdb11                      8:27   0    1G  0 part 
│ └─block_wal_vg11-wallv11 253:25   0 1020M  0 lvm  
├─sdb12                      8:28   0    1G  0 part 
│ └─block_wal_vg12-wallv12 253:26   0 1020M  0 lvm  
├─sdb13                      8:29   0    1G  0 part 
│ └─block_wal_vg13-wallv13 253:27   0 1020M  0 lvm  
├─sdb14                      8:30   0    1G  0 part 
│ └─block_db_vg1-dblv1     253:28   0 1020M  0 lvm  
├─sdb15                      8:31   0    1G  0 part 
│ └─block_db_vg2-dblv2     253:29   0 1020M  0 lvm  
├─sdb16                    259:0    0    1G  0 part 
│ └─block_db_vg3-dblv3     253:30   0 1020M  0 lvm  
├─sdb17                    259:1    0    1G  0 part 
│ └─block_db_vg4-dblv4     253:31   0 1020M  0 lvm  
├─sdb18                    259:2    0    1G  0 part 
│ └─block_db_vg5-dblv5     253:32   0 1020M  0 lvm  
├─sdb19                    259:3    0    1G  0 part 
│ └─block_db_vg6-dblv6     253:33   0 1020M  0 lvm  
├─sdb20                    259:4    0    1G  0 part 
│ └─block_db_vg7-dblv7     253:34   0 1020M  0 lvm  
├─sdb21                    259:5    0    1G  0 part 
│ └─block_db_vg8-dblv8     253:35   0 1020M  0 lvm  
├─sdb22                    259:6    0    1G  0 part 
│ └─block_db_vg9-dblv9     253:36   0 1020M  0 lvm  
├─sdb23                    259:7    0    1G  0 part 
│ └─block_db_vg10-dblv10   253:37   0 1020M  0 lvm  
├─sdb24                    259:8    0    1G  0 part 
│ └─block_db_vg11-dblv11   253:38   0 1020M  0 lvm  
├─sdb25                    259:9    0    1G  0 part 
│ └─block_db_vg12-dblv12   253:39   0 1020M  0 lvm  
└─sdb26                    259:10   0    1G  0 part 
  └─block_db_vg13-dblv13   253:40   0 1020M  0 lvm  
sdc                          8:32   0  150G  0 disk 
├─sdc1                       8:33   0   10G  0 part 
│ └─datavg1-datalv1        253:2    0   10G  0 lvm  
├─sdc2                       8:34   0   10G  0 part 
│ └─datavg2-datalv2        253:3    0   10G  0 lvm  
├─sdc3                       8:35   0   10G  0 part 
│ └─datavg3-datalv3        253:4    0   10G  0 lvm  
├─sdc4                       8:36   0   10G  0 part 
│ └─datavg4-datalv4        253:5    0   10G  0 lvm  
├─sdc5                       8:37   0   10G  0 part 
│ └─datavg5-datalv5        253:6    0   10G  0 lvm  
├─sdc6                       8:38   0   10G  0 part 
│ └─datavg6-datalv6        253:7    0   10G  0 lvm  
├─sdc7                       8:39   0   10G  0 part 
│ └─datavg7-datalv7        253:8    0   10G  0 lvm  
├─sdc8                       8:40   0   10G  0 part 
│ └─datavg8-datalv8        253:9    0   10G  0 lvm  
├─sdc9                       8:41   0   10G  0 part 
│ └─datavg9-datalv9        253:10   0   10G  0 lvm  
├─sdc10                      8:42   0   10G  0 part 
│ └─datavg10-datalv10      253:11   0   10G  0 lvm  
├─sdc11                      8:43   0   10G  0 part 
│ └─datavg11-datalv11      253:12   0   10G  0 lvm  
├─sdc12                      8:44   0   10G  0 part 
│ └─datavg12-datalv12      253:13   0   10G  0 lvm  
└─sdc13                      8:45   0   10G  0 part 
  └─datavg13-datalv13      253:14   0   10G  0 lvm  
sr0                         11:0    1  906M  0 rom

At this point, all the LVM preparation is done!


Create the Cluster

Deploy Mon

# Create a cluster directory to hold the ceph config files
[root@ceph1 ~]# mkdir /cluster;cd /cluster

# Initialize the cluster
[root@ceph1 cluster]# ceph-deploy new ceph1 ceph2 ceph3 --public-network=192.168.18.0/24 --cluster-network=10.0.2.0/24

# Deploy the monitors
[root@ceph1 cluster]# ceph-deploy mon create-initial

# Copy the config file and admin keyring to the other monitor nodes
[root@ceph1 cluster]# ceph-deploy admin ceph1 ceph2 ceph3

# Once this completes, check the cluster status with ceph -s:

[root@ceph1 cluster]# ceph -s
  cluster:
    id:     5ae631a5-b4ee-4949-944b-e6e36bf1f950
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph3,ceph2
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0B
    usage:   0B used, 0B / 0B avail
    pgs:     

Output like the above indicates the cluster has been configured successfully.

Deploy OSDs

ceph1:

ceph-deploy osd create ceph1 --bluestore --block-wal block_wal_vg1/wallv1 --block-db block_db_vg1/dblv1 --data datavg1/datalv1
ceph-deploy osd create ceph1 --bluestore --block-wal block_wal_vg2/wallv2 --block-db block_db_vg2/dblv2 --data datavg2/datalv2
ceph-deploy osd create ceph1 --bluestore --block-wal block_wal_vg3/wallv3 --block-db block_db_vg3/dblv3 --data datavg3/datalv3
.......
And so on, until all 13 OSDs are created. Make sure each LVM volume group pairing matches; running the commands one at a time makes it easier to spot errors right away.

ceph2:

ceph-deploy osd create ceph2 --bluestore --block-wal block_wal_vg1/wallv1 --block-db block_db_vg1/dblv1 --data datavg1/datalv1
ceph-deploy osd create ceph2 --bluestore --block-wal block_wal_vg2/wallv2 --block-db block_db_vg2/dblv2 --data datavg2/datalv2
ceph-deploy osd create ceph2 --bluestore --block-wal block_wal_vg3/wallv3 --block-db block_db_vg3/dblv3 --data datavg3/datalv3
.......
And so on, until all 13 OSDs are created. Make sure each LVM volume group pairing matches; running the commands one at a time makes it easier to spot errors right away.

ceph3:

ceph-deploy osd create ceph3 --bluestore --block-wal block_wal_vg1/wallv1 --block-db block_db_vg1/dblv1 --data datavg1/datalv1
ceph-deploy osd create ceph3 --bluestore --block-wal block_wal_vg2/wallv2 --block-db block_db_vg2/dblv2 --data datavg2/datalv2
ceph-deploy osd create ceph3 --bluestore --block-wal block_wal_vg3/wallv3 --block-db block_db_vg3/dblv3 --data datavg3/datalv3
.......
And so on, until all 13 OSDs are created. Make sure each LVM volume group pairing matches; running the commands one at a time makes it easier to spot errors right away.

Batch-deploy the OSDs on a node:

#!/bin/bash

node="ceph-node1"
for i in {1..13}
do
        ceph-deploy osd create $node --bluestore --block-wal block_wal_vg$i/wallv$i --block-db block_db_vg$i/dblv$i --data datavg$i/datalv$i
done
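
If every node uses the same VG/LV naming (as in this setup), the same loop can simply be wrapped over the node list; a sketch, run from the /cluster directory on the admin node (substitute your actual hostnames):

#!/bin/bash
for node in ceph1 ceph2 ceph3
do
        for i in {1..13}
        do
                ceph-deploy osd create $node --bluestore --block-wal block_wal_vg$i/wallv$i --block-db block_db_vg$i/dblv$i --data datavg$i/datalv$i
        done
done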

Deploy mgr (enable the Dashboard)

The Ceph Manager daemons run in active/standby mode. Deploying additional manager daemons ensures that if one daemon or its host fails, another can take over without interrupting service. The mgr also provides the Dashboard module for real-time monitoring, which is well worth enabling.

ceph-deploy mgr create ceph1 ceph2 ceph3

ceph mgr module enable dashboard

ceph dashboard create-self-signed-cert

ceph dashboard set-login-credentials ceph ceph
[root@ceph-node1 ~]# ceph mgr services 
{
    "dashboard": "https://ceph-node1:8443/",
    "prometheus": "http://ceph-node1:9283/"
}

[root@ceph-node1 ~]# ss -tulnp | grep mgr
tcp    LISTEN     0      128    10.0.0.10:6800                  *:*                   users:(("ceph-mgr",pid=997113,fd=26))
tcp    LISTEN     0      128    10.0.0.10:6801                  *:*                   users:(("ceph-mgr",pid=997113,fd=29))
tcp    LISTEN     0      5      [::]:9283               [::]:*                   users:(("ceph-mgr",pid=997113,fd=35))
tcp    LISTEN     0      5      [::]:8443               [::]:*                   users:(("ceph-mgr",pid=997113,fd=36))
[root@ceph-node1 ~]#

Enable the Prometheus module and hook it up to Grafana

ceph mgr module enable prometheus

Install and deploy Prometheus and Grafana
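
A minimal sketch of the Prometheus scrape job for the mgr prometheus module, assuming Prometheus runs on a separate host and already has a scrape_configs section; the mgr exporter listens on port 9283, as the ss output above shows (the job name and file path are illustrative):

# cat >> /etc/prometheus/prometheus.yml <<'EOF'
  - job_name: 'ceph'
    static_configs:
      - targets: ['ceph-node1:9283', 'ceph-node2:9283', 'ceph-node3:9283']
EOF

Only the active mgr actually serves metrics at any given time, so all mgr hosts are usually listed as targets.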

Deploy mds

To use CephFS, at least one metadata server is required.

ceph-deploy mds create ceph1 ceph2 ceph3
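
Creating the mds daemons alone does not create a filesystem; the fs_data/fs_metadata pools that show up in the rados df output later suggest something along these lines (pool names taken from that output, PG counts illustrative):

# ceph osd pool create fs_data 128
# ceph osd pool create fs_metadata 64
# ceph fs new cephfs fs_metadata fs_data
# ceph fs ls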

Deploy rgw

To use the Ceph Object Gateway component, at least one rgw instance must be deployed.

ceph-deploy install --rgw ceph1 ceph2 ceph3

ceph-deploy rgw create ceph1 ceph2 ceph3
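
In nautilus the rgw instances listen on port 7480 by default; a quick sanity check against one of them (it should answer with an S3-style ListAllMyBucketsResult XML document):

# curl http://ceph1:7480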


Deployment reference:

https://www.cnblogs.com/fang888/p/9056659.html

Common cluster maintenance operations:

Check the Ceph cluster status:

ceph -s

ceph health detail

Mon management

sudo systemctl start ceph-mon@mon-host
sudo systemctl stop ceph-mon@mon-host
sudo systemctl restart ceph-mon@mon-host
sudo systemctl status ceph-mon@mon-host

OSD management

sudo systemctl start ceph-osd@*
sudo systemctl stop  ceph-osd@*
sudo systemctl restart ceph-osd@*
sudo systemctl status  ceph-osd@*

View the mapping between OSDs and hosts

ceph osd tree

View all OSD data directories and mounted disks on an OSD node

ls /var/lib/ceph/osd

mount | grep osd

View the Ceph logs

ls /var/log/ceph
ceph.audit.log      ceph-osd.26.log  ceph-osd.31.log  ceph-osd.36.log
ceph.log            ceph-osd.27.log  ceph-osd.32.log  ceph-osd.37.log
ceph-mds.ceph3.log  ceph-osd.28.log  ceph-osd.33.log  ceph-osd.38.log
ceph-mgr.ceph3.log  ceph-osd.29.log  ceph-osd.34.log  ceph-volume.log
ceph-mon.ceph3.log  ceph-osd.30.log  ceph-osd.35.log

Block Devices

Create a storage pool

[root@ceph1 ~]# ceph osd pool create rbd 64 64

Reference for choosing pg_num and pgp_num:

The default pg_num usually needs to be overridden before creating a pool; the upstream recommendations are (see the worked example after this list):

  • Fewer than 5 OSDs: set pg_num to 128.
  • 5 to 10 OSDs: set pg_num to 512.
  • 10 to 50 OSDs: set pg_num to 4096.
  • More than 50 OSDs: use pgcalc to work out the value.
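
pgcalc essentially applies the rule of thumb below; a worked example, assuming the commonly used target of about 100 PGs per OSD and rounding to a power of two:

# Total PGs ≈ (number of OSDs * target PGs per OSD) / replica count
# e.g. 15 OSDs, 3 replicas, ~100 PGs per OSD:
#   15 * 100 / 3 = 500  ->  round to 512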

Create a block device image

[root@ceph1 ~]# rbd create rbd/testrbd --size=4G

List information

[root@ceph1 ~]# rbd list
testrbd
[root@ceph1 ~]# rbd ls rbd
testrbd
[root@ceph1 ~]# rbd info rbd/testrbd
rbd image 'testrbd':
    size 4GiB in 1024 objects
    order 22 (4MiB objects)
    block_name_prefix: rbd_data.5e7a6b8b4567
    format: 2
    features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
    flags: 
    create_timestamp: Sun Jun 28 09:41:57 2020

Map the block device

[root@ceph3 ~]# rbd map rbd/testrbd

Show mapped block devices

[root@ceph3 ~]# rbd showmapped
id pool image   snap device    
0  rbd  testrbd -    /dev/rbd0

Unmap the block device

[root@ceph3 ~]# rbd unmap /dev/rbd/rbd/testrbd

Use the block device: format it, mount it, write a file

[root@ceph3 ~]# mkfs.xfs /dev/rbd0
meta-data=/dev/rbd0              isize=512    agcount=8, agsize=131072 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0, sparse=0
data     =                       bsize=4096   blocks=1048576, imaxpct=25
         =                       sunit=1024   swidth=1024 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal log           bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=8 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@ceph3 ~]# mount /dev/rbd0 /mnt/rbd/
[root@ceph3 ~]# dd if=/dev/zero of=/mnt/rbd/file bs=100M count=1 oflag=direct
1+0 records in
1+0 records out
104857600 bytes (105 MB) copied, 9.58329 s, 10.9 MB/s
[root@ceph3 ~]# rados df
POOL_NAME   USED    OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD      WR_OPS WR     
fs_data          0B       0      0      0                  0       0        0      0      0B      0     0B 
fs_metadata 2.19KiB      21      0     63                  0       0        0      0      0B     44   8KiB 
rbd          114MiB      41      0    123                  0       0        0   9129 7.95MiB    149 113MiB 

total_objects    62
total_used       42.2GiB
total_avail      348GiB
total_space      390GiB

Create a snapshot

[root@ceph3 ~]# rbd snap create --snap mysnmp rbd/testrbd
[root@ceph3 ~]# rbd snap ls rbd/testrbd
SNAPID NAME   SIZE TIMESTAMP                
     4 mysnmp 4GiB Sun Jun 28 10:40:45 2020

Snapshot rollback

Delete the file and unmount the block device, then roll back
[root@ceph3 ~]# rm -rf /mnt/rbd/file 
[root@ceph3 ~]# ls -l /mnt/rbd/
total 0
[root@ceph3 ~]# umount /mnt/rbd

[root@ceph3 ~]# rbd snap rollback rbd/testrbd@mysnmp
Rolling back to snapshot: 100% complete...done.
[root@ceph3 ~]# mount /dev/rbd0 /mnt/rbd
[root@ceph3 ~]# ll /mnt/rbd
total 102400
-rw-r--r--. 1 root root 104857600 Jun 28 10:34 file
After remounting /dev/rbd0, the deleted file is back!

Templates and clones

To use this image as a template, the snapshot serving as the template must first be protected (important!)
[root@ceph3 ~]# rbd snap protect rbd/testrbd@mysnap
Unmount the block device
[root@ceph3 ~]# umount /dev/rbd0
Clone
[root@ceph3 ~]# rbd clone rbd/testrbd@mysnap rbd/testrbd2
[root@ceph3 ~]# rbd -p rbd ls
testrbd
testrbd2

The cloned child image still depends on its parent snapshot
[root@ceph3 rbd2]# rbd info rbd/testrbd2
rbd image 'testrbd2':
    size 4GiB in 1024 objects
    order 22 (4MiB objects)
    block_name_prefix: rbd_data.13216b8b4567
    format: 2
    features: layering
    flags: 
    create_timestamp: Sun Jun 28 11:14:32 2020
    parent: rbd/testrbd@mysnap
    overlap: 4GiB

Flatten the clone so it becomes independent of its parent
[root@ceph3 rbd]# rbd flatten rbd/testrbd2
Image flatten: 100% complete...done.

Import and export RBD images

Export an RBD image

[root@ceph3 ~]# rbd export rbd/testrbd /tmp/rbd_backup
Exporting image: 100% complete...done.
[root@ceph3 tmp]# ll /tmp/rbd_backup 
-rw-r--r--. 1 root root 4294967296 Jun 28 11:48 /tmp/rbd_backup

Import an RBD image

rbd import /tmp/foo_image_export rbd/bar_image --image-format 2

[root@ceph3 tmp]# rbd ls rbd -l
NAME           SIZE PARENT FMT PROT LOCK 
bar_image      4GiB          2           
testrbd        4GiB          2           
testrbd@mysnap 4GiB          2 yes       
testrbd2       4GiB          2

Exporting and importing RBD images is commonly used as a simple way to back up and restore RBD block devices.
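
For incremental backups, rbd also provides export-diff/import-diff against snapshots; a rough sketch (the snapshot name, file path, and target image are illustrative, and the target image must already exist and match the base state):

# rbd snap create rbd/testrbd@backup1
# rbd export-diff rbd/testrbd@backup1 /tmp/testrbd_diff1
# rbd import-diff /tmp/testrbd_diff1 rbd/testrbd_copy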

Performance Testing

rados benchmark

Write test
[root@ceph2 ~]# rados bench -p rbd 10 write --no-cleanup

Read test (sequential or random)
[root@ceph2 ~]# rados bench -p rbd 10 seq
[root@ceph2 ~]# rados bench -p rbd 10 rand

rbd benchmark

[root@ceph2 ~]# rbd bench-write rbd/testrbd
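
Since the rados write test above used --no-cleanup, the benchmark objects remain in the pool; they can be removed once the read tests are done:

# rados -p rbd cleanup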

Common Troubleshooting

health HEALTH_WARN too few PGs per OSD

List the storage pools

[root@ceph1 ~]# ceph osd lspools
2 rbd,
[root@ceph1 ~]#

Checking shows the rbd pool is set to 64:

[root@ceph1 ~]# ceph osd pool get rbd pgp_num
pgp_num: 64
[root@ceph1 ~]# ceph osd pool get rbd pg_num
pg_num: 64

With pg_num at 64, a 3-replica configuration, and 8 OSDs, each OSD holds on average 64 * 3 / 8 = 24 PGs, which is below the minimum of 30 and produces the warning above.

Fix: increase the pool's pg_num and pgp_num

[root@ceph1 ~]# ceph osd pool set rbd pg_num 512
[root@ceph1 ~]# ceph osd pool set rbd pgp_num 512
[root@ceph1 ~]# ceph -s 
  cluster:
    id:     5ae631a5-b4ee-4949-944b-e6e36bf1f950
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph1,ceph3,ceph2
    mgr: ceph1(active), standbys: ceph2, ceph3
    osd: 39 osds: 39 up, 39 in

  data:
    pools:   1 pools, 400 pgs
    objects: 0 objects, 0B
    usage:   42.1GiB used, 348GiB / 390GiB avail
    pgs:     400 active+clean

Reference for choosing pg_num and pgp_num:

The default pg_num usually needs to be overridden before creating a pool; the upstream recommendations are:

  • Fewer than 5 OSDs: set pg_num to 128.
  • 5 to 10 OSDs: set pg_num to 512.
  • 10 to 50 OSDs: set pg_num to 4096.
  • More than 50 OSDs: use pgcalc to work out the value.

Appendix: Adjusting the PG count in Ceph

Cleanup: purging the Ceph environment


1. Install the ceph-deploy package
yum install ceph-deploy -y

2. Three-node environment

# Uninstall the Ceph packages
ceph-deploy purge controller1
ceph-deploy purge controller2
ceph-deploy purge controller3

# Remove configuration files and generated data
# run on controller1
ceph-deploy purgedata controller1
# run on controller2
ceph-deploy purgedata controller2
# run on controller3
ceph-deploy purgedata controller3

# Remove the authentication keys of the purged nodes from the local directory
ceph-deploy forgetkeys

# Check whether ceph-mon is still running
ps -ef | grep ceph    # or: ps -A | grep ceph

# Start ceph-mon
ceph-mon --id=1

3. Single-node environment
ceph-deploy purge controller1
ceph-deploy purgedata controller1
ceph-deploy forgetkeys
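
Note that purge/purgedata do not wipe the OSD disks or their LVM metadata; if the disks are going to be reused, they can be zapped before the packages are removed (device names are examples, and this destroys all data on them):

# ceph-volume lvm zap --destroy /dev/sdb
# ceph-volume lvm zap --destroy /dev/sdc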

Batch deployment with ceph-ansible

firewalld

firewall-cmd --zone=public --add-port=6789/tcp --permanent
firewall-cmd --zone=public --add-port=6800-7100/tcp --permanent
firewall-cmd --reload
firewall-cmd --zone=public --list-all
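
Nautilus monitors also listen on the msgr2 port 3300 by default, so that port may need to be opened as well:

firewall-cmd --zone=public --add-port=3300/tcp --permanent
firewall-cmd --reload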

Fixing mon and osd daemons going down when a Ceph node's SSH network connection drops

vim /etc/sysconfig/network-scripts/ifcfg-ib0

CONNECTED_MODE=no
TYPE=InfiniBand
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ib0
UUID=2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89
DEVICE=ib0
ONBOOT=yes
IPADDR=10.0.0.20
NETMASK=255.255.255.0
#USERS=ROOT    # this parameter was removed