Hardware Configuration
Item | Details
---|---
Server | Dell PowerEdge R630
CPU | E5-2603 v4 * 2
Memory | 128GB
OS Disk | SEAGATE 300GB SAS * 2 (RAID1)
SSD Disk | Intel DC S3710 (400GB) * 1
OSD Disk | Dell 2.4TB 2.5-inch SAS HDD * 5
NIC | 10GbE NIC * 2 (public_network + cluster_network) + public IP * 1 + IPMI management port
OS | CentOS Linux release 7.8.2003 (Core)
Kernel | 3.10.0-1127.el7.x86_64
Ceph | v14.2.10 (nautilus)
ceph-deploy | 2.0.1
System Architecture
Hostname | IP_Address | Services |
---|---|---
Ceph-node1 | 10.0.0.10,10.0.1.10 | admin,osd,mon,mgr,mds |
Ceph-node2 | 10.0.0.20,10.0.1.20 | osd,mon,mgr,mds |
Ceph-node3 | 10.0.0.30,10.0.1.30 | osd,mon,mgr,mds |
Ceph-client | 10.0.1.40 | / |
Deployment Workflow
System Pre-checks (all nodes)
Configure yum repositories
Add the Aliyun mirror repos for CentOS 7
# curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
# curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo
Configure the Ceph repository
# vim /etc/yum.repos.d/ceph.repo
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
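After adding the repositories, refresh the yum metadata cache so the new Ceph repo is picked up:
# Refresh the yum metadata cache after adding the new repos
yum clean all && yum makecache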
Install required packages
Install basic system utilities
[root@Ceph1 yum.repos.d]# yum install vim net-tools lrzsz htop sysstat iotop iftop -y
Install ceph-deploy
# yum list ceph-deploy
Loaded plugins: fastestmirror, priorities
Determining fastest mirrors
8 packages excluded due to repository priority protections
Installed Packages
ceph-deploy.noarch 2.0.1-0 @Ceph-noarc
# yum install ceph-deploy
# Kernel pid max
echo 4194303 > /proc/sys/kernel/pid_max
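The echo above only lasts until the next reboot; a sysctl drop-in makes it persistent (the file name below is arbitrary):
# Persist the pid_max setting across reboots
echo "kernel.pid_max = 4194303" > /etc/sysctl.d/99-ceph.conf
sysctl -p /etc/sysctl.d/99-ceph.conf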
Edit /etc/hosts
# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.10 ceph-node1
10.0.0.20 ceph-node2
10.0.0.30 ceph-node3
Firewall
systemctl stop firewalld.service
systemctl disable firewalld.service
SELinux
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
setenforce 0
NTP time synchronization
# Configure periodic time synchronization via cron
# crontab -e
*/1 * * * * /usr/sbin/ntpdate 1.1.1.1 >/dev/null 2>&1
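To confirm the clocks agree across nodes, the epoch time reported by each node can be compared (a small sketch using the host names from /etc/hosts above):
# The values printed should differ by at most a second or two
for h in ceph-node1 ceph-node2 ceph-node3; do echo -n "$h: "; ssh $h date +%s; done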
Configure passwordless SSH login
# ssh-keygen
# ssh-copy-id root@Ceph-node1
# ssh-copy-id root@Ceph-node2
# ssh-copy-id root@Ceph-node3
# vim ~/.ssh/config
Host Ceph-node1
Hostname Ceph-node1
User root
Host Ceph-node2
Hostname Ceph-node2
User root
Host Ceph-node3
Hostname Ceph-node3
User root
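With the keys copied and ~/.ssh/config in place, a quick loop confirms passwordless login works from the admin node:
# Each node should print its hostname without asking for a password
for h in Ceph-node1 Ceph-node2 Ceph-node3; do ssh $h hostname; done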
Optionally, Ceph and the ceph-deploy admin tool can be pre-installed in advance (on all nodes):
yum install ceph -y
# This may install a 1.x version of ceph-deploy
yum install ceph-deploy -y
# Alternatively, install the 2.x version directly
yum install -y http://mirrors.aliyun.com/ceph/rpm-mimic/el7/noarch/ceph-deploy-2.0.1-0.noarch.rpm
Disk Partitioning
The partition layout below is for a test environment only; always adapt it to the actual production environment.
Notes on allocating block-db and block-wal when deploying BlueStore:
Since the Luminous release, OSDs are managed with ceph-volume, and the official recommendation is to manage disks with LVM. Setting up LVM makes batch deployment and later management much easier, as the commands below will show.
So in a real production environment, managing partitions and volume groups with LVM is strongly recommended.
(That said, BlueStore can also be pointed directly at raw disk partitions, if LVM feels like overkill or you are only testing.)
Note:
The data disk /dev/sdc is partitioned here only to simulate multiple OSDs; in production, OSD data disks do not need to be partitioned (a partitioning sketch for this test layout follows below).
Where both an ansible command and a plain shell command are shown below, pick only one of the two approaches.
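For reference, a sketch of how the 13 test partitions on /dev/sdc could be created. The GPT label and ~10GiB sizes are assumptions chosen to match the lsblk output shown later; /dev/sdb would be carved up the same way into 26 1GiB partitions for wal/db.
#!/bin/bash
# Hypothetical helper: carve /dev/sdc into 13 GPT partitions of ~10GiB each for the test OSD data LVs
parted -s /dev/sdc mklabel gpt
start=1
for i in {1..13}
do
  end=$((start + 10240))    # 10GiB per partition, in MiB
  parted -s /dev/sdc mkpart data$i ${start}MiB ${end}MiB
  start=$end
done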
OSD LVM
# VGS
#ansible:
[root@ceph-node1 ~]# ansible ceph -m shell -a 'vgcreate datavg1 /dev/sdb'
......
#shell
[root@ceph1 ~]# vim vgs.sh
#!/bin/bash
for i in {1..13}
do
vgcreate datavg$i /dev/sdc$i 2>&1 >/dev/null
done
# LVS
# ansible:
[root@ceph-node1 ~]# ansible ceph -m shell -a 'lvcreate -n datalv1 -l 100%Free datavg1'
......
#shell
[root@ceph1 ~]# vim lvs.sh
#!/bin/bash
for i in {1..13}
do
lvcreate -n datalv$i -l 100%Free datavg$i 2>&1 >/dev/null
done
wal/db LVM
VGS - wal
# ansible ceph -m shell -a 'vgcreate block_wal_vg1 /dev/sde1'
......
# shell
[root@ceph1 ~]# vim vgs_wal.sh
#!/bin/bash
for i in {1..13}
do
vgcreate block_wal_vg$i /dev/sdb$i 2>&1 >/dev/null
done
VGS - db
# ansible ceph -m shell -a 'vgcreate block_db_vg1 /dev/sde5'
......
# shell
[root@ceph1 ~]# vim vgs_db.sh
#!/bin/bash
j=1
for i in {14..26}
do
vgcreate block_db_vg$j /dev/sdb$i 2>&1 >/dev/null
j=$[j+1]
done
LVS - wal
# ansible ceph -m shell -a 'lvcreate -n wallv1 -l 100%Free block_wal_vg1'
......
# shell
[root@ceph1 ~]# vim lvs_wal.sh
#!/bin/bash
for i in {1..13}
do
lvcreate -n wallv$i -l 100%Free block_wal_vg$i 2>&1 >/dev/null
done
LVS - db
# ansible ceph -m shell -a 'lvcreate -n dblv1 -l 100%Free block_db_vg1'
......
# shell
[root@ceph1 ~]# vim lvs_db.sh
#!/bin/bash
for i in {1..13}
do
lvcreate -n dblv$i -l 100%Free block_db_vg$i 2>&1 >/dev/null
done
[root@ceph3 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 30G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 29G 0 part
├─centos-root 253:0 0 27G 0 lvm /
└─centos-swap 253:1 0 2G 0 lvm [SWAP]
sdb 8:16 0 50G 0 disk
├─sdb1 8:17 0 1G 0 part
│ └─block_wal_vg1-wallv1 253:15 0 1020M 0 lvm
├─sdb2 8:18 0 1G 0 part
│ └─block_wal_vg2-wallv2 253:16 0 1020M 0 lvm
├─sdb3 8:19 0 1G 0 part
│ └─block_wal_vg3-wallv3 253:17 0 1020M 0 lvm
├─sdb4 8:20 0 1G 0 part
│ └─block_wal_vg4-wallv4 253:18 0 1020M 0 lvm
├─sdb5 8:21 0 1G 0 part
│ └─block_wal_vg5-wallv5 253:19 0 1020M 0 lvm
├─sdb6 8:22 0 1G 0 part
│ └─block_wal_vg6-wallv6 253:20 0 1020M 0 lvm
├─sdb7 8:23 0 1G 0 part
│ └─block_wal_vg7-wallv7 253:21 0 1020M 0 lvm
├─sdb8 8:24 0 1G 0 part
│ └─block_wal_vg8-wallv8 253:22 0 1020M 0 lvm
├─sdb9 8:25 0 1G 0 part
│ └─block_wal_vg9-wallv9 253:23 0 1020M 0 lvm
├─sdb10 8:26 0 1G 0 part
│ └─block_wal_vg10-wallv10 253:24 0 1020M 0 lvm
├─sdb11 8:27 0 1G 0 part
│ └─block_wal_vg11-wallv11 253:25 0 1020M 0 lvm
├─sdb12 8:28 0 1G 0 part
│ └─block_wal_vg12-wallv12 253:26 0 1020M 0 lvm
├─sdb13 8:29 0 1G 0 part
│ └─block_wal_vg13-wallv13 253:27 0 1020M 0 lvm
├─sdb14 8:30 0 1G 0 part
│ └─block_db_vg1-dblv1 253:28 0 1020M 0 lvm
├─sdb15 8:31 0 1G 0 part
│ └─block_db_vg2-dblv2 253:29 0 1020M 0 lvm
├─sdb16 259:0 0 1G 0 part
│ └─block_db_vg3-dblv3 253:30 0 1020M 0 lvm
├─sdb17 259:1 0 1G 0 part
│ └─block_db_vg4-dblv4 253:31 0 1020M 0 lvm
├─sdb18 259:2 0 1G 0 part
│ └─block_db_vg5-dblv5 253:32 0 1020M 0 lvm
├─sdb19 259:3 0 1G 0 part
│ └─block_db_vg6-dblv6 253:33 0 1020M 0 lvm
├─sdb20 259:4 0 1G 0 part
│ └─block_db_vg7-dblv7 253:34 0 1020M 0 lvm
├─sdb21 259:5 0 1G 0 part
│ └─block_db_vg8-dblv8 253:35 0 1020M 0 lvm
├─sdb22 259:6 0 1G 0 part
│ └─block_db_vg9-dblv9 253:36 0 1020M 0 lvm
├─sdb23 259:7 0 1G 0 part
│ └─block_db_vg10-dblv10 253:37 0 1020M 0 lvm
├─sdb24 259:8 0 1G 0 part
│ └─block_db_vg11-dblv11 253:38 0 1020M 0 lvm
├─sdb25 259:9 0 1G 0 part
│ └─block_db_vg12-dblv12 253:39 0 1020M 0 lvm
└─sdb26 259:10 0 1G 0 part
└─block_db_vg13-dblv13 253:40 0 1020M 0 lvm
sdc 8:32 0 150G 0 disk
├─sdc1 8:33 0 10G 0 part
│ └─datavg1-datalv1 253:2 0 10G 0 lvm
├─sdc2 8:34 0 10G 0 part
│ └─datavg2-datalv2 253:3 0 10G 0 lvm
├─sdc3 8:35 0 10G 0 part
│ └─datavg3-datalv3 253:4 0 10G 0 lvm
├─sdc4 8:36 0 10G 0 part
│ └─datavg4-datalv4 253:5 0 10G 0 lvm
├─sdc5 8:37 0 10G 0 part
│ └─datavg5-datalv5 253:6 0 10G 0 lvm
├─sdc6 8:38 0 10G 0 part
│ └─datavg6-datalv6 253:7 0 10G 0 lvm
├─sdc7 8:39 0 10G 0 part
│ └─datavg7-datalv7 253:8 0 10G 0 lvm
├─sdc8 8:40 0 10G 0 part
│ └─datavg8-datalv8 253:9 0 10G 0 lvm
├─sdc9 8:41 0 10G 0 part
│ └─datavg9-datalv9 253:10 0 10G 0 lvm
├─sdc10 8:42 0 10G 0 part
│ └─datavg10-datalv10 253:11 0 10G 0 lvm
├─sdc11 8:43 0 10G 0 part
│ └─datavg11-datalv11 253:12 0 10G 0 lvm
├─sdc12 8:44 0 10G 0 part
│ └─datavg12-datalv12 253:13 0 10G 0 lvm
└─sdc13 8:45 0 10G 0 part
└─datavg13-datalv13 253:14 0 10G 0 lvm
sr0 11:0 1 906M 0 rom
At this point, all the LVM volumes are ready.
Create the Cluster
Deploy MON
# Create a cluster directory to hold the ceph-deploy files
[root@ceph1 ~]# mkdir /cluster;cd /cluster
# Initialize the cluster
[root@ceph1 cluster]# ceph-deploy new ceph1 ceph2 ceph3 --public-network=10.0.0.0/24 --cluster-network=10.0.1.0/24
# Deploy the monitors
[root@ceph1 cluster]# ceph-deploy mon create-initial
# Copy the config file and admin keyring to all monitor nodes
[root@ceph1 cluster]# ceph-deploy admin ceph1 ceph2 ceph3
# Once this completes, check the cluster status with ceph -s:
[root@ceph1 cluster]# ceph -s
cluster:
id: 5ae631a5-b4ee-4949-944b-e6e36bf1f950
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2
mgr: no daemons active
osd: 0 osds: 0 up, 0 in
data:
pools: 0 pools, 0 pgs
objects: 0 objects, 0B
usage: 0B used, 0B / 0B avail
pgs:
Output like the above indicates the cluster has been set up successfully.
Deploy OSDs
ceph1:
ceph-deploy osd create ceph1 --bluestore --block-wal block_wal_vg1/wallv1 --block-db block_db_vg1/dblv1 --data datavg1/datalv1
ceph-deploy osd create ceph1 --bluestore --block-wal block_wal_vg2/wallv2 --block-db block_db_vg2/dblv2 --data datavg2/datalv2
ceph-deploy osd create ceph1 --bluestore --block-wal block_wal_vg3/wallv3 --block-db block_db_vg3/dblv3 --data datavg3/datalv3
.......
Continue in the same pattern until all 13 OSDs on this node are created. Make sure the VG/LV names correspond; running the commands one at a time is recommended so errors are caught immediately.
ceph2:
ceph-deploy osd create ceph2 --bluestore --block-wal block_wal_vg1/wallv1 --block-db block_db_vg1/dblv1 --data datavg1/datalv1
ceph-deploy osd create ceph2 --bluestore --block-wal block_wal_vg2/wallv2 --block-db block_db_vg2/dblv2 --data datavg2/datalv2
ceph-deploy osd create ceph2 --bluestore --block-wal block_wal_vg3/wallv3 --block-db block_db_vg3/dblv3 --data datavg3/datalv3
.......
Continue in the same pattern until all 13 OSDs on this node are created. Make sure the VG/LV names correspond; running the commands one at a time is recommended so errors are caught immediately.
ceph3:
ceph-deploy osd create ceph3 --bluestore --block-wal block_wal_vg1/wallv1 --block-db block_db_vg1/dblv1 --data datavg1/datalv1
ceph-deploy osd create ceph3 --bluestore --block-wal block_wal_vg2/wallv2 --block-db block_db_vg2/dblv2 --data datavg2/datalv2
ceph-deploy osd create ceph3 --bluestore --block-wal block_wal_vg3/wallv3 --block-db block_db_vg3/dblv3 --data datavg3/datalv3
.......
Continue in the same pattern until all 13 OSDs on this node are created. Make sure the VG/LV names correspond; running the commands one at a time is recommended so errors are caught immediately.
Batch-deploying the OSDs on one node:
#!/bin/bash
node="ceph-node1"
for i in {1..13}
do
ceph-deploy osd create $node --bluestore --block-wal block_wal_vg$i/wallv$i --block-db block_db_vg$i/dblv$i --data datavg$i/datalv$i
done
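Since all three hosts use identical VG/LV names here, the loop can also cover every node in one pass; a variant of the script above, using the host names from the per-node commands:
#!/bin/bash
# Iterate over all OSD hosts as well as the 13 LV sets
for node in ceph1 ceph2 ceph3
do
  for i in {1..13}
  do
    ceph-deploy osd create $node --bluestore \
      --block-wal block_wal_vg$i/wallv$i \
      --block-db block_db_vg$i/dblv$i \
      --data datavg$i/datalv$i
  done
done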
Deploy MGR (enable the Dashboard)
The Ceph Manager daemons run in an active/standby configuration; deploying additional manager daemons ensures that if one daemon or host fails, another can take over without interrupting service. The manager also provides the Dashboard module for real-time monitoring, which is recommended to enable.
ceph-deploy mgr create ceph1 ceph2 ceph3
ceph mgr module enable dashboard
ceph dashboard create-self-signed-cert
ceph dashboard set-login-credentials ceph ceph
[root@ceph-node1 ~]# ceph mgr services
{
"dashboard": "https://ceph-node1:8443/",
"prometheus": "http://ceph-node1:9283/"
}
[root@ceph-node1 ~]# ss -tulnp | grep mgr
tcp LISTEN 0 128 10.0.0.10:6800 *:* users:(("ceph-mgr",pid=997113,fd=26))
tcp LISTEN 0 128 10.0.0.10:6801 *:* users:(("ceph-mgr",pid=997113,fd=29))
tcp LISTEN 0 5 [::]:9283 [::]:* users:(("ceph-mgr",pid=997113,fd=35))
tcp LISTEN 0 5 [::]:8443 [::]:* users:(("ceph-mgr",pid=997113,fd=36))
[root@ceph-node1 ~]#
Enable the Prometheus module and integrate it with Grafana
ceph mgr module enable prometheus
Install and deploy Prometheus and Grafana
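Installing Prometheus and Grafana themselves is beyond the scope of these notes. As a minimal sketch, assuming Prometheus is already installed with its configuration at /etc/prometheus/prometheus.yml and scrape_configs as the last section of that file, the mgr exporter on port 9283 (shown by ceph mgr services above) can be added as a scrape target:
# Append a scrape job for the ceph-mgr Prometheus exporter (port 9283 on each mgr host)
cat >> /etc/prometheus/prometheus.yml <<'EOF'
  - job_name: 'ceph'
    static_configs:
      - targets: ['ceph-node1:9283', 'ceph-node2:9283', 'ceph-node3:9283']
EOF
systemctl restart prometheus
Grafana's Ceph dashboards can then be pointed at this Prometheus data source.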
Deploy MDS
To use CephFS, at least one metadata server is required.
ceph-deploy mds create ceph1 ceph2 ceph3
Deploy RGW
To use the Ceph Object Gateway, at least one RGW instance must be deployed.
ceph-deploy install --rgw ceph1 ceph2 ceph3
ceph-deploy rgw create ceph1 ceph2 ceph3
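By default the gateway listens on port 7480 on each host it was created on; a quick anonymous request should return a short XML bucket listing:
# Verify that each RGW instance responds
curl http://ceph1:7480
curl http://ceph2:7480
curl http://ceph3:7480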
Deployment reference:
https://www.cnblogs.com/fang888/p/9056659.html
Common cluster maintenance operations:
Check the cluster status:
ceph -s
ceph health detail
MON management
sudo systemctl start ceph-mon@mon-host
sudo systemctl stop ceph-mon@mon-host
sudo systemctl restart ceph-mon@mon-host
sudo systemctl status ceph-mon@mon-host
OSD management
sudo systemctl start ceph-osd@*
sudo systemctl stop ceph-osd@*
sudo systemctl restart ceph-osd@*
sudo systemctl status ceph-osd@*
Show the OSD-to-host mapping
ceph osd tree
On an OSD node, list the OSD data directories and mounted disks
ls /var/lib/ceph/osd
mount | grep osd
View the Ceph logs
ls /var/log/ceph
ceph.audit.log ceph-osd.26.log ceph-osd.31.log ceph-osd.36.log
ceph.log ceph-osd.27.log ceph-osd.32.log ceph-osd.37.log
ceph-mds.ceph3.log ceph-osd.28.log ceph-osd.33.log ceph-osd.38.log
ceph-mgr.ceph3.log ceph-osd.29.log ceph-osd.34.log ceph-volume.log
ceph-mon.ceph3.log ceph-osd.30.log ceph-osd.35.log
Block Devices
Create a storage pool
[root@ceph1 ~]# ceph osd pool create rbd 64 64
Reference values for pg_num and pgp_num:
The default pg_num usually needs to be overridden before creating a pool. The official recommendation is:
- Fewer than 5 OSDs: set pg_num to 128.
- 5 to 10 OSDs: set pg_num to 512.
- 10 to 50 OSDs: set pg_num to 4096.
- More than 50 OSDs: work it out with pgcalc (a rough calculation sketch follows below).
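For larger clusters, the pgcalc reasoning boils down to targeting roughly 100 PGs per OSD, dividing by the replica count, and rounding up to a power of two. A sketch of that arithmetic in shell (the OSD and replica counts are example values):
#!/bin/bash
# Rough pgcalc-style estimate: (OSDs * ~100 target PGs per OSD) / replica count,
# rounded up to the next power of two
osds=39
replicas=3
target=100
pgs=$(( osds * target / replicas ))
p=1; while [ $p -lt $pgs ]; do p=$(( p * 2 )); done
echo "suggested pg_num: $p"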
Create a block device image
[root@ceph1 ~]# rbd create rbd/testrbd --size=4G
List image information
[root@ceph1 ~]# rbd list
testrbd
[root@ceph1 ~]# rbd ls rbd
testrbd
[root@ceph1 ~]# rbd info rbd/testrbd
rbd image 'testrbd':
size 4GiB in 1024 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.5e7a6b8b4567
format: 2
features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
flags:
create_timestamp: Sun Jun 28 09:41:57 2020
Map the block device
[root@ceph3 ~]# rbd map rbd/testrbd
List mapped block devices
[root@ceph3 ~]# rbd showmapped
id pool image snap device
0 rbd testrbd - /dev/rbd0
Unmap the block device
[root@ceph3 ~]# rbd unmap /dev/rbd/rbd/testrbd
Use the block device: format it, mount it, and write a file
[root@ceph3 ~]# mkfs.xfs /dev/rbd0
meta-data=/dev/rbd0 isize=512 agcount=8, agsize=131072 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=1048576, imaxpct=25
= sunit=1024 swidth=1024 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=2560, version=2
= sectsz=512 sunit=8 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@ceph3 ~]# mount /dev/rbd0 /mnt/rbd/
[root@ceph3 ~]# dd if=/dev/zero of=/mnt/rbd/file bs=100M count=1 oflag=direct
1+0 records in
1+0 records out
104857600 bytes (105 MB) copied, 9.58329 s, 10.9 MB/s
[root@ceph3 ~]# rados df
POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR
fs_data 0B 0 0 0 0 0 0 0 0B 0 0B
fs_metadata 2.19KiB 21 0 63 0 0 0 0 0B 44 8KiB
rbd 114MiB 41 0 123 0 0 0 9129 7.95MiB 149 113MiB
total_objects 62
total_used 42.2GiB
total_avail 348GiB
total_space 390GiB
Create a snapshot
[root@ceph3 ~]# rbd snap create --snap mysnmp rbd/testrbd
[root@ceph3 ~]# rbd snap ls rbd/testrbd
SNAPID NAME SIZE TIMESTAMP
4 mysnmp 4GiB Sun Jun 28 10:40:45 2020
Snapshot rollback
Delete the file and unmount the block device, then roll back to the snapshot.
[root@ceph3 ~]# rm -rf /mnt/rbd/file
[root@ceph3 ~]# ls -l /mnt/rbd/
total 0
[root@ceph3 ~]# umount /mnt/rbd
[root@ceph3 ~]# rbd snap rollback rbd/testrbd@mysnmp
Rolling back to snapshot: 100% complete...done.
[root@ceph3 ~]# mount /dev/rbd0 /mnt/rbd
[root@ceph3 ~]# ll /mnt/rbd
total 102400
-rw-r--r--. 1 root root 104857600 Jun 28 10:34 file
After remounting /dev/rbd0, the deleted file is back.
Templates and clones
To use this image as a template, the snapshot serving as the template must first be protected (important!).
[root@ceph3 ~]# rbd snap protect rbd/testrbd@mysnap
Unmount the block device
[root@ceph3 ~]# umount /dev/rbd0
Clone
[root@ceph3 ~]# rbd clone rbd/testrbd@mysnap rbd/testrbd2
[root@ceph3 ~]# rbd -p rbd ls
testrbd
testrbd2
The cloned child image still depends on its parent snapshot:
[root@ceph3 rbd2]# rbd info rbd/testrbd2
rbd image 'testrbd2':
size 4GiB in 1024 objects
order 22 (4MiB objects)
block_name_prefix: rbd_data.13216b8b4567
format: 2
features: layering
flags:
create_timestamp: Sun Jun 28 11:14:32 2020
parent: rbd/testrbd@mysnap
overlap: 4GiB
Flatten the clone to detach it from its parent
[root@ceph3 rbd]# rbd flatten rbd/testrbd2
Image flatten: 100% complete...done.
Importing and exporting RBD images
Export an RBD image
[root@ceph3 ~]# rbd export rbd/testrbd /tmp/rbd_backup
Exporting image: 100% complete...done.
[root@ceph3 tmp]# ll /tmp/rbd_backup
-rw-r--r--. 1 root root 4294967296 Jun 28 11:48 /tmp/rbd_backup
Import an RBD image
rbd import /tmp/rbd_backup rbd/bar_image --image-format 2
[root@ceph3 tmp]# rbd ls rbd -l
NAME SIZE PARENT FMT PROT LOCK
bar_image 4GiB 2
testrbd 4GiB 2
testrbd@mysnap 4GiB 2 yes
testrbd2 4GiB 2
Exporting and importing RBD images is a common, simple way to back up and restore RBD block devices.
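For recurring backups, repeated full exports get expensive. rbd also supports incremental diffs between snapshots; a sketch, assuming the destination image already exists and carries the starting snapshot (the names below are hypothetical):
# Incremental backup between two snapshots using export-diff / import-diff
rbd snap create rbd/testrbd@snap1
# ... after more data is written ...
rbd snap create rbd/testrbd@snap2
rbd export-diff --from-snap snap1 rbd/testrbd@snap2 /tmp/testrbd_snap1_to_snap2.diff
# Replay the diff onto a destination image that already has the snap1 snapshot
# (rbd/testrbd_copy is a hypothetical name for such a copy)
rbd import-diff /tmp/testrbd_snap1_to_snap2.diff rbd/testrbd_copy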
Performance Testing
RADOS benchmarks
Write test
[root@ceph2 ~]# rados bench -p rbd 10 write --no-cleanup
Read test (sequential or random)
[root@ceph2 ~]# rados bench -p rbd 10 seq
[root@ceph2 ~]# rados bench -p rbd 10 rand
RBD benchmark
[root@ceph2 ~]# rbd bench-write rbd/testrbd
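On recent releases, rbd bench-write has largely been superseded by the more general rbd bench subcommand, which also supports read and random patterns; an equivalent invocation with explicit parameters (the sizes are example values):
# Equivalent write benchmark with explicit parameters; --io-type also accepts read
rbd bench --io-type write --io-size 4K --io-threads 16 --io-total 1G rbd/testrbd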
Common Troubleshooting
health HEALTH_WARN too few PGs per OSD
List the pools
[root@ceph1 ~]# ceph osd lspools
2 rbd,
[root@ceph1 ~]#
Checking shows the rbd pool's pg_num and pgp_num are 64:
[root@ceph1 ~]# ceph osd pool get rbd pgp_num
pgp_num: 64
[root@ceph1 ~]# ceph osd pool get rbd pg_num
pg_num: 64
With pg_num at 64, a 3-replica configuration, and 8 OSDs, each OSD holds on average 64 * 3 / 8 = 24 PGs, which is below the configured minimum of 30 and triggers the warning above.
Solution: increase the pool's pg_num and pgp_num.
[root@ceph1 ~]# ceph osd pool set rbd pg_num 512
[root@ceph1 ~]# ceph osd pool set rbd pgp_num 512
[root@ceph1 ~]# ceph -s
cluster:
id: 5ae631a5-b4ee-4949-944b-e6e36bf1f950
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph1,ceph3,ceph2
mgr: ceph1(active), standbys: ceph2, ceph3
osd: 39 osds: 39 up, 39 in
data:
pools: 1 pools, 400 pgs
objects: 0 objects, 0B
usage: 42.1GiB used, 348GiB / 390GiB avail
pgs: 400 active+clean
Cleanup: Purging the Ceph Environment
1. Install the ceph-deploy package
yum install ceph-deploy -y
2. Software environment: three-node cluster
# Uninstall the Ceph packages
ceph-deploy purge controller1
ceph-deploy purge controller2
ceph-deploy purge controller3
# Remove configuration files and generated data
# run on controller1
ceph-deploy purgedata controller1
# run on controller2
ceph-deploy purgedata controller2
# run on controller3
ceph-deploy purgedata controller3
# Remove the cluster authentication keys from the local directory
ceph-deploy forgetkeys

# Check whether ceph-mon is still running
ps -ef | grep ceph    # or: ps -A | grep ceph

# Start ceph-mon
ceph-mon --id=1
3. Software environment: single node
ceph-deploy purge controller1
ceph-deploy purgedata controller1
ceph-deploy forgetkeys
Batch deployment with ceph-ansible
firewalld
firewall-cmd --zone=public --add-port=6789/tcp --permanent
firewall-cmd --zone=public --add-port=6800-7100/tcp --permanent
firewall-cmd --reload
firewall-cmd --zone=public --list-all
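If the dashboard and Prometheus module from the MGR section are in use, their ports (8443 and 9283, as shown by ceph mgr services earlier) will likely need to be opened as well:
firewall-cmd --zone=public --add-port=8443/tcp --permanent
firewall-cmd --zone=public --add-port=9283/tcp --permanent
firewall-cmd --reload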
Fixing the issue where a Ceph node's mon and osd daemons go down after the SSH network connection is dropped
vim /etc/sysconfig/network-scripts/ifcfg-ib0
CONNECTED_MODE=no
TYPE=InfiniBand
PROXY_METHOD=none
BROWSER_ONLY=no
BOOTPROTO=static
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ib0
UUID=2ab4abde-b8a5-6cbc-19b1-2bfb193e4e89
DEVICE=ib0
ONBOOT=yes
IPADDR=10.0.0.20
NETMASK=255.255.255.0
#USERS=ROOT    # this parameter was removed