Hardware platform:
Model | Dell PowerEdge R730 |
---|---|
CPU | Intel(R) Xeon(R) CPU E5-2683 v4 @ 2.10GHz |
Memory | 64GB |
Disk | 6 × 2.5-inch (1 × 512G SSD (journal) + 5 × 1T SAS (OSD)) |
Network | 1Gb Ethernet NICs + 40Gb Ethernet NICs |
Note: with a real production environment in mind, the 512GB SSD serves as the journal disk; block-db and block-wal can be sized at 40G + 60G per OSD (the exact split is up to your own planning), but every OSD data disk must get its share of the journal disk's performance.
The official suggestion is to size block.db at about 4% of the primary device and block.wal at around 6%, roughly 10% combined, and to keep the SSD:OSD physical device ratio at 1:4.
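A quick sanity check on those ratios (my own arithmetic, not from the original): 4% of a 1T data disk is roughly 40G for block.db and 6% is roughly 60G for block.wal, so each OSD consumes about 100G of SSD; four to five data disks can therefore share one 512G journal SSD, which lines up with the 40G + 60G split and the 1:4 guidance above.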
Software versions:
[root@Ceph1 ~]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)
[root@ceph-node2 ~]# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)
[root@ceph-node2 ~]# uname -r
3.10.0-862.el7.x86_64
[root@ceph1 cluster]# ceph -v
ceph version 14.2.10 (b340acf629a010a74d90da5782a2c5fe0b54ac20) nautilus (stable)
[root@ceph1 cluster]# ceph-deploy --version
2.0.1
Test server layout:
Hostname | IP_Address | Services | OS | Disk |
---|---|---|---|---|
ceph-node1 | 10.0.0.10/24 | admin, osd, mon, mgr, mds, rgw | CentOS 7.5 minimal | 1 × 512G SSD + 5 × 1T SAS |
ceph-node2 | 10.0.0.20/24 | osd, mon, mds | CentOS 7.5 minimal | 1 × 512G SSD + 5 × 1T SAS |
ceph-node3 | 183.60.201.181/25, 10.0.0.30/24 | osd, mon, mds | CentOS 7.5 minimal | 1 × 512G SSD + 5 × 1T SAS |
Environment notes:
The 512G SSD is used as the BlueStore journal disk, as it would be in production; the remaining 1T SAS disks host the OSDs.
The two NICs carry the public network and the internal cluster network respectively.
Ceph installation procedure
(all nodes)
Disable SELinux
[root@ceph-node1 ~]# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
[root@ceph-node1 ~]# setenforce 0
[root@ceph-node1 ~]# getenforce
Permissive
Disable the firewall
[root@ceph-node1 ~]# systemctl stop firewalld
[root@ceph-node1 ~]# systemctl disable firewalld
Removed symlink /etc/systemd/system/multi-user.target.wants/firewalld.service.
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
[root@ceph-node1 ~]# systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
Active: inactive (dead)
Docs: man:firewalld(1)
Jun 30 11:22:26 localhost.localdomain systemd[1]: Starting firewalld - dynamic firewa....
Jun 30 11:22:26 localhost.localdomain systemd[1]: Started firewalld - dynamic firewal....
Jun 30 13:57:25 ceph-node1 systemd[1]: Stopping firewalld - dynamic firewall daemon...
Jun 30 13:57:26 ceph-node1 systemd[1]: Stopped firewalld - dynamic firewall daemon.
Hint: Some lines were ellipsized, use -l to show in full.
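If policy does not allow turning firewalld off entirely, the alternative is to leave it running and open the Ceph services instead (a sketch relying on the stock firewalld service definitions shipped with CentOS 7):
# on every node, instead of stopping firewalld
firewall-cmd --permanent --add-service=ceph-mon
firewall-cmd --permanent --add-service=ceph
firewall-cmd --reload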
Configure the yum repos (all nodes)
Back up the default yum repo files
[root@ceph-node1 yum.repos.d]# pwd
/etc/yum.repos.d
[root@ceph-node1 yum.repos.d]# ll b
total 32
-rw-r--r--. 1 root root 1664 Jun 30 14:04 CentOS-Base.repo
-rw-r--r--. 1 root root 1309 Apr 29 2018 CentOS-CR.repo
-rw-r--r--. 1 root root 649 Apr 29 2018 CentOS-Debuginfo.repo
-rw-r--r--. 1 root root 314 Apr 29 2018 CentOS-fasttrack.repo
-rw-r--r--. 1 root root 630 Apr 29 2018 CentOS-Media.repo
-rw-r--r--. 1 root root 1331 Apr 29 2018 CentOS-Sources.repo
-rw-r--r--. 1 root root 4768 Apr 29 2018 CentOS-Vault.repo
Fetch the Aliyun base repo
curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
Configure the EPEL repo (e.g. as /etc/yum.repos.d/epel.repo)
[epel]
name=Extra Packages for Enterprise Linux 7 - $basearch
baseurl=http://mirrors.aliyun.com/epel/7/$basearch
failovermethod=priority
enabled=1
gpgcheck=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
[epel-debuginfo]
name=Extra Packages for Enterprise Linux 7 - $basearch - Debug
baseurl=http://mirrors.aliyun.com/epel/7/$basearch/debug
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=0
[epel-source]
name=Extra Packages for Enterprise Linux 7 - $basearch - Source
baseurl=http://mirrors.aliyun.com/epel/7/SRPMS
failovermethod=priority
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-EPEL-7
gpgcheck=0
Configure the Ceph repo (e.g. as /etc/yum.repos.d/ceph.repo)
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/$basearch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/noarch
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-nautilus/el7/SRPMS
enabled=1
gpgcheck=0
type=rpm-md
gpgkey=https://mirrors.aliyun.com/ceph/keys/release.asc
priority=1
yum clean all
yum makecache
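Optionally verify that the new repos resolve before installing anything (my own check, not in the original):
yum repolist enabled | grep -Ei 'ceph|epel'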
[root@ceph-node1 yum.repos.d]# yum install vim net-tools wget ntpdate htop sysstat iotop iftop lrzsz -y
Update the kernel if needed (optional)
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
yum clean all && yum makecache && yum update -y
NTP time synchronization (all nodes)
# Also sync the hardware clock
[root@ceph-node1 ~]# vim /etc/sysconfig/ntpdate
Set SYNC_HWCLOCK=yes
# Manually sync against an NTP server
[root@ceph-node1 ~]# ntpdate -uq 1.1.1.1
30 Jun 14:23:29 ntpdate[3900]: adjust time server 1.1.1.1 offset 0.000984 sec
# Run ntpdate from cron to keep the clock in sync
# crontab -e
*/1 * * * * /usr/sbin/ntpdate -uq 1.1.1.1 >/dev/null 2>&1
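chronyd is an equally workable choice on CentOS 7 if a daemon is preferred over a cron job (an alternative sketch, not what this deployment uses):
yum install chrony -y
systemctl enable chronyd && systemctl start chronyd
chronyc sources -v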
Ceph pre-deployment checks (all nodes)
Edit /etc/hosts
[root@ceph-node1 ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
10.0.0.10 ceph-node1
10.0.0.20 ceph-node2
10.0.0.30 ceph-node3
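The same host entries are needed on every node; one way to push them out from the admin node (my addition, it will prompt for passwords until the SSH keys below are in place):
for node in ceph-node2 ceph-node3; do scp /etc/hosts root@${node}:/etc/hosts; done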
Set up passwordless SSH (on the ceph-deploy admin node)
[root@ceph-node1 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:hMeHlyo7VZ6SgrYsHCcsd4KJuRCZNG5pyodRRPa1XD4 root@ceph-node1
The key's randomart image is:
+---[RSA 2048]----+
| oo= . . |
|oo= . oo+. . |
|+* ..o=E= |
|**o . o B.. |
|Oo*.= o S o |
|.=.O . = . |
|. o o o |
| . . |
| |
+----[SHA256]-----+
ssh-copy-id root@ceph-node1
ssh-copy-id root@ceph-node2
ssh-copy-id root@ceph-node3
# vim ~/.ssh/config
Host ceph-node1
Hostname ceph-node1
User root
Host ceph-node2
Hostname ceph-node2
User root
Host ceph-node3
Hostname ceph-node3
User root
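A quick loop to confirm passwordless SSH reaches every node (my own check):
for node in ceph-node1 ceph-node2 ceph-node3; do ssh ${node} hostname; done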
Disk partitioning
On ceph-node1:
[root@ceph-node1 ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 931.5G 0 disk
├─sda1 8:1 0 1G 0 part /boot
└─sda2 8:2 0 930.5G 0 part
├─centos-root 253:0 0 50G 0 lvm /
├─centos-swap 253:1 0 31.4G 0 lvm [SWAP]
└─centos-home 253:2 0 849.1G 0 lvm /home
sdb 8:16 0 931.5G 0 disk
sdc 8:32 0 931.5G 0 disk
sdd 8:48 0 931.5G 0 disk
sde 8:64 0 931.5G 0 disk
sdf 8:80 0 465.8G 0 disk
sr0 11:0 1 1024M 0 rom
[root@ceph-node1 ~]#
Note that only the SSD journal disk needs partitioning!
Per OSD: wal 60G, db 40G
# Carve the journal disk into LVM volumes
[root@ceph-node1 ~]# pvcreate /dev/sdf
WARNING: dos signature detected on /dev/sdf at offset 510. Wipe it? [y/n]: y
Wiping dos signature on /dev/sdf.
Physical volume "/dev/sdf" successfully created.
[root@ceph-node1 ~]# vgcreate ceph-pool /dev/sdf
Volume group "ceph-pool" successfully created
[root@ceph-node1 ~]# lvcreate -L 60G -n osd0.wal ceph-pool
WARNING: xfs signature detected on /dev/ceph-pool/osd0.wal at offset 0. Wipe it? [y/n]: y
Wiping xfs signature on /dev/ceph-pool/osd0.wal.
Logical volume "osd0.wal" created.
[root@ceph-node1 ~]# lvcreate -L 40G -n osd0.db ceph-pool
Logical volume "osd0.db" created.
lvcreate -L 60G -n osd1.wal ceph-pool
lvcreate -L 40G -n osd1.db ceph-pool
lvcreate -L 60G -n osd2.wal ceph-pool
lvcreate -L 40G -n osd2.db ceph-pool
lvcreate -L 60G -n osd3.wal ceph-pool
lvcreate -L 40G -n osd3.db ceph-pool
Partition the other nodes the same way; be careful to double-check which disk each command touches.
Make sure the partitioning is finished on every node!
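Before moving on, lvs gives a quick way to confirm all eight wal/db logical volumes exist on each node (my addition):
lvs ceph-pool
lsblk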
Install the ceph-deploy tool on the admin node
yum install ceph-deploy -y
After installation, running it may fail with:
[root@ceph-node1 ~]# ceph-deploy -v
Traceback (most recent call last):
File "/usr/bin/ceph-deploy", line 18, in <module>
from ceph_deploy.cli import main
File "/usr/lib/python2.7/site-packages/ceph_deploy/cli.py", line 1, in <module>
import pkg_resources
ImportError: No module named pkg_resources
The cause is a missing python-setuptools; installing it fixes the error:
# yum install python-setuptools
# ceph-deploy --version
2.0.0
Install Ceph
yum install ceph -y
yum may report an error while installing ceph:
---> Package rdma-core.x86_64 0:22.4-2.el7_8 will be installed
--> Processing Dependency: rdma-core(x86-64) = 22.4-2.el7_8 for package: libibverbs-22.4-2.el7_8.x86_64
--> Processing Dependency: rdma-core(x86-64) = 22.4-2.el7_8 for package: librdmacm-22.4-2.el7_8.x86_64
--> Finished Dependency Resolution
Error: Package: libibverbs-22.4-2.el7_8.x86_64 (updates)
Requires: rdma-core(x86-64) = 22.4-2.el7_8
Available: rdma-core-22.4-1.el7.x86_64 (base)
rdma-core(x86-64) = 22.4-1.el7
Available: rdma-core-22.4-2.el7_8.i686 (updates)
~rdma-core(x86-32) = 22.4-2.el7_8
Error: Package: librdmacm-22.4-2.el7_8.x86_64 (updates)
Requires: rdma-core(x86-64) = 22.4-2.el7_8
Available: rdma-core-22.4-1.el7.x86_64 (base)
rdma-core(x86-64) = 22.4-1.el7
Available: rdma-core-22.4-2.el7_8.i686 (updates)
~rdma-core(x86-32) = 22.4-2.el7_8
You could try using --skip-broken to work around the problem
You could try running: rpm -Va --nofiles --nodigest
# Workaround:
Remove the old conflicting packages
rpm -e mlnx-ofa_kernel-5.0-OFED.5.0.1.0.0.0.1.g34c46d3.rhel7u5.x86_64 \
kmod-mlnx-ofa_kernel-5.0-OFED.5.0.1.0.0.0.1.g34c46d3.rhel7u5.x86_64
Install the dependency RPMs manually
[root@ceph-node1 ~]# ls
anaconda-ks.cfg libibverbs-22.4-2.el7_8.x86_64.rpm rdma-core-22.4-2.el7_8.x86_64.rpm
ceph-deploy-ceph.log librdmacm-22.4-2.el7_8.x86_64.rpm
[root@ceph-node1 ~]# rpm -ivh rdma-core-22.4-2.el7_8.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:rdma-core-22.4-2.el7_8 ################################# [100%]
[root@ceph-node1 ~]# rpm -ivh libibverbs-22.4-2.el7_8.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:libibverbs-22.4-2.el7_8 ################################# [100%]
[root@ceph-node1 ~]# rpm -ivh librdmacm-22.4-2.el7_8.x86_64.rpm
Preparing... ################################# [100%]
Updating / installing...
1:librdmacm-22.4-2.el7_8 ################################# [100%]
Create the cluster
Create a cluster admin directory
[root@ceph-node1 ~]# mkdir /cluster
[root@ceph-node1 ~]# cd /cluster/
[root@ceph-node1 cluster]# pwd
/cluster
vim ceph.conf
[global]
fsid = 5ef480cc-c7e4-472c-b260-c601dbe377f6
mon_initial_members = ceph-node1, ceph-node2, ceph-node3
mon_host = 183.60.201.186,183.60.201.180,183.60.201.181
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
public_network = 10.0.1.0/24
cluster_network = 10.0.0.0/24
mon_allow_pool_delete = true
mon_clock_drift_allowed = 3
mon_clock_drift_warn_backoff = 30
mon_pg_warn_max_per_osd = 1000
osd_pool_default_size = 3
osd_pool_default_min_size = 1
mon_osd_backfillfull_ratio = 0.75
mon_osd_full_ratio = .85
mon_osd_nearfull_ratio = .70
osd_failsafe_full_ratio = 0.97
osd_deep_scrub_randomize_ratio = 0.01
[mgr]
mgr modules = dashboard
[osd]
osd_max_write_size = 1024
osd_recovery_op_priority = 1
osd_recovery_max_active = 1
osd_recovery_max_single_start = 1
osd_recovery_max_chunk = 1048576
osd_recovery_threads = 1
osd_max_backfills = 1
osd_scrub_begin_hour = 22
osd_scrub_end_hour = 7
osd_recovery_sleep = 0
Create the new cluster
[root@ceph-node1 cluster]# ceph-deploy new ceph-node1 ceph-node2 ceph-node3 --public-network=183.60.201.128/25 --cluster-network=10.0.0.0/24
Deploy the MONs and gather the keys
Initialize the MONs
ceph-deploy mon create-initial
Push the admin keyring to all nodes
ceph-deploy --overwrite-conf admin ceph-node1 ceph-node2 ceph-node3
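At this point the three monitors should have formed a quorum; a quick check (my addition, not in the original):
ceph -s
# expect all three mons in quorum; health will usually stay HEALTH_WARN until OSDs are added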
Deploy the MGRs
ceph-deploy mgr create ceph-node1 ceph-node2 ceph-node3
Deploy the MDSs
ceph-deploy mds create ceph-node1 ceph-node2 ceph-node3
Create CephFS
[root@ceph-node1 cluster]# ceph osd pool create cephfs_data 128 128
[root@ceph-node1 cluster]# ceph osd pool create cephfs_metadata 128 128
[root@ceph-node1 cluster]# ceph fs new cephfs cephfs_metadata cephfs_data
new fs with metadata pool 6 and data pool 5
[root@ceph-node1 cluster]# ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
[root@ceph-node1 cluster]# cat ceph.client.admin.keyring
[client.admin]
key = AQCaGvteVGIqHxAAqSTpgwwQuGqroyCTlZB3Eg==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"
mkdir -p /cephfs
mount -o name=admin,secret=AQCaGvteVGIqHxAAqSTpgwwQuGqroyCTlZB3Eg== -t ceph 183.60.201.181:6789:/ /cephfs/
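To confirm the filesystem and the mount are healthy (my own check):
ceph fs status cephfs
df -hT /cephfs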
Deploy the RGWs
ceph-deploy install --rgw ceph-node1 ceph-node2 ceph-node3
ceph-deploy rgw create ceph-node1 ceph-node2 ceph-node3
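The RGW instances listen on port 7480 by default; a quick smoke test against one of them (my addition, assuming the default civetweb frontend):
curl http://ceph-node1:7480
# an anonymous ListAllMyBuckets XML response means the gateway is up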
Deploy the OSDs
ceph-deploy osd create ceph-node1 --bluestore --block-wal ceph-pool/osd0.wal --block-db ceph-pool/osd0.db --data /dev/sdb
The other OSDs follow the same pattern; proceed carefully and double-check which disk each command targets (a loop sketch follows the GPT-header fix below).
If you hit "error: GPT headers found, they must be removed on: /dev/sdc", wipe the disk with "sgdisk --zap-all /dev/sdc":
yum install gdisk -y
sgdisk --zap-all /dev/sdc
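For the remaining data disks on ceph-node1 the same command can be run in a loop; the sdc→osd1 … sde→osd3 mapping below is my assumption, so adjust it to your own disk layout before running:
for pair in sdc:osd1 sdd:osd2 sde:osd3; do
  dev=${pair%%:*}; lv=${pair##*:}
  ceph-deploy osd create ceph-node1 --bluestore --block-wal ceph-pool/${lv}.wal --block-db ceph-pool/${lv}.db --data /dev/${dev}
done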
Enable the dashboard
# Starting with Nautilus the dashboard ships as a separate package, so it has to be installed on every mgr node
yum install -y ceph-mgr-dashboard
# Enable the dashboard module
ceph mgr module enable dashboard --force
# SSL/TLS is enabled by default, so create a self-signed certificate
ceph dashboard create-self-signed-cert
# Set the dashboard listen address and port
ceph config set mgr mgr/dashboard/server_addr 183.60.201.186
ceph config set mgr mgr/dashboard/server_port 8443
# Create a user with the administrator role
ceph dashboard ac-user-create ceph ceph administrator
# Check the ceph-mgr services
[root@ceph-node1 cluster]# ceph mgr services
{
"dashboard": "https://ceph-node1:8443/"
}
Troubleshooting:
Fixing "HEALTH_WARN: clock skew detected on mon"
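A common way out (my suggestion, not spelled out in the original) is to force a time sync on the drifting monitor host and restart its mon daemon:
ntpdate -u 1.1.1.1
systemctl restart ceph-mon.target
ceph health detail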
Performance testing
CephFS test
[root@ceph-node1 cluster]# cat ceph.client.admin.keyring
[client.admin]
key = AQCaGvteVGIqHxAAqSTpgwwQuGqroyCTlZB3Eg==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"
mkdir -p /cephfs
mount -o name=admin,secret=AQCaGvteVGIqHxAAqSTpgwwQuGqroyCTlZB3Eg== -t ceph 183.60.201.181:6789:/ /cephfs/
[root@ceph-client cephfs]# time dd if=/dev/zero of=/mnt/cephfs/file bs=1G count=1 oflag=direct
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 141.464 s, 7.6 MB/s
real 2m21.563s
user 0m0.001s
sys 0m1.687s
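A single 1G direct dd write is a fairly blunt measurement; if fio is available, a small random-write job gives a better picture (a sketch with job parameters of my own choosing):
yum install fio -y
fio --name=cephfs-randwrite --directory=/mnt/cephfs --rw=randwrite --bs=4k --size=1G --numjobs=4 --iodepth=16 --ioengine=libaio --direct=1 --group_reporting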
RADOS benchmark
# Write test
[root@ceph-node2 ~]# rados bench -p rbd 10 write --no-cleanup
# Sequential read test
[root@ceph-node2 ~]# rados bench -p rbd 10 seq
# Random read test
[root@ceph-node2 ~]# rados bench -p rbd 10 rand
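Because the write test used --no-cleanup (so the read tests have objects to read), remember to delete the benchmark objects afterwards (my addition):
rados -p rbd cleanup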
RBD benchmark
# rbd bench-write [pool/image]
--io-size: I/O size in bytes, default 4096 bytes = 4K
--io-threads: number of threads, default 16
--io-total: total bytes to write, default 1024M
--io-pattern <seq|rand>: write pattern, default seq (sequential)
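The rbd/bd1 image used below must already exist; its creation is not shown in the original, so a possible setup would be (a sketch; skip the pool commands if the rbd pool is already there):
ceph osd pool create rbd 128 128
rbd pool init rbd
rbd create rbd/bd1 --size 10240   # 10 GiB image, large enough for the 10G --io-total below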
[root@ceph-node3 ~]# rbd bench-write rbd/bd1 --io-size 4096000 --io-total 10737418240
elapsed: 71 ops: 2622 ops/sec: 36.55 bytes/sec: 149703263.37
Block size 4M: IOPS 37, BW 143MB/s
[root@ceph-node3 ~]# rbd bench-write rbd/bd1 --io-size 4096 --io-total 10737418240
elapsed: 112 ops: 2621440 ops/sec: 23305.83 bytes/sec: 95460689.49
Block size 4K: IOPS 23306, BW 91MB/s
NFS test
# Mount on the client
[root@ceph-node1 cluster]# cat ceph.client.admin.keyring
[client.admin]
key = AQCaGvteVGIqHxAAqSTpgwwQuGqroyCTlZB3Eg==
caps mds = "allow *"
caps mgr = "allow *"
caps mon = "allow *"
caps osd = "allow *"
mkdir -p /cephfs
# ceph-fuse
ceph-fuse -m 183.60.201.181:6789 /cephfs/
# kernel
mount -o name=admin,secret=AQCaGvteVGIqHxAAqSTpgwwQuGqroyCTlZB3Eg== -t ceph 183.60.201.181:6789:/ /cephfs/
[root@ceph-client ~]# df -hT /mnt/cephfs/
Filesystem Type Size Used Avail Use% Mounted on
183.60.201.181:6789:/ ceph 3.3T 2.8G 3.3T 1% /mnt/cephfs
# NFS-server config
yum install nfs-utils rpcbind -y
vim /etc/exports
/mnt/cephfs *(rw,async,no_root_squash,no_subtree_check)
exportfs -ar
systemctl restart rpcbind
systemctl restart nfs
[root@ceph-client ~]# showmount -e
Export list for ceph-client:
/mnt/cephfs *
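An NFS client can then mount the export in the usual way (the server name and mount point here are placeholders of mine):
mkdir -p /mnt/nfs-cephfs
mount -t nfs <nfs-server>:/mnt/cephfs /mnt/nfs-cephfs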
Cluster maintenance
ceph-bluestore-tool: the BlueStore administration tool
Remove an OSD
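## mark the osd out so its data starts migrating off it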
ceph osd out $i
## remove osd from crush map
ceph osd crush remove osd.${i}
## delete osd authencation key
ceph auth del osd.${i}
## remove osd finally
ceph osd rm ${i}
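Before these steps the OSD daemon should be stopped on its host, and on Nautilus the crush/auth/rm trio can also be collapsed into a single purge; both lines below are my additions:
systemctl stop ceph-osd@${i}
ceph osd purge ${i} --yes-i-really-mean-it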
Replace an OSD
https://blog.csdn.net/qq_16327997/article/details/82968476
Recommended blog posts
Connecting Proxmox VE to external Ceph storage
Using the CephFS filesystem of an external Ceph cluster
Connecting Proxmox VE to RBD block devices on an external Ceph cluster
The backend supports the common storage properties node, disable, content, plus the following RBD-specific properties:
monhost: list of monitor daemon IPs. Optional, only needed if Ceph is not running on the PVE cluster.
pool: Ceph pool name.
username: RBD user ID. Optional, only needed if Ceph is not running on the PVE cluster.
krbd: access rbd through the krbd kernel module. Required if you want to use the storage for containers.
Configuration example for an external Ceph cluster (/etc/pve/storage.cfg)
rbd: ceph-external
monhost 10.0.0.10 10.0.0.20 10.0.0.30
pool ceph-external
content images
username admin
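If the same external cluster also exposes CephFS, a matching entry can sit alongside it in /etc/pve/storage.cfg (a sketch; the storage ID, path and content types are my own choices):
cephfs: cephfs-external
        monhost 10.0.0.10 10.0.0.20 10.0.0.30
        path /mnt/pve/cephfs-external
        content backup,iso,vztmpl
        username admin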
Authentication
If you use cephx authentication, you need to copy the keyring file from the external Ceph cluster to a Proxmox VE host.
Create the directory /etc/pve/priv/ceph with
mkdir /etc/pve/priv/ceph
Then copy the keyring:
scp <cephserver>:/etc/ceph/ceph.client.admin.keyring /etc/pve/priv/ceph/<STORAGE_ID>.keyring
The keyring must be named to match your <STORAGE_ID>.
If Ceph is installed locally on the PVE cluster, this is handled automatically by pveceph or via the GUI.