title: Ceph cluster RGW high availability
date: 2021-04-16
categories: storage
Ceph object storage is served by the Ceph Object Gateway daemon (radosgw), an HTTP server that talks to the Ceph storage cluster and exposes interfaces compatible with OpenStack Swift and Amazon S3. In the post ceph存储之OSS对象存储 we deployed a single RGW service to get object storage working, but in production a single RGW is a single point of failure, so this post covers how to scale the RGW service out.
Reference: the official documentation
Scaling out RGW
Before scaling out, make sure the first RGW service is already installed as described in ceph存储之OSS对象存储.
With that in place, it helps to be clear about how an RGW "high availability cluster" actually works: it is nothing more than running several RGW services and putting a proxy layer in front to load-balance across all of them. The proxy can be nginx or haproxy; for even higher reliability, run two proxies and float a VIP between them. That removes the single point of failure and adds processing capacity at the same time.
A rough diagram of the idea (the image borrowed from the web is omitted here):
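Sketched in plain text, using the addresses from the environment table later in this post:

                          client
                            |
                   VIP 192.168.20.100:80
                   /                   \
    haproxy (192.168.20.2)     haproxy (192.168.20.3)    <- keepalived floats the VIP
                   \                   /
        RGW 192.168.20.5:80     RGW 192.168.20.10:80
                            |
                       Ceph cluster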
Adding a second RGW node
$ cd ~/my-cluster/
# extend centos-20-5 as the second RGW node
$ ceph-deploy rgw create centos-20-5
.................... # output ending like this indicates success
[ceph_deploy.rgw][INFO ] The Ceph Object Gateway (RGW) is now running on host centos-20-5 and default port 7480
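At this point you can sanity-check that the new gateway answers on its default port (assuming centos-20-5 resolves from wherever you run this). An unauthenticated request to the RGW root should return an anonymous S3 bucket listing along these lines:
$ curl http://centos-20-5:7480
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">...</ListAllMyBucketsResult>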
# check the cluster status
$ ceph -s
  cluster:
    id:     d94fee92-ef1a-4f1f-80a5-1c7e1caf4a4a
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum centos-20-10,centos-20-5,centos-20-6 (age 18m)
    mgr: centos-20-6(active, since 18m), standbys: centos-20-5, centos-20-10
    mds: cephfs-demo:1 {0=centos-20-10=up:active} 2 up:standby
    osd: 6 osds: 6 up (since 18m), 6 in (since 46h)
    rgw: 2 daemons active (centos-20-10, centos-20-5)
    # note: there are now two rgw daemons

  task status:
    scrub status:
        mds.centos-20-10: idle

  data:
    pools:   9 pools, 288 pgs
    objects: 249 objects, 14 MiB
    usage:   6.1 GiB used, 114 GiB / 120 GiB avail
    pgs:     288 active+clean
Changing the new RGW node's listening port
1. Edit the configuration file
$ cd ~/my-cluster/
$ vim ceph.conf
# add the following configuration
[client.rgw.centos-20-5] # replace centos-20-5 with the hostname of your new rgw node
rgw_frontends = "civetweb port=80"
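Note that civetweb was the default RGW frontend through Luminous; on Nautilus and later the default is Beast. If your release runs Beast, the equivalent setting (my assumption, verify against your version) would be:
[client.rgw.centos-20-5]
rgw_frontends = "beast port=80"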
2. Push the configuration file to all nodes
# replace the three hostnames below with all the nodes in your ceph cluster
$ ceph-deploy --overwrite-conf config push centos-20-10 centos-20-5 centos-20-6
3. Restart the radosgw service on the new RGW node
# run this on the host where the new rgw was created
$ systemctl restart ceph-radosgw.target
$ ss -lnpt | grep radosgw # confirm the port has changed
LISTEN 0 128 *:80 *:* users:(("radosgw",pid=20705,fd=45))
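Both gateways should now answer on port 80 (the first RGW, centos-20-10, was moved to port 80 the same way in the earlier post; adjust the port if yours still listens on 7480). A quick check, run from any host that can reach both:
$ for h in centos-20-5 centos-20-10; do curl -s -o /dev/null -w "$h: %{http_code}\n" http://$h; done
centos-20-5: 200
centos-20-10: 200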
Either gateway can now serve object storage requests, but which one should clients talk to? That is exactly what the load balancer is for. If you are a seasoned ops hand, the reasoning needs no further explanation, so let's get straight to it.
Building an RGW high-availability cluster with haproxy + keepalived
Environment
| Node name   | IP address   | Software           | VIP:port          | Backend RGW addresses             |
|-------------|--------------|--------------------|-------------------|-----------------------------------|
| centos-20-2 | 192.168.20.2 | haproxy+keepalived | 192.168.20.100:80 | 192.168.20.5:80, 192.168.20.10:80 |
| centos-20-3 | 192.168.20.3 | haproxy+keepalived | 192.168.20.100:80 | 192.168.20.5:80, 192.168.20.10:80 |
My machines have spare capacity, so I am using two dedicated hosts for the haproxy layer. If you are short on hardware you can co-locate the proxies on the RGW nodes instead, but watch out for port conflicts; resolving those is up to you.
If you prefer nginx, you can try substituting it for haproxy; I have not verified that setup myself.
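For reference only, a rough and unverified sketch of what the nginx equivalent might look like (the file path and the upstream name rgw_backend are my own placeholders):
# /etc/nginx/conf.d/rgw.conf
upstream rgw_backend {
    server 192.168.20.5:80;
    server 192.168.20.10:80;
}
server {
    listen 80;
    location / {
        # pass requests straight through to the RGW pool
        proxy_pass http://rgw_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}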
Installing and configuring haproxy
Apart from the installation itself, do the configuration below on one node first.
1. Install haproxy
# this must be done on both haproxy nodes
$ yum -y install haproxy
2. Edit the configuration file
$ vim /etc/haproxy/haproxy.cfg # the complete modified configuration file follows
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 3000

# listen on port 80 and send everything to the rgw backend
frontend http_web *:80
    mode http
    default_backend rgw

# the pool of rgw servers behind the proxy
backend rgw
    balance roundrobin
    mode http
    # 'check' enables active health checks so a dead RGW is pulled from rotation
    server node1 192.168.20.5:80 check
    server node2 192.168.20.10:80 check
3. Start haproxy
# enable at boot and start
$ systemctl start haproxy && systemctl enable haproxy
# confirm the port is listening
$ ss -lnpt | grep 80
LISTEN 0 3000 *:80 *:* users:(("haproxy",pid=7345,fd=5))
4. Configure the haproxy service on centos-20-3
The following steps are performed on the centos-20-3 host.
# copy the finished configuration file from the first node
$ rsync -az 192.168.20.2:/etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg
# enable at boot and start
$ systemctl start haproxy && systemctl enable haproxy
# confirm the port is listening
$ ss -lnpt | grep 80
LISTEN 0 3000 *:80 *:* users:(("haproxy",pid=7345,fd=5))
5. Access each node's haproxy service in turn; a response like the one below means the proxy configuration is correct:
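The original screenshot is not reproduced here; with curl, either proxy should hand back the anonymous S3 bucket listing from one of the RGWs behind it, roughly:
$ curl http://192.168.20.2
<?xml version="1.0" encoding="UTF-8"?><ListAllMyBucketsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">...</ListAllMyBucketsResult>
$ curl http://192.168.20.3   # the second proxy should answer the same way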
With that, haproxy is fully configured.
Installing and configuring keepalived
1. Install keepalived (run on both machines)
$ wget https://keepalived.org/software/keepalived-2.0.20.tar.gz
yum install -y gcc openssl-devel openssl libnl libnl-devel libnfnetlink-devel
tar zxf keepalived-2.0.20.tar.gz && cd keepalived-2.0.20
./configure --prefix=/opt/keepalived-2.0.20
make && make install
# register it as a system service and enable it at boot
$ mkdir /etc/keepalived
cp keepalived/etc/init.d/keepalived /etc/init.d/
cp keepalived/etc/sysconfig/keepalived /etc/sysconfig/
cp keepalived/etc/keepalived/keepalived.conf /etc/keepalived/
cd /etc/init.d/
chkconfig --add keepalived
systemctl enable keepalived
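Before moving on, a quick way to confirm the freshly built binary is usable (the path follows the --prefix chosen above):
$ /opt/keepalived-2.0.20/sbin/keepalived --version   # should print the v2.0.20 banner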
2. Configure keepalived (do this on one of the two nodes first)
$ cat /etc/keepalived/keepalived.conf
global_defs {
    script_user root
    router_id centos-20-2          # unique id per node
}

vrrp_script chk_haproxy {          # define the health-check script
    script "/etc/keepalived/chk_haproxy.sh"   # path to the actual script
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    unicast_src_ip 192.168.20.2    # this node's IP
    unicast_peer {
        192.168.20.3               # the peer keepalived node's IP
    }
    virtual_router_id 23           # virtual router id, must match on all nodes
    priority 100
    nopreempt
    advert_int 1
    authentication {               # auth password, must match on all nodes
        auth_type PASS
        auth_pass 1234
    }
    virtual_ipaddress {
        192.168.20.100/24          # the VIP
    }
    track_script {
        chk_haproxy                # invoke the health-check script defined above
    }
}
# define the /etc/keepalived/chk_haproxy.sh health-check script
$ cat /etc/keepalived/chk_haproxy.sh
#!/bin/bash
# stop keepalived (releasing the VIP) once haproxy is no longer running
keepalived_log=/etc/keepalived/vip.log
# count running haproxy processes; make sure this pattern matches your
# actual process, and match it as precisely as possible
haproxy_count=$(ps -ef | grep '/usr/sbin/haproxy' | grep -v grep | wc -l)
if [[ ${haproxy_count} -eq 0 ]];then
    cat >> ${keepalived_log} << EOF
haproxy stopped running at $(date '+%F %T')
Stopping keepalived ...
EOF
    systemctl stop keepalived
fi
$ chmod +x /etc/keepalived/chk_haproxy.sh # the script must be executable
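You can exercise the script by hand before trusting it (my own suggested check, not part of the original flow). With haproxy stopped it should append to the log; stopping keepalived is still a no-op at this point because keepalived has not been started yet:
$ systemctl stop haproxy
$ /etc/keepalived/chk_haproxy.sh
$ cat /etc/keepalived/vip.log    # should contain a "haproxy stopped running at ..." entry
$ systemctl start haproxy        # bring haproxy back afterwards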
3. Configure keepalived on the second node
# copy the configuration files from the first node
$ rsync -az 192.168.20.2:/etc/keepalived/keepalived.conf /etc/keepalived/
rsync -az 192.168.20.2:/etc/keepalived/chk_haproxy.sh /etc/keepalived/
# then fix the fields that must differ
$ cat /etc/keepalived/keepalived.conf
global_defs {
    script_user root
    router_id centos-20-3          # changed: unique id
}

vrrp_script chk_haproxy {
    script "/etc/keepalived/chk_haproxy.sh"
    interval 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    unicast_src_ip 192.168.20.3    # changed: this node's IP
    unicast_peer {
        192.168.20.2               # changed: the peer node's IP
    }
    virtual_router_id 23
    priority 100
    nopreempt
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1234
    }
    virtual_ipaddress {
        192.168.20.100/24
    }
    track_script {
        chk_haproxy
    }
}
4. Start keepalived
Start keepalived on both machines.
$ systemctl start keepalived
Confirm the processes exist:
$ ps -ef | grep keepalived | grep -v grep
root 31145 1 0 20:32 ? 00:00:00 /opt/keepalived-2.0.20/sbin/keepalived -D
root 31146 31145 0 20:32 ? 00:00:00 /opt/keepalived-2.0.20/sbin/keepalived -D
5. Confirm the VIP exists
Note: the VIP exists on only one machine at a time, and only the ip command will show it (ifconfig will not).
# check the VIP (it usually lands on whichever machine started keepalived first)
$ ip a
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether 00:0c:29:4e:b1:9a brd ff:ff:ff:ff:ff:ff
inet 192.168.20.3/24 brd 192.168.20.255 scope global noprefixroute ens33
valid_lft forever preferred_lft forever
inet 192.168.20.100/24 scope global secondary ens33
valid_lft forever preferred_lft forever
You can verify VIP failover yourself: whenever the haproxy service stops on the node holding the VIP, the VIP should float over to the node where haproxy is still healthy.
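A concrete drill, implied by the post though not spelled out in it, starting on whichever node currently holds the VIP (192.168.20.3 in the output above):
# on the current VIP holder: stop haproxy and let chk_haproxy.sh take keepalived down
$ systemctl stop haproxy
$ ip a show dev ens33 | grep 192.168.20.100   # should return nothing within a couple of seconds

# on the other node: the VIP should have arrived
$ ip a show dev ens33 | grep 192.168.20.100
    inet 192.168.20.100/24 scope global secondary ens33

# to rejoin the pair, restore both services on the first node
$ systemctl start haproxy && systemctl start keepalived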
6. Finally, access the VIP; getting the same response as from the individual proxies earlier means the configuration works.
Verifying the VIP with an S3 client
Once the automatic VIP failover checks out, only one step remains: as long as clients can reach the service through the VIP, we are done.
I will assume you have already configured the s3cmd client by following ceph存储之OSS对象存储.
# edit the saved s3 client configuration file
$ vim ~/.s3cfg # point the two settings below at the VIP
host_base = 192.168.20.100:80
host_bucket = 192.168.20.100:80/%(bucket)s
# create a bucket and list buckets as a test
$ s3cmd mb s3://s3_vip_test
Bucket 's3://s3_vip_test/' created
$ s3cmd ls
2021-04-16 08:04 s3://ceph-s3-bucket
2021-04-17 13:43 s3://s3_vip_test
2021-04-16 08:10 s3://s3cmd-demo
2021-04-16 08:14 s3://swift-demo
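To exercise the full data path through the VIP, you can also push an object and read it back (my own extra check; the file name is arbitrary):
$ echo "hello rgw ha" > /tmp/ha-test.txt
$ s3cmd put /tmp/ha-test.txt s3://s3_vip_test/
$ s3cmd get s3://s3_vip_test/ha-test.txt /tmp/ha-test.out
$ cat /tmp/ha-test.out
hello rgw ha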
That completes the deployment and testing of a highly available RGW.