date: 2020-12-27
title: Redis 5.0.5 or Redis 6.0.4 Cluster Deployment and Scaling # title
tags: redis cluster # tags
categories: redis # category
Environment preparation
OS | hostname | IP | role | port |
---|---|---|---|---|
Centos 7.7 | redis1 | 192.168.20.10 | master,slave | 7001,17001,7002,17002 |
Centos 7.7 | redis2 | 192.168.20.5 | master,slave | 7001,17001,7002,17002 |
Centos 7.7 | redis3 | 192.168.20.6 | master,slave | 7001,17001,7002,17002 |
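Note: the cluster uses three hosts, each running two Redis instances, for six instances in total, arranged as three masters and three replicas. Ports 700x serve client connections; ports 1700x (client port + 10000) carry cluster bus traffic between the nodes.
The firewall is simply disabled here. If your hosts run a firewall, adjust its rules to allow this traffic.
Unless stated otherwise, run the following steps on redis1 only.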
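The port layout above follows the Redis Cluster convention that the cluster bus port is the client port plus 10000. A minimal sketch that lists every instance with both ports; the function name `bus_port` is illustrative and not part of the original setup:

```shell
#!/bin/sh
# Cluster bus port = client port + 10000 (7001 -> 17001).
bus_port() {
    echo $(( $1 + 10000 ))
}

# Hosts and client ports taken from the environment table above.
for ip in 192.168.20.10 192.168.20.5 192.168.20.6; do
    for port in 7001 7002; do
        echo "$ip client=$port bus=$(bus_port "$port")"
    done
done
```

Both ports of every instance must be reachable from the other hosts if a firewall is in use.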
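Compile Redis
Redis 6.0.x requires gcc 5.3 or later, while CentOS 7.x installs gcc 4.8.5 by default:
$ gcc -v # check the gcc version
......... # part of the output omitted
gcc version 4.8.5 20150623 (Red Hat 4.8.5-28) (GCC)
Redis 5.x.x therefore compiles without upgrading gcc, whereas Redis 6.x.x needs a newer gcc first. The Redis 6.x.x build steps are written out below.
Whether you build Redis 5.x.x or Redis 6.x.x, the only difference is whether gcc must be upgraded before compiling; all later configuration is identical.
Note: build either Redis 5.0.5 or Redis 6.0.4 below; you only need one of them.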
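The version rule above can be checked mechanically before choosing which Redis release to build. The sketch below compares a version string against the 5.3 minimum; the helper name `version_at_least` is hypothetical, and the two versions fed to it are the ones quoted in this post rather than values read from a live system:

```shell
#!/bin/sh
# Return 0 when version $1 (e.g. "9.1.1") is at least major $2, minor $3.
version_at_least() {
    major=${1%%.*}
    rest=${1#*.}
    [ "$rest" = "$1" ] && rest=0      # no dot in the version: treat minor as 0
    minor=${rest%%.*}
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# 4.8.5 is the CentOS 7 default, 9.1.1 is what devtoolset-9 provides.
for v in 4.8.5 9.1.1; do
    if version_at_least "$v" 5 3; then
        echo "gcc $v: new enough for Redis 6.0.x"
    else
        echo "gcc $v: too old for Redis 6.0.x, fine for Redis 5.0.x"
    fi
done
```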
Compile Redis 5.0.5
$ yum -y install gcc*
$ wget http://download.redis.io/releases/redis-5.0.5.tar.gz
$ tar zxf redis-5.0.5.tar.gz;cd redis-5.0.5/
$ make # compiling is enough; make install is not needed
Compile Redis 6.0.4
If gcc is not upgraded to 5.3 or later before running make, the build fails with compilation errors.
Enough talk, let's get started.
$ wget http://download.redis.io/releases/redis-6.0.4.tar.gz # download Redis 6.0.4
$ tar zxf redis-6.0.4.tar.gz
$ cd redis-6.0.4/
# upgrade gcc
$ yum -y install centos-release-scl
$ yum -y install devtoolset-9-gcc devtoolset-9-gcc-c++ devtoolset-9-binutils
# enter a temporary gcc 9 environment
$ scl enable devtoolset-9 bash
$ gcc -v # check the gcc version
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/opt/rh/devtoolset-9/root/usr/libexec/gcc/x86_64-redhat-linux/9/lto
Target: x86_64-redhat-linux
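......... # part of the output omitted
gcc version 9.1.1 20190605 (Red Hat 9.1.1-2) (GCC)
# compile
$ pwd
/root/redis-6.0.4
$ make # again, compiling is enough; make install is not needed
Note: scl enable is only temporary. Exiting the shell or opening a new one restores the system gcc version.
To make it permanent (which I personally find unnecessary), run the following commands to always use gcc 9. Here /opt/rh/devtoolset-9/ is the prefix of the COLLECT_LTO_WRAPPER path that gcc -v shows once scl is enabled.
$ echo 'source /opt/rh/devtoolset-9/enable' >> /etc/profile
$ source /etc/profile
That completes the build for both Redis versions. I use the Redis 5.0.x build for the rest of this post (I have also set up the same cluster with Redis 6.0.x without any problems, so pick either with confidence).
Prepare the working directory and the files Redis needs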
$ pwd
/root/redis-5.0.5
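# create the working directories
$ mkdir -p /apps/usr/redis/{bin,conf,data,logs}
$ mkdir /apps/usr/redis/data/{7001,7002}
# copy the binaries and the configuration file
$ cp src/redis* /apps/usr/redis/bin/
$ cp redis.conf /apps/usr/redis/conf/ # sample configuration; copying it is optional, you can write your own
$ cd /apps/usr/redis/bin/
$ rm -f *.c *.h *.o # remove the source and object files that were copied along with the binaries
Edit the configuration file
$ cd /apps/usr/redis/conf/
$ vim redis_7001.conf # configuration for instance 7001, as follows
# for what each directive means, search online or read the comments in the bundled redis.conf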
port 7001
bind 0.0.0.0
daemonize yes
pidfile /var/run/redis_7001.pid
cluster-enabled yes
cluster-config-file nodes-7001.conf
cluster-node-timeout 25000
# RDB persistence settings
save 900 1
save 300 10
save 60 10000
dbfilename dump_7001.rdb
dir /apps/usr/redis/data/7001
stop-writes-on-bgsave-error no
rdbcompression yes
rdbchecksum yes
rdb-save-incremental-fsync yes
# AOF persistence settings
appendonly yes
appendfilename "appendonly_7001.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-rewrite-incremental-fsync yes
loglevel notice
logfile /apps/usr/redis/logs/redis-7001.logs
maxclients 15000
maxmemory 20gb
maxmemory-policy volatile-lru
protected-mode no
# replication settings
replica-serve-stale-data yes
replica-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
replica-priority 100
repl-backlog-size 10mb
repl-timeout 120
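# cluster password authentication (optional; when enabled, every node must use the same password)
masterauth "SDFAgPjgGLK!8"
requirepass "SDFAgPjgGLK!8"
# prepare the configuration file for instance 7002
$ cp redis_7001.conf redis_7002.conf
$ sed -i 's/7001/7002/g' redis_7002.conf # replace every port-specific value
# create the parent directory on redis2 and redis3
$ mkdir -p /apps/usr/ # needed on both redis2 and redis3
# send the redis directory to the other two nodes
$ for i in 5 6;do rsync -az /apps/usr/redis 192.168.20.${i}:/apps/usr/;done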
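The copy-and-sed step above generalizes to any number of instances per host. The following is only a sketch: `make_port_configs` is a hypothetical helper, port 7003 is an invented third instance, and everything runs in a scratch directory so nothing under /apps is touched:

```shell
#!/bin/sh
# Derive one config per port from a template by replacing the template's port
# number everywhere it appears (port, pidfile, data dir, log file, ...).
make_port_configs() {   # usage: make_port_configs CONF_DIR TEMPLATE_PORT PORT...
    conf_dir=$1; template_port=$2; shift 2
    for port in "$@"; do
        sed "s/$template_port/$port/g" \
            "$conf_dir/redis_$template_port.conf" > "$conf_dir/redis_$port.conf"
    done
}

work=$(mktemp -d)
cat > "$work/redis_7001.conf" <<'EOF'
port 7001
pidfile /var/run/redis_7001.pid
cluster-config-file nodes-7001.conf
dir /apps/usr/redis/data/7001
EOF
make_port_configs "$work" 7001 7002 7003
grep '^port' "$work"/redis_*.conf
rm -rf "$work"
```

A blanket substitution like this is only safe while the port number does not occur in unrelated values such as the password.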
Set the PATH environment variable
Run this on every node.
$ echo 'export PATH=$PATH:/apps/usr/redis/bin/' >> /etc/profile
$ source /etc/profile
$ redis-server -v # check the Redis version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=37a632ba3989f893
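The configuration files for all six instances on the three hosts are now ready, and the instances can be started.
Start the redis_7001 instance
$ redis-server /apps/usr/redis/conf/redis_7001.conf # start instance 7001
$ ss -lnpt | grep '7001' # confirm the ports are listening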
LISTEN 0 128 192.168.20.10:17001 *:* users:(("redis-server",pid=6905,fd=9))
LISTEN 0 128 192.168.20.10:7001 *:* users:(("redis-server",pid=6905,fd=6))
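$ cat /apps/usr/redis/logs/redis-7001.logs # check the startup log; it contains several warnings
The warnings concern the open-file limit, the TCP backlog, memory overcommit, and transparent huge pages; each is addressed below.
Resolve the startup warnings
Run the following on all three hosts to resolve the startup warnings.
You can also start an instance first and read its log to see which warnings actually appear, so that you do not repeat settings that were already applied.
1) Raise the maximum number of open files
$ ulimit -n # check the current value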
1024
$ echo '* - nofile 65535' >> /etc/security/limits.conf
# log in again for the change to take effect, then check the value again
$ ulimit -n
65535
2) Fix the low TCP backlog value
$ echo "net.core.somaxconn = 65535" > /etc/sysctl.d/redis.conf
$ sysctl -p /etc/sysctl.d/redis.conf # reload to apply
net.core.somaxconn = 65535
3) Allow the kernel to overcommit memory
$ echo "vm.overcommit_memory = 1" >> /etc/sysctl.d/redis.conf
$ sysctl -p /etc/sysctl.d/redis.conf # reload to apply
net.core.somaxconn = 65535
vm.overcommit_memory = 1
4) Resolve the transparent huge pages (THP) warning
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
# the command above only lasts until the next reboot; make it permanent as follows
$ echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
$ chmod +x /etc/rc.d/rc.local
From now on, neither rebooting the server nor restarting Redis will bring those warnings back.
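To confirm the kernel settings on each host without changing anything, a read-only check along these lines may help. The helper names `thp_mode` and `check_min` are hypothetical; the threshold 511 is Redis's default tcp-backlog, which is what the startup warning compares against, while this post sets the much larger 65535:

```shell
#!/bin/sh
# Extract the active THP mode from a line such as "always madvise [never]".
thp_mode() {
    printf '%s\n' "$1" | sed 's/.*\[\(.*\)\].*/\1/'
}

# Report whether a numeric setting reaches the wanted minimum.
check_min() {   # usage: check_min NAME ACTUAL WANTED
    if [ "$2" -ge "$3" ]; then
        echo "OK $1=$2"
    else
        echo "LOW $1=$2 (want >= $3)"
    fi
}

# Read the live values where the files exist (Linux only); skip them otherwise.
f=/proc/sys/net/core/somaxconn
if [ -r "$f" ]; then check_min net.core.somaxconn "$(cat "$f")" 511; fi
f=/proc/sys/vm/overcommit_memory
if [ -r "$f" ]; then check_min vm.overcommit_memory "$(cat "$f")" 1; fi
f=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$f" ]; then echo "THP mode: $(thp_mode "$(cat "$f")")"; fi
```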
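Start the other Redis nodes
$ pkill -9 redis-server # stop the instance started a moment ago
# then run the following on every node to start all Redis instances
$ redis-server /apps/usr/redis/conf/redis_7001.conf
$ redis-server /apps/usr/redis/conf/redis_7002.conf
# confirm the ports are listening
$ ss -lnput | grep 700.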
tcp LISTEN 0 128 *:17001 *:* users:(("redis-server",pid=21547,fd=9))
tcp LISTEN 0 128 *:17002 *:* users:(("redis-server",pid=21552,fd=9))
tcp LISTEN 0 128 *:7001 *:* users:(("redis-server",pid=21547,fd=6))
tcp LISTEN 0 128 *:7002 *:* users:(("redis-server",pid=21552,fd=6))
Build the cluster
Remove the cluster password
Perform this step only if your cluster has a password; skip it otherwise.
The script below is used again later to restore the password and when scaling the cluster out or in.
$ cat redis_setpass.sh
#!/usr/bin/env bash
# IPs of the hosts taking part in the cluster
IPS=(
    192.168.20.5
    192.168.20.6
    192.168.20.10
)
# cluster password
PASS='SDFAgPjgGLK!8'
# listening ports on each host
PORTS=(
    7001
    7002
)
# remove the password from every instance
del_pass() {
    for ip in "${IPS[@]}"
    do
        for port in "${PORTS[@]}"
        do
            redis-cli -c -h "$ip" -p "$port" -a "$PASS" config set masterauth ""
            redis-cli -c -h "$ip" -p "$port" -a "$PASS" config set requirepass ""
        done
        echo "$ip delete password"
    done
}
# set the password on every instance
add_pass() {
    for ip in "${IPS[@]}"
    do
        for port in "${PORTS[@]}"
        do
            redis-cli -c -h "$ip" -p "$port" config set masterauth "$PASS"
            redis-cli -c -h "$ip" -p "$port" config set requirepass "$PASS"
        done
        echo "$ip add password"
    done
}
env=$1
if [[ ${env} == "del" ]];then
    echo "del redis password"
    del_pass
elif [[ ${env} == "add" ]];then
    add_pass
else
    echo "${env}: expected add or del"
    echo ' exit ..'
    exit 1
fi
$ chmod +x redis_setpass.sh
$ ./redis_setpass.sh del # remove the cluster password
Create the cluster
Run the following on any one host.
# create the cluster: list the participating instances and set the replica count to 1
$ redis-cli --cluster create 192.168.20.5:7001 192.168.20.5:7002 192.168.20.6:7001 192.168.20.6:7002 192.168.20.10:7001 192.168.20.10:7002 --cluster-replicas 1
>>> Performing hash slots allocation on 6 nodes...
Master[0] -> Slots 0 - 5460
Master[1] -> Slots 5461 - 10922
Master[2] -> Slots 10923 - 16383
Adding replica 192.168.20.6:7002 to 192.168.20.5:7001
Adding replica 192.168.20.10:7002 to 192.168.20.6:7001
Adding replica 192.168.20.5:7002 to 192.168.20.10:7001
M: 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001
slots:[0-5460] (5461 slots) master
S: 6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002
replicates 25dc2b57f7587b96f5edb70f13669b412f7dd510
M: 9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001
slots:[5461-10922] (5462 slots) master
S: 0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002
replicates 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246
M: 25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001
slots:[10923-16383] (5461 slots) master
S: 0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002
replicates 9078a19def138244b8365ee352847d1a54c5385e
Can I set the above configuration? (type 'yes' to accept): yes
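# the plan above shows each instance's role and the hash-slot allocation; type yes to continue if it looks right
# one thing to watch here: a master and its replica should not run on the same host,
# otherwise losing that host takes both copies of its slots offline and the cluster becomes unavailable
When creation succeeds, redis-cli finishes by reporting that all 16384 slots are covered.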
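The placement rule in the comments above (a master and its replica should not share a host) can be verified from `cluster nodes` output. This is a sketch under the assumption that the output keeps the field order shown in this post (ID, address, flags, master ID); `colocated_replicas` is a hypothetical helper, and the sample lines are four of the six nodes from this cluster:

```shell
#!/bin/sh
# Read `cluster nodes` output on stdin and print every replica that runs on
# the same host as its master (fields: id, ip:port@busport, flags, master-id).
colocated_replicas() {
    awk '
        { split($2, a, ":"); host[$1] = a[1]; flags[$1] = $3; master[$1] = $4 }
        END {
            for (id in host)
                if (flags[id] ~ /slave/ && host[master[id]] == host[id])
                    print id, host[id]
        }'
}

found=$(colocated_replicas <<'EOF'
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 myself,master - 0 1608474492000 5 connected 10923-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608474491000 1 connected 0-5460
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608474492000 5 connected
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608474492857 1 connected
EOF
)
if [ -z "$found" ]; then
    echo "no replica shares a host with its master"
else
    echo "$found"
fi
```

Against a live cluster the input would come from `redis-cli -h <host> -p <port> cluster nodes` instead of the here-document.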
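Test the cluster
$ redis-cli -c -p 7001 -h 192.168.20.10 # the -c option is required when connecting to a cluster
# from the shell, redis-cli --cluster check 192.168.20.10:7001 gives a more readable view of the master/replica layout
192.168.20.10:7001> cluster nodes # show which replica follows which master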
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 myself,master - 0 1608474492000 5 connected 10923-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608474491000 1 connected 0-5460
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608474492000 5 connected
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608474491849 3 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608474490000 3 connected 5461-10922
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608474492857 1 connected
192.168.20.10:7001> cluster info # 查看集群状态
cluster_state:ok
cluster_slots_assigned:16384
cluster_slots_ok:16384
cluster_slots_pfail:0
cluster_slots_fail:0
cluster_known_nodes:6
cluster_size:3
cluster_current_epoch:6
cluster_my_epoch:5
cluster_stats_messages_ping_sent:104
cluster_stats_messages_pong_sent:93
cluster_stats_messages_meet_sent:1
cluster_stats_messages_sent:198
cluster_stats_messages_ping_received:89
cluster_stats_messages_pong_received:105
cluster_stats_messages_meet_received:4
cluster_stats_messages_received:198
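# write and read a key
192.168.20.10:7001> set name ray # set a key
-> Redirected to slot [5798] located at 192.168.20.5:7002 # the key was stored on the 192.168.20.5:7002 instance
OK
# note that the prompt has switched to the 192.168.20.5:7002 instance
192.168.20.5:7002> get name # read the key
"ray"
# connect to another node and check that the key can be read there
$ redis-cli -c -p 7002 -h 192.168.20.6 # connect to a replica node
192.168.20.6:7002> get name # if the value comes back, the cluster is basically working
-> Redirected to slot [5798] located at 192.168.20.5:7002
"ray"
High availability, that is, automatic failover between master and replica, is left for you to test. Normally, when a master goes down, its replica is promoted to master automatically; when the old master comes back, it rejoins as a replica of the new master.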
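The slot number in the redirect message is not arbitrary: Redis Cluster maps a key to CRC16(key) mod 16384, using the XMODEM variant of CRC16. The sketch below reimplements that calculation in plain shell so the slot of a key can be predicted offline. `key_slot` is a hypothetical helper, and it ignores hash tags (the `{...}` syntax), so it only matches Redis for keys without braces:

```shell
#!/bin/sh
# Hash slot of a key: CRC16 (polynomial 0x1021, initial value 0, no reflection)
# over the key bytes, modulo 16384. Hash tags are deliberately not handled.
key_slot() {
    crc=0
    for byte in $(printf '%s' "$1" | od -An -v -tu1); do
        crc=$(( crc ^ (byte << 8) ))
        i=0
        while [ "$i" -lt 8 ]; do
            if [ $(( crc & 0x8000 )) -ne 0 ]; then
                crc=$(( ((crc << 1) ^ 0x1021) & 0xFFFF ))
            else
                crc=$(( (crc << 1) & 0xFFFF ))
            fi
            i=$(( i + 1 ))
        done
    done
    echo $(( crc % 16384 ))
}

key_slot name   # the transcript above reports slot 5798 for this key
```

On a live node, `CLUSTER KEYSLOT name` returns the same number.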
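Restore the cluster password
$ ./redis_setpass.sh add # set the password with the script above
Rebuild the Redis cluster
To rebuild the cluster, delete the RDB and AOF persistence files (there is no AOF file if AOF is disabled) and the cluster configuration file on every node. In this post, the files to delete are:
$ tree /apps/usr/redis/data/ # everything to delete lives under the data directory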
/apps/usr/redis/data/
├── 7001
│ ├── appendonly_7001.aof
│ ├── dump_7001.rdb
│ └── nodes-7001.conf
└── 7002
├── appendonly_7002.aof
├── dump_7002.rdb
└── nodes-7002.conf
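# the simple way: delete everything under data, then recreate the per-port directories (7001/7002)
# once the data files are gone and the instances have been restarted, create the cluster again with the command below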
$ redis-cli --cluster create 192.168.20.5:7001 192.168.20.5:7002 192.168.20.6:7001 192.168.20.6:7002 192.168.20.10:7001 192.168.20.10:7002 --cluster-replicas 1
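The cleanup described above can be wrapped in a small function so that it is run the same way on every host. This is a sketch only: `reset_data_dirs` is a hypothetical helper, it is destructive by design, and it assumes the instances have been stopped beforehand. The demonstration works on a scratch directory rather than /apps/usr/redis/data:

```shell
#!/bin/sh
# Remove persisted state (RDB, AOF, nodes-*.conf) for the given ports and
# recreate empty per-port directories. DESTRUCTIVE: stop the instances first.
reset_data_dirs() {   # usage: reset_data_dirs DATA_DIR PORT...
    data_dir=$1; shift
    for port in "$@"; do
        rm -rf "${data_dir:?}/$port"   # :? aborts if DATA_DIR is empty
        mkdir -p "$data_dir/$port"
    done
}

# Demonstration on a throwaway directory.
demo=$(mktemp -d)
mkdir -p "$demo/7001"
touch "$demo/7001/dump_7001.rdb" "$demo/7001/nodes-7001.conf"
reset_data_dirs "$demo" 7001 7002
ls -A "$demo/7001" "$demo/7002"   # both directories exist and are empty
rm -rf "$demo"
```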
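Scale out the Redis cluster
When the existing cluster can no longer meet demand, scale it out as follows (newer and older Redis versions do this differently; if your cluster runs 3.x.x, or your redis-cli does not support the cluster scaling subcommands, see my 51cto blog post instead).
Deploy the new Redis node
First prepare the Redis instances that will join the cluster. My new host is 192.168.20.4, running two instances on ports 7001 and 7002.
$ mkdir /apps/usr/ -p # create the application directory on the new machine
# on an existing Redis node (192.168.20.10), copy the redis directory to the new machine
$ rsync -az /apps/usr/redis 192.168.20.4:/apps/usr/
# on the new machine, delete the data files, tune the system parameters, then start the Redis instances
$ rm -rf /apps/usr/redis/data/*
$ mkdir /apps/usr/redis/data/{7001,7002}
$ echo '* - nofile 65535' >> /etc/security/limits.conf
# log in again for the change to take effect, then check the value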
$ ulimit -n
65535
$ cat >> /etc/sysctl.d/redis.conf << EOF
net.core.somaxconn = 65535
vm.overcommit_memory = 1
EOF
$ sysctl -p /etc/sysctl.d/redis.conf # reload to apply
$ echo never > /sys/kernel/mm/transparent_hugepage/enabled
$ echo 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' >> /etc/rc.local
$ chmod +x /etc/rc.d/rc.local
# set the PATH environment variable
$ echo 'export PATH=$PATH:/apps/usr/redis/bin/' >> /etc/profile
$ source /etc/profile
$ redis-server -v # check the Redis version
Redis server v=5.0.5 sha=00000000:0 malloc=jemalloc-5.1.0 bits=64 build=37a632ba3989f893
Start the Redis instances
$ redis-server /apps/usr/redis/conf/redis_7001.conf
$ redis-server /apps/usr/redis/conf/redis_7002.conf
# confirm the ports are listening
$ ss -lnput | grep 700.
tcp LISTEN 0 128 *:17001 *:* users:(("redis-server",pid=21547,fd=9))
tcp LISTEN 0 128 *:17002 *:* users:(("redis-server",pid=21552,fd=9))
tcp LISTEN 0 128 *:7001 *:* users:(("redis-server",pid=21547,fd=6))
tcp LISTEN 0 128 *:7002 *:* users:(("redis-server",pid=21552,fd=6))
Remove the cluster password
# first add the new node's IP (and ports, if they differ) to the script
$ ./redis_setpass.sh del # redis_setpass.sh is the script shown earlier
Add the new nodes to the cluster
# run the following on any host that can reach the cluster (192.168.20.10:7001 is a node already in the cluster)
$ redis-cli --cluster add-node 192.168.20.4:7001 192.168.20.10:7001 # add instance 7001
$ redis-cli --cluster add-node 192.168.20.4:7002 192.168.20.10:7001 # add instance 7002
View the current node information
$ redis-cli -h 192.168.20.10 -p 7001 -c cluster nodes
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 myself,master - 0 1608476229000 5 connected 10923-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608476229212 1 connected 0-5460
87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001@17001 master - 0 1608476228207 7 connected
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608476231228 5 connected
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608476226000 3 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608476227000 3 connected 5461-10922
b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f 192.168.20.4:7002@17002 master - 0 1608476228000 0 connected
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608476230220 1 connected
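Both instances on 192.168.20.4 have joined the cluster (as masters), but neither has any slots yet.
Assign slots to the new master
$ redis-cli --cluster reshard 192.168.20.10:7001
The command then asks how many slots to move, which node receives them, and which nodes give them up.
The number of slots depends on how many masters the cluster will have afterwards, so if you plan to add several masters, the figure changes each time. With three existing masters, the fourth master should receive 16384 / 4 = 4096 slots; adding a fifth master later means moving 16384 / 5 = 3276 slots (rounded down).
When asked for the source nodes, there are two ways to answer: type all to take a share from every existing master, or enter the source node IDs one by one and finish with done. I recommend entering the nodes explicitly and ending with done, which avoids surprises: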
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 4096 # number of slots to move
What is the receiving node ID? 87751467d1f8e23154a316b263219481719102c8 # the new master, 192.168.20.4:7001
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
# the existing masters that give up slots
Source node #1: 25dc2b57f7587b96f5edb70f13669b412f7dd510
Source node #2: 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246
Source node #3: 9078a19def138244b8365ee352847d1a54c5385e
Source node #4: done # type done after the last source node
# confirmation
Ready to move 4096 slots.
Source nodes:
M: 25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001
slots:[10923-16383] (5461 slots) master
1 additional replica(s)
M: 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001
slots:[0-5460] (5461 slots) master
1 additional replica(s)
M: 9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001
slots:[5461-10922] (5462 slots) master
1 additional replica(s)
Destination node:
M: 87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001
slots: (0 slots) master
.............. # remaining output omitted
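The slot counts quoted before the transcript (4096 for a fourth master, 3276 for a fifth) come from dividing the 16384 slots by the number of masters after the addition. A tiny sketch of that arithmetic, with `slots_for_new_master` as a hypothetical helper name:

```shell
#!/bin/sh
# Slots to hand to a newly added master so that every master ends up with a
# near-equal share: 16384 divided by the master count AFTER the addition.
slots_for_new_master() {
    echo $(( 16384 / $1 ))
}

slots_for_new_master 4   # three existing masters plus one new master
slots_for_new_master 5   # adding a fifth master later
```

Shell integer division rounds down, which matches the 3276 figure used for the fifth master.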
Assign a replica to the new master
$ redis-cli -h 192.168.20.4 -p 7002 -c # connect to the instance that will become the replica
192.168.20.4:7002> cluster nodes # list the nodes and find the ID of the master that has no replica
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608477926031 1 connected
87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001@17001 master - 0 1608477926000 7 connected 0-1364 5461-6826 10923-12287
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608477927040 3 connected
b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f 192.168.20.4:7002@17002 myself,master - 0 1608477923000 0 connected
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608477928047 5 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608477925000 3 connected 6827-10922
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 master - 0 1608477924008 5 connected 12288-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608477925015 1 connected 1365-5460
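# note: I added only one machine, so this master and its replica share a host; there was no other choice here
# if you add several machines, place each replica on a different machine from its master
192.168.20.4:7002> CLUSTER REPLICATE 87751467d1f8e23154a316b263219481719102c8 # replicate the new master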
OK
192.168.20.4:7002> cluster nodes # confirm the masters and replicas look correct
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608478078000 1 connected
87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001@17001 master - 0 1608478078301 7 connected 0-1364 5461-6826 10923-12287
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608478077000 3 connected
b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f 192.168.20.4:7002@17002 myself,slave 87751467d1f8e23154a316b263219481719102c8 0 1608478077000 0 connected
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608478076279 5 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608478076000 3 connected 6827-10922
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 master - 0 1608478076000 5 connected 12288-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608478079310 1 connected 1365-5460
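Finding "the master that has no replica" by eye gets harder as the cluster grows. The following sketch picks it out of `cluster nodes` output, assuming the field layout shown in this post (slot ranges start at field 9); `masters_without_replica` is a hypothetical helper, and the sample lines are taken from the listing captured before CLUSTER REPLICATE was run:

```shell
#!/bin/sh
# Read `cluster nodes` output on stdin; print the ID and address of each master
# that owns slots (field 9 onward) but has no replica pointing at it.
masters_without_replica() {
    awk '
        $3 ~ /master/ && NF >= 9 { addr[$1] = $2 }
        $3 ~ /slave/             { covered[$4] = 1 }
        END { for (id in addr) if (!(id in covered)) print id, addr[id] }'
}

masters_without_replica <<'EOF'
87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001@17001 master - 0 1608477926000 7 connected 0-1364 5461-6826 10923-12287
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608477927040 3 connected
b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f 192.168.20.4:7002@17002 myself,master - 0 1608477923000 0 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608477925000 3 connected 6827-10922
EOF
```

The printed ID is the argument to pass to CLUSTER REPLICATE on the instance that should become the replica.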
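Restore the cluster password
$ ./redis_setpass.sh add # run the same script again
Scale in the Redis cluster
Suppose we now want to remove the two instances added above, 192.168.20.4:7001 and 192.168.20.4:7002, from the cluster. How is that done?
Remove the replica
Remove the replica first.
$ redis-cli -c -h 192.168.20.10 -p 7001 cluster nodes # find the node ID of the replica to remove
# here the replica's node ID is b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f
# remove it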
$ redis-cli --cluster del-node 192.168.20.10:7001 b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f
>>> Removing node b70b29a28a8ab8eb2ea43fc2258a9d02778fea7f from cluster 192.168.20.10:7001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
# the output shows the node was removed and the retired Redis instance was shut down
$ redis-cli -c -h 192.168.20.10 -p 7001 cluster nodes # list the nodes to confirm
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 myself,master - 0 1608478589000 5 connected 12288-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608478592084 1 connected 1365-5460
87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001@17001 master - 0 1608478593090 7 connected 0-1364 5461-6826 10923-12287
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608478590063 5 connected
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608478590000 3 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608478591000 3 connected 6827-10922
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608478592000 1 connected
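Remove the master
Next we remove the master added earlier, 192.168.20.4:7001. This takes more work because a master owns hash slots: its slots must first be moved to the remaining masters, and only then can the node be removed, otherwise data would be lost. It is best to spread the retiring master's slots evenly over the other masters, which means moving only part of them per reshard run, one run per remaining master:
$ redis-cli --cluster reshard 192.168.20.10:7001 # reassign the slots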
>>> Performing Cluster Check (using node 192.168.20.10:7001)
M: 25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001
slots:[12288-16383] (4096 slots) master
1 additional replica(s)
M: 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001
slots:[1365-5460] (4096 slots) master
1 additional replica(s)
M: 87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001
slots:[0-1364],[5461-6826],[10923-12287] (4096 slots) master
S: 6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002
slots: (0 slots) slave
replicates 25dc2b57f7587b96f5edb70f13669b412f7dd510
S: 0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002
slots: (0 slots) slave
replicates 9078a19def138244b8365ee352847d1a54c5385e
M: 9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001
slots:[6827-10922] (4096 slots) master
1 additional replica(s)
S: 0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002
slots: (0 slots) slave
replicates 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
How many slots do you want to move (from 1 to 16384)? 1365 # about one third of the 4096 slots being moved away
What is the receiving node ID? 25dc2b57f7587b96f5edb70f13669b412f7dd510
Please enter all the source node IDs.
Type 'all' to use all the nodes as source nodes for the hash slots.
Type 'done' once you entered all the source nodes IDs.
Source node #1: 87751467d1f8e23154a316b263219481719102c8
Source node #2: done
............. # part of the output omitted
Do you want to proceed with the proposed reshard plan (yes/no)? yes
Repeat the reshard in the same way until every slot has been moved off the master being retired.
$ redis-cli -c -h 192.168.20.10 -p 7001 cluster nodes # check the cluster again and confirm the retiring master owns no slots
25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001@17001 myself,master - 0 1608479658000 8 connected 0-1364 12288-16383
0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001@17001 master - 0 1608479667087 10 connected 1365-5460 6826 10923-12287
87751467d1f8e23154a316b263219481719102c8 192.168.20.4:7001@17001 master - 0 1608479665070 7 connected
6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002@17002 slave 25dc2b57f7587b96f5edb70f13669b412f7dd510 0 1608479666000 8 connected
0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002@17002 slave 9078a19def138244b8365ee352847d1a54c5385e 0 1608479666000 9 connected
9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001@17001 master - 0 1608479666079 9 connected 5461-6825 6827-10922
0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002@17002 slave 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 0 1608479668097 10 connected
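The advice to spread the retiring master's slots evenly over the remaining masters comes down to splitting one number into near-equal parts. A sketch of that split, with `drain_plan` as a hypothetical helper; each printed figure is what would be entered at one "How many slots do you want to move" prompt:

```shell
#!/bin/sh
# Split the slots of a master being removed into near-equal chunks, one per
# remaining master; the first (TOTAL % N) chunks get one extra slot.
drain_plan() {   # usage: drain_plan TOTAL_SLOTS REMAINING_MASTERS
    base=$(( $1 / $2 ))
    extra=$(( $1 % $2 ))
    i=0
    while [ "$i" -lt "$2" ]; do
        if [ "$i" -lt "$extra" ]; then
            echo $(( base + 1 ))
        else
            echo "$base"
        fi
        i=$(( i + 1 ))
    done
}

drain_plan 4096 3   # one line per reshard run
```

For the 4096 slots held by 192.168.20.4:7001 and three remaining masters, this gives one run of 1366 and two runs of 1365, which matches the final distribution in the listing above.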
# remove the node
$ redis-cli --cluster del-node 192.168.20.10:7001 87751467d1f8e23154a316b263219481719102c8
>>> Removing node 87751467d1f8e23154a316b263219481719102c8 from cluster 192.168.20.10:7001
>>> Sending CLUSTER FORGET messages to the cluster...
>>> SHUTDOWN the node.
Check the cluster status
$ redis-cli --cluster check 192.168.20.10:7001
192.168.20.10:7001 (25dc2b57...) -> 1 keys | 5461 slots | 1 slaves.
192.168.20.5:7001 (0bfafa75...) -> 0 keys | 5462 slots | 1 slaves.
192.168.20.6:7001 (9078a19d...) -> 1 keys | 5461 slots | 1 slaves.
[OK] 2 keys in 3 masters.
0.00 keys per slot on average.
>>> Performing Cluster Check (using node 192.168.20.10:7001)
M: 25dc2b57f7587b96f5edb70f13669b412f7dd510 192.168.20.10:7001
slots:[0-1364],[12288-16383] (5461 slots) master
1 additional replica(s)
M: 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246 192.168.20.5:7001
slots:[1365-5460],[6826],[10923-12287] (5462 slots) master
1 additional replica(s)
S: 6affa9bdd5f81bdfa145fff32e3f01a9427d9dbd 192.168.20.5:7002
slots: (0 slots) slave
replicates 25dc2b57f7587b96f5edb70f13669b412f7dd510
S: 0d322826670120e3a6071d6d50ea31c74fc818ec 192.168.20.10:7002
slots: (0 slots) slave
replicates 9078a19def138244b8365ee352847d1a54c5385e
M: 9078a19def138244b8365ee352847d1a54c5385e 192.168.20.6:7001
slots:[5461-6825],[6827-10922] (5461 slots) master
1 additional replica(s)
S: 0fce7eada914c10a058fd554170329bf820c4a40 192.168.20.6:7002
slots: (0 slots) slave
replicates 0bfafa75ea8bf6678e0cef0bdd26c0fa8a6f2246
[OK] All nodes agree about slots configuration.
>>> Check for open slots...
>>> Check slots coverage...
[OK] All 16384 slots covered.
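As a quick sanity check after scaling in, the per-master slot counts in the summary lines should add up to 16384. The sketch below sums them; `total_slots` is a hypothetical helper that only looks at the "-> N keys | N slots" summary lines and ignores the rest of the check output:

```shell
#!/bin/sh
# Sum the "N slots" figures from the per-master summary lines printed by
# `redis-cli --cluster check`; a fully covered cluster totals 16384.
total_slots() {
    awk '/-> [0-9]+ keys/ {
             for (i = 2; i <= NF; i++) if ($i == "slots") sum += $(i - 1)
         }
         END { print sum + 0 }'
}

total_slots <<'EOF'
192.168.20.10:7001 (25dc2b57...) -> 1 keys | 5461 slots | 1 slaves.
192.168.20.5:7001 (0bfafa75...) -> 0 keys | 5462 slots | 1 slaves.
192.168.20.6:7001 (9078a19d...) -> 1 keys | 5461 slots | 1 slaves.
EOF
```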
Restore the cluster password
$ ./redis_setpass.sh add # run the same script again