Mysql集群的部署方式有很多, 典型的方式如:
MMM
Master-Master replication Manager
google, 2015, 不维护
1 manager + n agent
1写n读
MHA
Master High Availability
facebook工程师
美团方案, 主推
InnoDB Cluster + LVS
官方, 2016年底
…
没有做复杂的评估和选型, 先尝试一下官方的再说~
InnoDB Cluster
借用官方的图:
可见以下组成:
MySql Router: 请求路由和负载均衡
- 应用服务器共同部署
MySql Shell: 管理和配置
管理至少3台实例
1 router
1 primary
1 secondary
AdminAPI
- dba变量
MySql Servers: 服务器集群
Group Replication
Primary Instance: 主, 读/写
Secondary Instance: 从, 只读
MySql Servers
模式:
沙盒模式
生产模式
部署需求:
组复制
基础设施
InnoDB 存储引擎
主键
ipv4网络
低延迟, 高带宽
实例配置
—log-bin=ushare-bin
—log-slave-updates
—binlog-format=row
—enforce-gtid-consistency=ON
—gtid-mode=ON
—master-info-repository=TABLE
—relay-log-info-repository=TABLE
—transaction-write-set- extraction=XXHASH64
—binlog-checksum=NONE
—slave-parallel-workers=N
—slave-preserve-commit-order=1
—slave-parallel-type=LOGICAL_CLOCK
Python 2.7
生产部署
0. 配置权限和访问端口
bind_address=0.0.0.0
grant ALL on . to ‘root’@’%’ identified by ‘root’ with grant option;
1. 配置hostname
保证每个mysql可以通过hostname被发现, 可以通过语句查询当前hostname
SELECT coalesce(@@report_host, @@hostname);
修改hosts文件, 添加对应的hostname和ip
2. 配置日志级别
设置详细的日志等级, 以便发现配置问题
mysqlsh --log-level=DEBUG3
设置AdminAPI输出级别
mysql-js> dba.verbose=2
3. 检查配置状态/配置实例
执行命令检查配置状态
mysql-js> dba.checkInstanceConfiguration('root@ushare-dev-mysql01:3306')
mysql-js> dba.configureLocalInstance('root@ushare-dev-mysql01:3306')
4. 创建集群
4.1 连接实例
mysql-js> shell.connect('root@ushare-dev-mysql01:3306')
4.2 创建集群
mysql-js> var cluster = dba.createCluster('ushareCluster')
4.2’ 获取现有集群
mysql-js> var cluster = dba.getCluster()
4.3 查看集群状态
mysql-js> cluster.status()
4.4 添加其它实例
mysql-js> cluster.addInstance('root@ushare-dev-mysql02:3306')
Mysql Router
安装
sudo yum install mysql-router -y
生成配置
sudo mysqlrouter --bootstrap root@ushare-dev-mysql01:3306 --user=root
自动生成的配置文件(/etc/mysqlrouter/mysqlrouter.conf)如下:
# File automatically generated during MySQL Router bootstrap
[DEFAULT]
name=system
user=root
keyring_path=/var/lib/mysqlrouter/keyring
master_key_path=/etc/mysqlrouter/mysqlrouter.key
connect_timeout=30
read_timeout=30
[logger]
level = INFO
[metadata_cache:ushareCluster]
router_id=2
bootstrap_server_addresses=mysql://ushare-dev-mysql01:3306,mysql://ushare-dev-mysql02:3306,mysql://ushare-dev-mysql03:3306
user=mysql_router2_tabpdv6ijm6z
metadata_cluster=ushareCluster
ttl=0.5
[routing:ushareCluster_default_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://ushareCluster/default?role=PRIMARY
routing_strategy=round-robin
protocol=classic
[routing:ushareCluster_default_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://ushareCluster/default?role=SECONDARY
routing_strategy=round-robin
protocol=classic
[routing:ushareCluster_default_x_rw]
bind_address=0.0.0.0
bind_port=64460
destinations=metadata-cache://ushareCluster/default?role=PRIMARY
routing_strategy=round-robin
protocol=x
[routing:ushareCluster_default_x_ro]
bind_address=0.0.0.0
bind_port=64470
destinations=metadata-cache://ushareCluster/default?role=SECONDARY
routing_strategy=round-robin
protocol=x
关键信息如下:
innodb cluster的集群机器:
- bootstrap_server_addresses=mysql://ushare-dev-mysql01:3306,mysql://ushare-dev-mysql02:3306,mysql://ushare-dev-mysql03:3306
普通协议
读写端口: 6446
只读端口: 6447
X协议(支持doc存储)
读写端口: 64460
只读端口: 64470
调度策略
- routing_strategy=round-robin
服务配置
修改配置文件, 更改执行User为root, Group为root
sudo vim /usr/lib/systemd/system/mysqlrouter.service
更新, 启动, 设为有效
sudo systemctl daemon-reload
sudo systemctl start mysqlrouter
sudo systemctl enable mysqlrouter
服务启动后, 可以直接通过端口访问数据库集群进行测试.
至此, mysql集群配置完毕
踩坑:
1 Table ‘performance_schema.session_variables’ doesn’t exist
mysql_upgrade -uroot -p
systemctl restart mysqld
2 Dba.createCluster: ERROR: Error starting cluster: ‘ushare-dev-mysql01:3306’ - Query failed. MySQL Error (3092): ClassicSession.query: The server is not configured properly to be an active member of the group. Please see more details
出错原因是没有关SELINUX
sudo vim /etc/selinux/config
- SELINUX=disabled
sudo reboot
3 Cluster.addInstance: Cannot add an instance with the same server UUID
出错原因是复制的镜像, UUID重复
sudo rm /var/lib/mysql/auto.cfg
sudo systemctl restart mysqld
4 WARNING: On instance membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the
警告原因, 手动设置的变量不会持久化到配置文件中
- sudo vim /etc/my.cnf
5 断开后, 无法rejoin, 执行 reset master 后, 加入后为RECOVERING状态, 一段时间后变为MISSING状态
原因很多, 排查和恢复的步骤如下:
my.cnf里设置, 开启error日志: log_error=/var/log/mysqld.log
重启后, 发现启动报错, 经分析, 这个是正常状态(cluster不存在), 可以暂时忽略:
[ERROR] Plugin group_replication reported: ‘The group name option is mandatory’
[ERROR] Plugin group_replication reported: ‘Unable to start Group Replication on boot’
rejoin超时, 看日志, 这个属于数据库比集群提交超前, 执行 reset master;
- [ERROR] Plugin group_replication reported: ‘Group contains 3 members which is greater than group_replication_auto_increment_increment value of 1. This can lead to an higher rate of transactional aborts.’
再次rejoin, 进入RECOVERING状态, 日志显示是数据库创建失败
- [ERROR] Slave SQL for channel ‘group_replication_recovery’: Error ‘Can’t create database ‘mysql_innodb_cluster_metadata’; database exists’ on query. Default database: ‘mysql_innodb_cluster_metadata’. Query: ‘CREATE DATABASE mysql_innodb_cluster_metadata’, Error_code: 1007
将 mysql_innodb_cluster_metadata drop掉, 再加入, 状态恢复到ONLINE
常用命令工具
mysqlsh
shell
- Undefined connect (ConnectionData connectionData, String password)
dba
Cluster createCluster (String name, Dictionary options)
Undefined deleteSandboxInstance (Integer port, Dictionary options
Instance deploySandboxInstance (Integer port, Dictionary options)
Undefined dropMetadataSchema (Dictionary options)
Cluster getCluster (String name)
Undefined killSandboxInstance (Integer port, Dictionary options)
Undefined resetSession (Session session)
Undefined startSandboxInstance (Integer port, Dictionary options)
Undefined stopSandboxInstance (Integer port, Dictionary options)
Undefined checkInstanceConfiguration (InstanceDef instance, Dictionary options)
Instance configureLocalInstance (InstanceDef instance, Dictionary options)
Undefined rebootClusterFromCompleteOutage (String clusterName, Dictionary options)
cluster
String getName ()
Undefined addInstance (InstanceDef instance, Dictionary options)
Undefined rejoinInstance (InstanceDef instance, Dictionary options)
Undefined removeInstance (InstanceDef instance, Dictionary options)
Dictionary checkInstanceState (InstanceDef instance, String password)
String describe ()
String status ()
Undefined dissolve (Dictionary options)
Undefined rescan ()
Undefined forceQuorumUsingPartitionOf (InstanceDef instance, String password)
Group Replication
要点:
使用场景
提供
- 高可用
不提供
failed over
负载均衡
路由
技术概述
模式
single primary: 单实例可写, 其它实例只读
- 出现故障, 选举产生新的primary
multi primary: 多实例可写
服务
故障发现
group成员关系维护
错误冗余
Single primary VS Multi primary
技术概述:
传统方式:
一主多从
异步复制(或者半同步复制)
无分片(所有实例保存全量数据)
异步复制
半同步复制
组复制方式(Group Replication):
多主
全同步提交
certification: 为了避免写冲突, 在提交前先做校验, 遵循 first commit wins 法则
无分片(所有实例保存全量数据)
组复制