Mysql集群的部署方式有很多, 典型的方式如:

  1. MMM

    1. Master-Master replication Manager

    2. google, 2015, 不维护

    3. 1 manager + n agent

    4. 1写n读

  2. MHA

    1. Master High Availability

    2. facebook工程师

    3. 美团方案, 主推

  3. InnoDB Cluster + LVS

    1. 官方, 2016年底

没有做复杂的评估和选型, 先尝试一下官方的再说~

InnoDB Cluster

借用官方的图:
Group Replication + InnoDB Cluster - 图1

可见以下组成:

  • MySql Router: 请求路由和负载均衡

    • 应用服务器共同部署
  • MySql Shell: 管理和配置

    • 管理至少3台实例

      • 1 router

      • 1 primary

      • 1 secondary

    • AdminAPI

      • dba变量
  • MySql Servers: 服务器集群

    • Group Replication

      • Primary Instance: 主, 读/写

      • Secondary Instance: 从, 只读

MySql Servers

模式:

  • 沙盒模式

  • 生产模式

部署需求:

  • 组复制

    • 基础设施

      • InnoDB 存储引擎

      • 主键

      • ipv4网络

      • 低延迟, 高带宽

    • 实例配置

      • —log-bin=ushare-bin

      • —log-slave-updates

      • —binlog-format=row

      • —enforce-gtid-consistency=ON

      • —gtid-mode=ON

      • —master-info-repository=TABLE

      • —relay-log-info-repository=TABLE

      • —transaction-write-set- extraction=XXHASH64

      • —binlog-checksum=NONE

      • —slave-parallel-workers=N

      • —slave-preserve-commit-order=1

      • —slave-parallel-type=LOGICAL_CLOCK

  • Python 2.7

生产部署

0. 配置权限和访问端口
  • bind_address=0.0.0.0

  • grant ALL on . to ‘root’@’%’ identified by ‘root’ with grant option;

1. 配置hostname

保证每个mysql可以通过hostname被发现, 可以通过语句查询当前hostname

  1. SELECT coalesce(@@report_host, @@hostname);

修改hosts文件, 添加对应的hostname和ip

2. 配置日志级别

设置详细的日志等级, 以便发现配置问题

  1. mysqlsh --log-level=DEBUG3

设置AdminAPI输出级别

  1. mysql-js> dba.verbose=2

3. 检查配置状态/配置实例

执行命令检查配置状态

mysql-js> dba.checkInstanceConfiguration('root@ushare-dev-mysql01:3306')
mysql-js> dba.configureLocalInstance('root@ushare-dev-mysql01:3306')

4. 创建集群

4.1 连接实例

mysql-js> shell.connect('root@ushare-dev-mysql01:3306')

4.2 创建集群

mysql-js> var cluster = dba.createCluster('ushareCluster')

4.2’ 获取现有集群

mysql-js> var cluster = dba.getCluster()

4.3 查看集群状态

mysql-js> cluster.status()

4.4 添加其它实例

mysql-js> cluster.addInstance('root@ushare-dev-mysql02:3306')

Mysql Router

启动后, 可以让mysql集群对应用透明.

安装

sudo yum install mysql-router -y

生成配置

sudo mysqlrouter --bootstrap root@ushare-dev-mysql01:3306 --user=root

自动生成的配置文件(/etc/mysqlrouter/mysqlrouter.conf)如下:

# File automatically generated during MySQL Router bootstrap
[DEFAULT]
name=system
user=root
keyring_path=/var/lib/mysqlrouter/keyring
master_key_path=/etc/mysqlrouter/mysqlrouter.key
connect_timeout=30
read_timeout=30

[logger]
level = INFO

[metadata_cache:ushareCluster]
router_id=2
bootstrap_server_addresses=mysql://ushare-dev-mysql01:3306,mysql://ushare-dev-mysql02:3306,mysql://ushare-dev-mysql03:3306
user=mysql_router2_tabpdv6ijm6z
metadata_cluster=ushareCluster
ttl=0.5

[routing:ushareCluster_default_rw]
bind_address=0.0.0.0
bind_port=6446
destinations=metadata-cache://ushareCluster/default?role=PRIMARY
routing_strategy=round-robin
protocol=classic

[routing:ushareCluster_default_ro]
bind_address=0.0.0.0
bind_port=6447
destinations=metadata-cache://ushareCluster/default?role=SECONDARY
routing_strategy=round-robin
protocol=classic

[routing:ushareCluster_default_x_rw]
bind_address=0.0.0.0
bind_port=64460
destinations=metadata-cache://ushareCluster/default?role=PRIMARY
routing_strategy=round-robin
protocol=x

[routing:ushareCluster_default_x_ro]
bind_address=0.0.0.0
bind_port=64470
destinations=metadata-cache://ushareCluster/default?role=SECONDARY
routing_strategy=round-robin
protocol=x

关键信息如下:

  • innodb cluster的集群机器:

    • bootstrap_server_addresses=mysql://ushare-dev-mysql01:3306,mysql://ushare-dev-mysql02:3306,mysql://ushare-dev-mysql03:3306
  • 普通协议

    • 读写端口: 6446

    • 只读端口: 6447

  • X协议(支持doc存储)

    • 读写端口: 64460

    • 只读端口: 64470

  • 调度策略

    • routing_strategy=round-robin

服务配置

修改配置文件, 更改执行User为root, Group为root

sudo vim /usr/lib/systemd/system/mysqlrouter.service

更新, 启动, 设为有效

sudo systemctl daemon-reload
sudo systemctl start mysqlrouter
sudo systemctl enable mysqlrouter

服务启动后, 可以直接通过端口访问数据库集群进行测试.

至此, mysql集群配置完毕

踩坑:

1 Table ‘performance_schema.session_variables’ doesn’t exist

  • mysql_upgrade -uroot -p

  • systemctl restart mysqld

2 Dba.createCluster: ERROR: Error starting cluster: ‘ushare-dev-mysql01:3306’ - Query failed. MySQL Error (3092): ClassicSession.query: The server is not configured properly to be an active member of the group. Please see more details
出错原因是没有关SELINUX

  • sudo vim /etc/selinux/config

    • SELINUX=disabled
  • sudo reboot

3 Cluster.addInstance: Cannot add an instance with the same server UUID
出错原因是复制的镜像, UUID重复

  • sudo rm /var/lib/mysql/auto.cfg

  • sudo systemctl restart mysqld

4 WARNING: On instance membership change cannot be persisted since MySQL version 5.7.23 does not support the SET PERSIST command (MySQL version >= 8.0.11 required). Please use the .configureLocalInstance command locally to persist the changes.
警告原因, 手动设置的变量不会持久化到配置文件中

  • sudo vim /etc/my.cnf

5 断开后, 无法rejoin, 执行 reset master 后, 加入后为RECOVERING状态, 一段时间后变为MISSING状态
原因很多, 排查和恢复的步骤如下:

  • my.cnf里设置, 开启error日志: log_error=/var/log/mysqld.log

  • 重启后, 发现启动报错, 经分析, 这个是正常状态(cluster不存在), 可以暂时忽略:

    • [ERROR] Plugin group_replication reported: ‘The group name option is mandatory’

    • [ERROR] Plugin group_replication reported: ‘Unable to start Group Replication on boot’

  • rejoin超时, 看日志, 这个属于数据库比集群提交超前, 执行 reset master;

    • [ERROR] Plugin group_replication reported: ‘Group contains 3 members which is greater than group_replication_auto_increment_increment value of 1. This can lead to an higher rate of transactional aborts.’
  • 再次rejoin, 进入RECOVERING状态, 日志显示是数据库创建失败

    • [ERROR] Slave SQL for channel ‘group_replication_recovery’: Error ‘Can’t create database ‘mysql_innodb_cluster_metadata’; database exists’ on query. Default database: ‘mysql_innodb_cluster_metadata’. Query: ‘CREATE DATABASE mysql_innodb_cluster_metadata’, Error_code: 1007
  • 将 mysql_innodb_cluster_metadata drop掉, 再加入, 状态恢复到ONLINE

常用命令工具

  • mysqlsh

    • shell

      • Undefined connect (ConnectionData connectionData, String password)
    • dba

      • Cluster createCluster (String name, Dictionary options)

      • Undefined deleteSandboxInstance (Integer port, Dictionary options

      • Instance deploySandboxInstance (Integer port, Dictionary options)

      • Undefined dropMetadataSchema (Dictionary options)

      • Cluster getCluster (String name)

      • Undefined killSandboxInstance (Integer port, Dictionary options)

      • Undefined resetSession (Session session)

      • Undefined startSandboxInstance (Integer port, Dictionary options)

      • Undefined stopSandboxInstance (Integer port, Dictionary options)

      • Undefined checkInstanceConfiguration (InstanceDef instance, Dictionary options)

      • Instance configureLocalInstance (InstanceDef instance, Dictionary options)

      • Undefined rebootClusterFromCompleteOutage (String clusterName, Dictionary options)

    • cluster

      • String getName ()

      • Undefined addInstance (InstanceDef instance, Dictionary options)

      • Undefined rejoinInstance (InstanceDef instance, Dictionary options)

      • Undefined removeInstance (InstanceDef instance, Dictionary options)

      • Dictionary checkInstanceState (InstanceDef instance, String password)

      • String describe ()

      • String status ()

      • Undefined dissolve (Dictionary options)

      • Undefined rescan ()

      • Undefined forceQuorumUsingPartitionOf (InstanceDef instance, String password)

Group Replication

要点:

  • 使用场景

    • 提供

      • 高可用
    • 不提供

      • failed over

      • 负载均衡

      • 路由

  • 技术概述

    • 模式

      • single primary: 单实例可写, 其它实例只读

        • 出现故障, 选举产生新的primary
      • multi primary: 多实例可写

    • 服务

      • 故障发现

      • group成员关系维护

      • 错误冗余

Group Replication + InnoDB Cluster - 图2

Single primary VS Multi primary

Group Replication + InnoDB Cluster - 图3

技术概述:

传统方式:

  • 一主多从

  • 异步复制(或者半同步复制)

  • 无分片(所有实例保存全量数据)

Group Replication + InnoDB Cluster - 图4
异步复制
Group Replication + InnoDB Cluster - 图5
半同步复制

组复制方式(Group Replication):

  • 多主

  • 全同步提交

  • certification: 为了避免写冲突, 在提交前先做校验, 遵循 first commit wins 法则

  • 无分片(所有实例保存全量数据)

Group Replication + InnoDB Cluster - 图6
组复制