MHA搭建手册

类别 版本
OS CentOS Linux release 7.1.1503 (Core)
MySQL 5.7.21
GTID 开启
MHA 0.56

一、搭建主从

这里不多说了,比较简单。

二、配置ssh信任

在slave节点上 ssh-keygen生成key,一路回车就可以了。

  1. cd ~/.ssh/
  2. cat id_rsa.pub > authorized_keys
  3. chmod 600 *
  4. cd ~
  5. scp -r .ssh 192.168.50.3:~/

三、安装依赖

添加eperl源

  1. wget -O /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

安装依赖

  1. yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch
  2. yum install perl-Parallel-ForkManager

四、两个节点安装node、manager

  1. rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
  2. rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

五、MySQL初始化

  1. grant all privileges on *.* to 'root'@'%' identified by '123456';
  2. grant replication slave on *.* ro 'repl'@'%' identified by '123456';
  3. flush privileges;

六、修改配置文件

相关文件拷贝到/etc/masterha下

  1. cp -r masterha /etc/

全局级配置文件:

  1. vim /etc/masterha/masterha_default.conf
  1. [server default]
  2. #MySQL的用户和密码
  3. user=mha
  4. password=123456
  5. #系统ssh用户
  6. ssh_user=root
  7. #复制用户
  8. repl_user=repl
  9. repl_password= 123456
  10. #监控
  11. ping_interval=1
  12. #shutdown_script=""
  13. #切换调用的脚本
  14. master_ip_failover_script= /etc/masterha/master_ip_failover
  15. master_ip_online_change_script= /etc/masterha/master_ip_online_change

说明:shutdown_script主要用来设置master进行切换时,要执行的脚本动作,这个动作可以设置吧机器关了来防止脑裂,也可以做一些其他动作(前提是那台机器还活着)
下面贴一下两个切换用到的脚本:
master_ip_failover

  1. #!/usr/bin/env perl
  2. # Copyright (C) 2011 DeNA Co.,Ltd.
  3. #
  4. # This program is free software; you can redistribute it and/or modify
  5. # it under the terms of the GNU General Public License as published by
  6. # the Free Software Foundation; either version 2 of the License, or
  7. # (at your option) any later version.
  8. #
  9. # This program is distributed in the hope that it will be useful,
  10. # but WITHOUT ANY WARRANTY; without even the implied warranty of
  11. # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  12. # GNU General Public License for more details.
  13. #
  14. # You should have received a copy of the GNU General Public License
  15. # along with this program; if not, write to the Free Software
  16. # Foundation, Inc.,
  17. # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  18. ## Note: This is a sample script and is not complete. Modify the script based on your environment.
  19. use strict;
  20. use warnings FATAL => 'all';
  21. use Getopt::Long;
  22. use MHA::DBHelper;
  23. #自定义该组机器的vip
  24. my $vip = "192.168.1.100";
  25. my $if = "eth0";
  26. my (
  27. $command, $ssh_user, $orig_master_host,
  28. $orig_master_ip, $orig_master_port, $new_master_host,
  29. $new_master_ip, $new_master_port, $new_master_user,
  30. $new_master_password
  31. );
  32. GetOptions(
  33. 'command=s' => \$command,
  34. 'ssh_user=s' => \$ssh_user,
  35. 'orig_master_host=s' => \$orig_master_host,
  36. 'orig_master_ip=s' => \$orig_master_ip,
  37. 'orig_master_port=i' => \$orig_master_port,
  38. 'new_master_host=s' => \$new_master_host,
  39. 'new_master_ip=s' => \$new_master_ip,
  40. 'new_master_port=i' => \$new_master_port,
  41. 'new_master_user=s' => \$new_master_user,
  42. 'new_master_password=s' => \$new_master_password,
  43. );
  44. sub add_vip {
  45. my $output1 = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/32 dev $if`;
  46. my $output2 = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/32 dev $if`;
  47. }
  48. exit &main();
  49. sub main {
  50. if ( $command eq "stop" || $command eq "stopssh" ) {
  51. # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
  52. # If you manage master ip address at global catalog database,
  53. # invalidate orig_master_ip here.
  54. my $exit_code = 1;
  55. eval {
  56. # updating global catalog, etc
  57. $exit_code = 0;
  58. };
  59. if ($@) {
  60. warn "Got Error: $@\n";
  61. exit $exit_code;
  62. }
  63. exit $exit_code;
  64. }
  65. elsif ( $command eq "start" ) {
  66. # all arguments are passed.
  67. # If you manage master ip address at global catalog database,
  68. # activate new_master_ip here.
  69. # You can also grant write access (create user, set read_only=0, etc) here.
  70. my $exit_code = 10;
  71. eval {
  72. my $new_master_handler = new MHA::DBHelper();
  73. # args: hostname, port, user, password, raise_error_or_not
  74. $new_master_handler->connect( $new_master_ip, $new_master_port,
  75. $new_master_user, $new_master_password, 1 );
  76. ## Set read_only=0 on the new master
  77. $new_master_handler->disable_log_bin_local();
  78. print "Set read_only=0 on the new master.\n";
  79. $new_master_handler->disable_read_only();
  80. ## Creating an app user on the new master
  81. #print "Creating app user on the new master..\n";
  82. #FIXME_xxx_create_user( $new_master_handler->{dbh} );
  83. $new_master_handler->enable_log_bin_local();
  84. $new_master_handler->disconnect();
  85. ## Update master ip on the catalog database, etc
  86. &add_vip();
  87. $exit_code = 0;
  88. };
  89. if ($@) {
  90. warn $@;
  91. # If you want to continue failover, exit 10.
  92. exit $exit_code;
  93. }
  94. exit $exit_code;
  95. }
  96. elsif ( $command eq "status" ) {
  97. # do nothing
  98. exit 0;
  99. }
  100. else {
  101. &usage();
  102. exit 1;
  103. }
  104. }
  105. sub usage {
  106. print
  107. "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  108. }

master_ip_online_change:

  1. #!/usr/bin/env perl
  2. # Copyright (C) 2011 DeNA Co.,Ltd.
  3. #
  4. # This program is free software; you can redistribute it and/or modify
  5. # it under the terms of the GNU General Public License as published by
  6. # the Free Software Foundation; either version 2 of the License, or
  7. # (at your option) any later version.
  8. #
  9. # This program is distributed in the hope that it will be useful,
  10. # but WITHOUT ANY WARRANTY; without even the implied warranty of
  11. # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
  12. # GNU General Public License for more details.
  13. #
  14. # You should have received a copy of the GNU General Public License
  15. # along with this program; if not, write to the Free Software
  16. # Foundation, Inc.,
  17. # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  18. ## Note: This is a sample script and is not complete. Modify the script based on your environment.
  19. use strict;
  20. use warnings FATAL => 'all';
  21. use Getopt::Long;
  22. use MHA::DBHelper;
  23. use MHA::NodeUtil;
  24. use Time::HiRes qw( sleep gettimeofday tv_interval );
  25. use Data::Dumper;
  26. my $_tstart;
  27. my $_running_interval = 0.1;
  28. #添加vip定义
  29. my $vip = "192.168.1.100";
  30. my $if = "eth0";
  31. my (
  32. $command, $orig_master_is_new_slave, $orig_master_host,
  33. $orig_master_ip, $orig_master_port, $orig_master_user,
  34. $orig_master_password, $orig_master_ssh_user, $new_master_host,
  35. $new_master_ip, $new_master_port, $new_master_user,
  36. $new_master_password, $new_master_ssh_user,
  37. );
  38. GetOptions(
  39. 'command=s' => \$command,
  40. 'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  41. 'orig_master_host=s' => \$orig_master_host,
  42. 'orig_master_ip=s' => \$orig_master_ip,
  43. 'orig_master_port=i' => \$orig_master_port,
  44. 'orig_master_user=s' => \$orig_master_user,
  45. 'orig_master_password=s' => \$orig_master_password,
  46. 'orig_master_ssh_user=s' => \$orig_master_ssh_user,
  47. 'new_master_host=s' => \$new_master_host,
  48. 'new_master_ip=s' => \$new_master_ip,
  49. 'new_master_port=i' => \$new_master_port,
  50. 'new_master_user=s' => \$new_master_user,
  51. 'new_master_password=s' => \$new_master_password,
  52. 'new_master_ssh_user=s' => \$new_master_ssh_user,
  53. );
  54. exit &main();
  55. sub drop_vip {
  56. my $output = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $orig_master_host /sbin/ip addr del $vip/32 dev $if`;
  57. #mysql里的连接全部干掉
  58. #FIXME
  59. }
  60. sub add_vip {
  61. my $output = `ssh -o ConnectTimeout=15 -o ConnectionAttempts=3 $new_master_host /sbin/ip addr add $vip/32 dev $if`;
  62. }
  63. sub current_time_us {
  64. my ( $sec, $microsec ) = gettimeofday();
  65. my $curdate = localtime($sec);
  66. return $curdate . " " . sprintf( "%06d", $microsec );
  67. }
  68. sub sleep_until {
  69. my $elapsed = tv_interval($_tstart);
  70. if ( $_running_interval > $elapsed ) {
  71. sleep( $_running_interval - $elapsed );
  72. }
  73. }
  74. sub get_threads_util {
  75. my $dbh = shift;
  76. my $my_connection_id = shift;
  77. my $running_time_threshold = shift;
  78. my $type = shift;
  79. $running_time_threshold = 0 unless ($running_time_threshold);
  80. $type = 0 unless ($type);
  81. my @threads;
  82. my $sth = $dbh->prepare("SHOW PROCESSLIST");
  83. $sth->execute();
  84. while ( my $ref = $sth->fetchrow_hashref() ) {
  85. my $id = $ref->{Id};
  86. my $user = $ref->{User};
  87. my $host = $ref->{Host};
  88. my $command = $ref->{Command};
  89. my $state = $ref->{State};
  90. my $query_time = $ref->{Time};
  91. my $info = $ref->{Info};
  92. $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
  93. next if ( $my_connection_id == $id );
  94. next if ( defined($query_time) && $query_time < $running_time_threshold );
  95. next if ( defined($command) && $command eq "Binlog Dump" );
  96. next if ( defined($user) && $user eq "system user" );
  97. next
  98. if ( defined($command)
  99. && $command eq "Sleep"
  100. && defined($query_time)
  101. && $query_time >= 1 );
  102. if ( $type >= 1 ) {
  103. next if ( defined($command) && $command eq "Sleep" );
  104. next if ( defined($command) && $command eq "Connect" );
  105. }
  106. if ( $type >= 2 ) {
  107. next if ( defined($info) && $info =~ m/^select/i );
  108. next if ( defined($info) && $info =~ m/^show/i );
  109. }
  110. push @threads, $ref;
  111. }
  112. return @threads;
  113. }
  114. sub main {
  115. if ( $command eq "stop" ) {
  116. ## Gracefully killing connections on the current master
  117. # 1. Set read_only= 1 on the new master
  118. # 2. DROP USER so that no app user can establish new connections
  119. # 3. Set read_only= 1 on the current master
  120. # 4. Kill current queries
  121. # * Any database access failure will result in script die.
  122. my $exit_code = 1;
  123. eval {
  124. ## Setting read_only=1 on the new master (to avoid accident)
  125. my $new_master_handler = new MHA::DBHelper();
  126. # args: hostname, port, user, password, raise_error(die_on_error)_or_not
  127. $new_master_handler->connect( $new_master_ip, $new_master_port,
  128. $new_master_user, $new_master_password, 1 );
  129. print current_time_us() . " Set read_only on the new master.. ";
  130. $new_master_handler->enable_read_only();
  131. if ( $new_master_handler->is_read_only() ) {
  132. print "ok.\n";
  133. }
  134. else {
  135. die "Failed!\n";
  136. }
  137. $new_master_handler->disconnect();
  138. # Connecting to the orig master, die if any database error happens
  139. my $orig_master_handler = new MHA::DBHelper();
  140. $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
  141. $orig_master_user, $orig_master_password, 1 );
  142. ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand
  143. $orig_master_handler->disable_log_bin_local();
  144. # print current_time_us() . " Drpping app user on the orig master..\n";
  145. print current_time_us() . " drop vip $vip..\n";
  146. #drop_app_user($orig_master_handler);
  147. &drop_vip();
  148. ## Waiting for N * 100 milliseconds so that current connections can exit
  149. my $time_until_read_only = 15;
  150. $_tstart = [gettimeofday];
  151. my @threads = get_threads_util( $orig_master_handler->{dbh},
  152. $orig_master_handler->{connection_id} );
  153. while ( $time_until_read_only > 0 && $#threads >= 0 ) {
  154. if ( $time_until_read_only % 5 == 0 ) {
  155. printf
  156. "%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
  157. current_time_us(), $#threads + 1, $time_until_read_only * 100;
  158. if ( $#threads < 5 ) {
  159. print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
  160. foreach (@threads);
  161. }
  162. }
  163. sleep_until();
  164. $_tstart = [gettimeofday];
  165. $time_until_read_only--;
  166. @threads = get_threads_util( $orig_master_handler->{dbh},
  167. $orig_master_handler->{connection_id} );
  168. }
  169. ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
  170. print current_time_us() . " Set read_only=1 on the orig master.. ";
  171. $orig_master_handler->enable_read_only();
  172. if ( $orig_master_handler->is_read_only() ) {
  173. print "ok.\n";
  174. }
  175. else {
  176. die "Failed!\n";
  177. }
  178. ## Waiting for M * 100 milliseconds so that current update queries can complete
  179. my $time_until_kill_threads = 5;
  180. @threads = get_threads_util( $orig_master_handler->{dbh},
  181. $orig_master_handler->{connection_id} );
  182. while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
  183. if ( $time_until_kill_threads % 5 == 0 ) {
  184. printf
  185. "%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
  186. current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
  187. if ( $#threads < 5 ) {
  188. print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
  189. foreach (@threads);
  190. }
  191. }
  192. sleep_until();
  193. $_tstart = [gettimeofday];
  194. $time_until_kill_threads--;
  195. @threads = get_threads_util( $orig_master_handler->{dbh},
  196. $orig_master_handler->{connection_id} );
  197. }
  198. ## Terminating all threads
  199. print current_time_us() . " Killing all application threads..\n";
  200. $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
  201. print current_time_us() . " done.\n";
  202. $orig_master_handler->enable_log_bin_local();
  203. $orig_master_handler->disconnect();
  204. ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
  205. $exit_code = 0;
  206. };
  207. if ($@) {
  208. warn "Got Error: $@\n";
  209. exit $exit_code;
  210. }
  211. exit $exit_code;
  212. }
  213. elsif ( $command eq "start" ) {
  214. ## Activating master ip on the new master
  215. # 1. Create app user with write privileges
  216. # 2. Moving backup script if needed
  217. # 3. Register new master's ip to the catalog database
  218. # We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.
  219. # If exit code is 0 or 10, MHA does not abort
  220. my $exit_code = 10;
  221. eval {
  222. my $new_master_handler = new MHA::DBHelper();
  223. # args: hostname, port, user, password, raise_error_or_not
  224. $new_master_handler->connect( $new_master_ip, $new_master_port,
  225. $new_master_user, $new_master_password, 1 );
  226. ## Set read_only=0 on the new master
  227. $new_master_handler->disable_log_bin_local();
  228. print current_time_us() . " Set read_only=0 on the new master.\n";
  229. $new_master_handler->disable_read_only();
  230. ## Creating an app user on the new master
  231. #print current_time_us() . " Creating app user on the new master..\n";
  232. print current_time_us() . "Add vip $vip on $if..\n";
  233. # create_app_user($new_master_handler);
  234. &add_vip();
  235. $new_master_handler->enable_log_bin_local();
  236. $new_master_handler->disconnect();
  237. ## Update master ip on the catalog database, etc
  238. $exit_code = 0;
  239. };
  240. if ($@) {
  241. warn "Got Error: $@\n";
  242. exit $exit_code;
  243. }
  244. exit $exit_code;
  245. }
  246. elsif ( $command eq "status" ) {
  247. # do nothing
  248. exit 0;
  249. }
  250. else {
  251. &usage();
  252. exit 1;
  253. }
  254. }
  255. sub usage {
  256. print
  257. "Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  258. die;
  259. }
  1. vim /etc/masterha/app1.conf
  1. [server default]
  2. #mha manager工作目录
  3. manager_workdir = /var/log/masterha/app1
  4. manager_log = /var/log/masterha/app1/app1.log
  5. remote_workdir = /var/log/masterha/app1
  6. [server1]
  7. hostname=192.168.50.3
  8. master_binlog_dir = /storage/mysql/logs
  9. candidate_master = 1
  10. check_repl_delay = 0 #用防止master故障时,切换时slave有延迟,卡在那里切不过来。
  11. [server2]
  12. hostname=192.168.50.4
  13. master_binlog_dir=/storage/mysql/logs
  14. candidate_master=1
  15. check_repl_delay=0

七、配置文件测试

  1. masterha_check_ssh --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.conf

确认可以看到所有的服务器上ssh测试通过

查看是不是具备跑masterha_manager,主从结构是不是OK

  1. masterha_check_repl --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.conf

可以看到现有master/slave结构

MHA启动及关闭

  1. masterha_manager --global_conf=/etc/masterha/masterha_default.cnf --conf=/etc/masterha/app1.conf

注意:第一次启动,主库上的VIP不会自动绑定,需要手动调用init_vip.sh去绑定,主库发生故障切换会进行vip的飘逸。
至此,一主一从结构的MHA就配置好了。