前言:在生产环境中,DBA经常遇到开发要求在生产环境增加一个字段,添加一个索引什么的,如果是5.6、5.7那还好一些,因为可以online DDL,但是原生的online DDL也是有一定条件的,我们会在后面提到,更多情况下,我们会选择使用pt-osc工具来执行DDL

一、PT-OSC工具使用

pt-osc是pt-online-schema-change的简写,是percona-toolkit工具集中的重要一员,是MySQL5.5及以前版本重要的维护工具之一,下面我们来看下pt-osc工具的使用方法

  1. [root@Mariadb-04 mysql]# pt-online-schema-change --help
  2. pt-online-schema-change alters a table's structure without blocking reads or
  3. writes. Specify the database and table in the DSN. Do not use this tool before
  4. reading its documentation and checking your backups carefully. For more
  5. details, please use the --help option, or try 'perldoc
  6. /usr/bin/pt-online-schema-change' for complete documentation.
  7. Usage: pt-online-schema-change [OPTIONS] DSN
  8. Options:
  9. --alter=s The schema modification, without the ALTER
  10. TABLE keywords
  11. --alter-foreign-keys-method=s How to modify foreign keys so they reference
  12. the new table
  13. --[no]analyze-before-swap Execute ANALYZE TABLE on the new table
  14. before swaping with the old one (default yes)
  15. --ask-pass Prompt for a password when connecting to
  16. MySQL
  17. --charset=s -A Default character set
  18. --[no]check-alter Parses the --alter specified and tries to
  19. warn of possible unintended behavior (
  20. default yes)
  21. --check-interval=m Sleep time between checks for --max-lag (
  22. default 1). Optional suffix s=seconds, m=
  23. minutes, h=hours, d=days; if no suffix, s is
  24. used.
  25. --[no]check-plan Check query execution plans for safety (
  26. default yes)
  27. --[no]check-replication-filters Abort if any replication filter is set on
  28. any server (default yes)
  29. --check-slave-lag=s Pause the data copy until this replica's lag
  30. is less than --max-lag
  31. --chunk-index=s Prefer this index for chunking tables
  32. --chunk-index-columns=i Use only this many left-most columns of a --
  33. chunk-index
  34. --chunk-size=z Number of rows to select for each chunk
  35. copied (default 1000)
  36. --chunk-size-limit=f Do not copy chunks this much larger than the
  37. desired chunk size (default 4.0)
  38. --chunk-time=f Adjust the chunk size dynamically so each
  39. data-copy query takes this long to execute (
  40. default 0.5)
  41. --config=A Read this comma-separated list of config
  42. files; if specified, this must be the first
  43. option on the command line
  44. --critical-load=A Examine SHOW GLOBAL STATUS after every
  45. chunk, and abort if the load is too high (
  46. default Threads_running=50)
  47. --database=s -D Connect to this database
  48. --default-engine Remove ENGINE from the new table
  49. --defaults-file=s -F Only read mysql options from the given file
  50. --[no]drop-new-table Drop the new table if copying the original
  51. table fails (default yes)
  52. --[no]drop-old-table Drop the original table after renaming it (
  53. default yes)
  54. --[no]drop-triggers Drop triggers on the old table. --no-drop-
  55. triggers forces --no-drop-old-table (default
  56. yes)
  57. --dry-run Create and alter the new table, but do not
  58. create triggers, copy data, or replace the
  59. original table
  60. --execute Indicate that you have read the
  61. documentation and want to alter the table
  62. --force This options bypasses confirmation in case
  63. of using alter-foreign-keys-method = none ,
  64. which might break foreign key constraints
  65. --help Show help and exit
  66. --host=s -h Connect to host
  67. --max-flow-ctl=f Somewhat similar to --max-lag but for PXC
  68. clusters
  69. --max-lag=m Pause the data copy until all replicas' lag
  70. is less than this value (default 1s).
  71. Optional suffix s=seconds, m=minutes, h=
  72. hours, d=days; if no suffix, s is used.
  73. --max-load=A Examine SHOW GLOBAL STATUS after every
  74. chunk, and pause if any status variables are
  75. higher than their thresholds (default
  76. Threads_running=25)
  77. --new-table-name=s New table name before it is swapped. %T is
  78. replaced with the original table name (
  79. default %T_new)
  80. --password=s -p Password to use when connecting
  81. --pid=s Create the given PID file
  82. --plugin=s Perl module file that defines a
  83. pt_online_schema_change_plugin class
  84. --port=i -P Port number to use for connection
  85. --print Print SQL statements to STDOUT
  86. --progress=a Print progress reports to STDERR while
  87. copying rows (default time,30)
  88. --quiet -q Do not print messages to STDOUT (disables --
  89. progress)
  90. --recurse=i Number of levels to recurse in the hierarchy
  91. when discovering replicas
  92. --recursion-method=a Preferred recursion method for discovering
  93. replicas (default processlist,hosts)
  94. --set-vars=A Set the MySQL variables in this comma-
  95. separated list of variable=value pairs
  96. --sleep=f How long to sleep (in seconds) after copying
  97. each chunk (default 0)
  98. --socket=s -S Socket file to use for connection
  99. --statistics Print statistics about internal counters
  100. --[no]swap-tables Swap the original table and the new, altered
  101. table (default yes)
  102. --tries=a How many times to try critical operations
  103. --user=s -u User for login if not current user
  104. --version Show version and exit
  105. --[no]version-check Check for the latest version of Percona
  106. Toolkit, MySQL, and other programs (default
  107. yes)
  108. Option types: s=string, i=integer, f=float, h/H/a/A=comma-separated list, d=DSN, z=size, m=time
  109. Rules:
  110. --dry-run and --execute are mutually exclusive.
  111. This tool accepts additional command-line arguments. Refer to the SYNOPSIS and usage information for details.
  112. DSN syntax is key=value[,key=value...] Allowable DSN keys:
  113. KEY COPY MEANING
  114. === ==== =============================================
  115. A yes Default character set
  116. D yes Database for the old and new table
  117. F yes Only read default options from the given file
  118. P yes Port number to use for connection
  119. S yes Socket file to use for connection
  120. h yes Connect to host
  121. p yes Password to use when connecting
  122. t no Table to alter
  123. u yes User for login if not current user
  124. If the DSN is a bareword, the word is treated as the 'h' key.
  125. Options and values after processing arguments:
  126. --alter (No value)
  127. --alter-foreign-keys-method (No value)
  128. --analyze-before-swap TRUE
  129. --ask-pass FALSE
  130. --charset (No value)
  131. --check-alter TRUE
  132. --check-interval 1
  133. --check-plan TRUE
  134. --check-replication-filters TRUE
  135. --check-slave-lag (No value)
  136. --chunk-index (No value)
  137. --chunk-index-columns (No value)
  138. --chunk-size 1000
  139. --chunk-size-limit 4.0
  140. --chunk-time 0.5
  141. --config /etc/percona-toolkit/percona-toolkit.conf,/etc/percona-toolkit/pt-online-schema-change.conf,/root/.percona-toolkit.conf,/root/.pt-online-schema-change.conf
  142. --critical-load Threads_running=50
  143. --database (No value)
  144. --default-engine FALSE
  145. --defaults-file (No value)
  146. --drop-new-table TRUE
  147. --drop-old-table TRUE
  148. --drop-triggers TRUE
  149. --dry-run FALSE
  150. --execute FALSE
  151. --force FALSE
  152. --help TRUE
  153. --host (No value)
  154. --max-flow-ctl (No value)
  155. --max-lag 1
  156. --max-load Threads_running=25
  157. --new-table-name %T_new
  158. --password (No value)
  159. --pid (No value)
  160. --plugin (No value)
  161. --port (No value)
  162. --print FALSE
  163. --progress time,30
  164. --quiet FALSE
  165. --recurse (No value)
  166. --recursion-method processlist,hosts
  167. --set-vars
  168. --sleep 0
  169. --socket (No value)
  170. --statistics FALSE
  171. --swap-tables TRUE
  172. --tries (No value)
  173. --user (No value)
  174. --version FALSE
  175. --version-check TRUE

乍一看,参数是不是眼花缭乱?其实要掌握的参数其实也不多:

  • —user 指定执行的用户

  • —ask-pass 提示输入密码

  • —host 指定连接主机

  • —alter 后面跟上需要执行的sql

  • D 指定DBNAME

  • t 指定table name

  • —print 打印输出信息

  • —execute 提示输入密码后立即执行sql

二、实战一把

正好,今天查看慢日志,有个SQL需要优化,看了下SQL以后发现目前的执行计划走的索引只匹配到了一个字段,需要根据条件另外创建一个索引,那么我们可以通过pt-osc工具来添加此索引。

  1. [root@Mariadb-04 mysql]# pt-online-schema-change --user=root --ask-pass --host=localhost --alter "add key idx_chnlid_docstatus_dockind_docpubtime(chnlid,docstatus,dockind,docpubtime)" D=xx,t=xx --print --execute
  2. Enter MySQL password:
  3. ....
  4. Operation, tries, wait:
  5. analyze_table, 10, 1
  6. copy_rows, 10, 0.25
  7. create_triggers, 10, 1
  8. drop_triggers, 10, 1
  9. swap_tables, 10, 1
  10. update_foreign_keys, 10, 1
  11. Altering `xx`.`xx`...
  12. Creating new table...
  13. CREATE TABLE `xxxx`.`_xxxx_new` (
  14. `xxx` int(11) NOT NULL,
  15. `xx` int(11) NOT NULL DEFAULT '0',
  16. `xx` datetime DEFAULT NULL,
  17. `xx` varchar(50) DEFAULT NULL,
  18. `xx` datetime DEFAULT NULL,
  19. `xx` int(11) DEFAULT '1',
  20. `xx` datetime DEFAULT NULL,
  21. `xx` int(11) DEFAULT NULL,
  22. `xx` int(11) DEFAULT NULL,
  23. `xx` int(11) DEFAULT '0',
  24. `xx` int(11) NOT NULL DEFAULT '0',
  25. `xxx` int(11) NOT NULL DEFAULT '0',
  26. `xx` datetime DEFAULT NULL,
  27. `xx` int(11) DEFAULT '0',
  28. `xx` varchar(200) DEFAULT NULL,
  29. `xx` int(2) DEFAULT '0',
  30. `xx` int(11) DEFAULT NULL,
  31. PRIMARY KEY (`xx`),
  32. KEY `xx` (`xxx`,`xxx`) USING BTREE
  33. ) ENGINE=InnoDB DEFAULT CHARSET=utf8
  34. Created new table xx._xx_new OK.
  35. Altering new table...
  36. ALTER TABLE `xx`.`_xx_new` add key idx_chnlid_docstatus_dockind_docpubtime(xx,xx,xx,xx)
  37. Altered `xx`.`_xx_new` OK.
  38. 2018-05-16T17:29:51 Creating triggers...
  39. CREATE TRIGGER `pt_osc_xx_xx_del` AFTER DELETE ON `xx`.`xx` FOR EACH ROW DELETE IGNORE FROM `xx`.`_xx_new` WHERE `xx`.`_xx_new`.`recid` <=> OLD.`recid`
  40. CREATE TRIGGER `pt_osc_xx_xx_upd` AFTER UPDATE ON `xx`.`xx` FOR EACH ROW REPLACE INTO `xx`.`_xx_new` (..........)
  41. CREATE TRIGGER `pt_osc_xx_xx_ins` AFTER INSERT ON `xx`.`xx` FOR EACH ROW REPLACE INTO `xx`.`_xx_new` (..........)
  42. 2018-05-16T17:29:51 Created triggers OK.
  43. 2018-05-16T17:29:51 Copying approximately 6788869 rows...
  44. INSERT LOW_PRIORITY IGNORE INTO `xx`.`_xx_new` (............. FROM `xx`.`xx` FORCE INDEX(`PRIMARY`) WHERE ((`recid` >= ?)) AND ((`recid` <= ?)) LOCK IN SHARE MODE /*pt-online-schema-change 36224 copy nibble*/
  45. SELECT /*!40001 SQL_NO_CACHE */ `recid` FROM `xx`.`xx` FORCE INDEX(`PRIMARY`) WHERE ((`recid` >= ?)) ORDER BY `recid` LIMIT ?, 2 /*next chunk boundary*/
  46. Copying `xx`.`xx`: 5% 08:43 remain
  47. Copying `xx`.`xx`: 11% 07:30 remain
  48. Copying `xx`.`xx`: 14% 09:59 remain
  49. Copying `xx`.`xx`: 18% 09:25 remain
  50. Copying `xx`.`xx`: 22% 10:23 remain
  51. Copying `xx`.`xx`: 23% 12:03 remain
  52. Copying `xx`.`xx`: 27% 12:09 remain
  53. Copying `xx`.`xx`: 32% 10:35 remain
  54. Copying `xx`.`xx`: 32% 11:32 remain
  55. Copying `xx`.`xx`: 33% 12:28 remain
  56. Copying `xx`.`xx`: 35% 12:34 remain
  57. Copying `xx`.`xx`: 41% 10:35 remain
  58. Copying `xx`.`xx`: 46% 09:13 remain
  59. Copying `xx`.`xx`: 48% 09:08 remain
  60. Copying `xx`.`xx`: 53% 08:02 remain
  61. Copying `xx`.`xx`: 54% 08:20 remain
  62. Copying `xx`.`xx`: 57% 07:52 remain
  63. Copying `xx`.`xx`: 60% 07:25 remain
  64. Copying `xx`.`xx`: 62% 07:23 remain
  65. Copying `xx`.`xx`: 66% 06:24 remain
  66. Copying `xx`.`xx`: 71% 05:23 remain
  67. Copying `xx`.`xx`: 74% 04:47 remain
  68. Copying `xx`.`xx`: 75% 04:52 remain
  69. Copying `xx`.`xx`: 76% 04:44 remain
  70. Copying `xx`.`xx`: 80% 03:46 remain
  71. Copying `xx`.`xx`: 82% 03:33 remain
  72. Copying `xx`.`xx`: 86% 02:47 remain
  73. Copying `xx`.`xx`: 86% 02:52 remain
  74. Copying `xx`.`xx`: 87% 02:47 remain
  75. Copying `xx`.`xx`: 90% 02:00 remain
  76. Copying `xx`.`xx`: 94% 01:15 remain
  77. Copying `xx`.`xx`: 95% 01:04 remain
  78. Copying `xx`.`xx`: 97% 00:30 remain
  79. 2018-05-16T17:53:32 Copied rows OK.
  80. 2018-05-16T17:53:32 Analyzing new table...
  81. 2018-05-16T17:53:33 Swapping tables...
  82. RENAME TABLE `xx`.`xx` TO `xx`.`_xx_old`, `xx`.`_xx_new` TO `xx`.`xx`
  83. 2018-05-16T17:53:33 Swapped original and new tables OK.
  84. 2018-05-16T17:53:33 Dropping old table...
  85. DROP TABLE IF EXISTS `xx`.`_xx_old`
  86. 2018-05-16T17:53:37 Dropped old table `xx`.`_xx_old` OK.
  87. 2018-05-16T17:53:37 Dropping triggers...
  88. DROP TRIGGER IF EXISTS `xx`.`pt_osc_xx_xx_del`;
  89. DROP TRIGGER IF EXISTS `xx`.`pt_osc_xx_xx_upd`;
  90. DROP TRIGGER IF EXISTS `xx`.`pt_osc_xx_xx_ins`;
  91. 2018-05-16T17:53:37 Dropped triggers OK.
  92. Successfully altered `xx`.`xx`.

以上就是我在生产环境利用pt-osc做的在线DDL更改,下面我们根据log来分析一下pt-osc的原理

三、原理分析

从上面的log我们可以看出整体流程是这样的

  • 首先根据原表DDL定义创建一张同样表结构的临时表,并且命名规则为_原表名_new

  • 在新建的表上创建需要添加的索引

  • 在原表上创建三个触发器,分别对应曾、改、删

  • 将原表的数据以insert into select …的方式拷贝到新表

  • 数据拷贝完成以后会将原表重命名为_原表名_old,将新表名重命名为原表名

  • 删除表_原表名_old

  • 删除触发器

  • 成功

总的来说,整个过程还是比较清晰的

四、pt-osc使用限制

  • 外键问题,因为涉及到表的重命名,因此如果要修改的表是其他表的外键,那么修改起来是有问题的,幸好pt-osc提供了—alter-foreign-keys-method选项,优先采用rebuild_constrains,也可以指定使用drop_swap

  • 如果表上有触发器,那么我们无法采用pt-osc来做在线ddl,因为MySQL规定同一个表不能存在多个相同定义的触发器

  • 从库延迟,如果需要变更的表数据量比较多的话,会造成从库延迟的问题,生产环境中我们需要注意这个问题

  • 磁盘容量的问题也是我们需要注意的,如果服务器磁盘空间比较紧张,并且需要变更的表又是大表,我们需要提前估算好容量

五、原生online ddl的限制

对于原生的online ddl支持如下

  • MySQL5.6

    • 辅助索引的创建和删除

    • 改变自增长id值

    • 添加或删除外键

    • 列的重命名

  • MySQL5.7

    • 增加支持在线更改索引名

    • 增加在线增加varchar列长度

  • 如果涉及到表空间的改变,那就不支持在线online ddl,是在不确定,可以参考下官方手册