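Maxwell is an open-source tool from Zendesk (US), written in Java, that captures MySQL changes in real time. Its capture mechanism is likewise based on the binlog.
Maxwell vs. Canal
- Maxwell has no server + client mode like Canal's. A single server process sends data to a message queue or Redis; to run multiple instances, start multiple processes, each with its own configuration file.
- Maxwell's standout feature is bootstrap. Canal can only capture new changes and cannot process historical data that already exists, whereas Maxwell's bootstrap function can export the complete existing data for initialization.
- Maxwell does not support HA out of the box, but it does support resuming from a checkpoint: once an error is fixed and the process is restarted, it continues reading from where it stopped.
- Maxwell outputs JSON only, whereas Canal in server + client mode lets the client define its own format.
- Maxwell is more lightweight than Canal.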
Maxwell and Canal data formats for different operations
Running an insert test statement
INSERT INTO z_user_info VALUES(30,'zhang3','13810001010'),(31,'li4','1389999999');

| Canal | Maxwell |
|---|---|
| { "data": [ { "id": "30", "user_name": "zhang3", "tel": "13810001010" }, { "id": "31", "user_name": "li4", "tel": "1389999999" } ], "database": "gmall-2020-04", "es": 1589385314000, "id": 2, "isDdl": false, "mysqlType": { "id": "bigint(20)", "user_name": "varchar(20)", "tel": "varchar(20)" }, "old": null, "pkNames": [ "id" ], "sql": "", "sqlType": { "id": -5, "user_name": 12, "tel": 12 }, "table": "z_user_info", "ts": 1589385314116, "type": "INSERT" } | { "database": "gmall-2020-04", "table": "z_user_info", "type": "insert", "ts": 1589385314, "xid": 82982, "xoffset": 0, "data": { "id": 30, "user_name": "zhang3", "tel": "13810001010" } } { "database": "gmall-2020-04", "table": "z_user_info", "type": "insert", "ts": 1589385314, "xid": 82982, "commit": true, "data": { "id": 31, "user_name": "li4", "tel": "1389999999" } } |
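
The two shapes above differ mainly in granularity: Canal batches the affected rows into one message, while Maxwell emits one record per row. The following is a minimal sketch of splitting a Canal message into per-row records with a Maxwell-like layout. The function name `flatten_canal` is hypothetical and not part of either tool, and mapping Canal's `es` (milliseconds) to a seconds timestamp is an assumption based on the sample values shown here.

```python
def flatten_canal(message: dict) -> list[dict]:
    """Split one Canal message into one Maxwell-like record per affected row.

    Illustrative helper only; it is not part of Canal or Maxwell.
    """
    rows = message.get("data") or []
    # "old" is null for inserts and deletes, so pad it with None values.
    olds = message.get("old") or [None] * len(rows)
    records = []
    for row, old in zip(rows, olds):
        record = {
            "database": message["database"],
            "table": message["table"],
            "type": message["type"].lower(),
            # Assumption: "es" is the event time in milliseconds.
            "ts": message["es"] // 1000,
            "data": row,
        }
        if old is not None:
            record["old"] = old
        records.append(record)
    return records


canal_insert = {
    "data": [
        {"id": "30", "user_name": "zhang3", "tel": "13810001010"},
        {"id": "31", "user_name": "li4", "tel": "1389999999"},
    ],
    "database": "gmall-2020-04",
    "es": 1589385314000,
    "old": None,
    "table": "z_user_info",
    "type": "INSERT",
}

for record in flatten_canal(canal_insert):
    print(record)
```

Note that the values stay as Canal delivered them (strings), and no `xid` is produced because the Canal message does not carry one.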
Running an update operation
UPDATE z_user_info SET user_name='wang55' WHERE id IN(30,31)

| Canal | Maxwell |
|---|---|
| { "data": [ { "id": "30", "user_name": "wang55", "tel": "13810001010" }, { "id": "31", "user_name": "wang55", "tel": "1389999999" } ], "database": "gmall-2020-04", "es": 1589385508000, "id": 3, "isDdl": false, "mysqlType": { "id": "bigint(20)", "user_name": "varchar(20)", "tel": "varchar(20)" }, "old": [ { "user_name": "zhang3" }, { "user_name": "li4" } ], "pkNames": [ "id" ], "sql": "", "sqlType": { "id": -5, "user_name": 12, "tel": 12 }, "table": "z_user_info", "ts": 1589385508676, "type": "UPDATE" } | { "database": "gmall-2020-04", "table": "z_user_info", "type": "update", "ts": 1589385508, "xid": 83206, "xoffset": 0, "data": { "id": 30, "user_name": "wang55", "tel": "13810001010" }, "old": { "user_name": "zhang3" } } { "database": "gmall-2020-04", "table": "z_user_info", "type": "update", "ts": 1589385508, "xid": 83206, "commit": true, "data": { "id": 31, "user_name": "wang55", "tel": "1389999999" }, "old": { "user_name": "li4" } } |
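
In the update samples, `data` holds the full row after the change and `old` holds only the columns that changed, with their previous values. A short sketch of rebuilding the full row as it was before a Maxwell update follows; `before_image` is a hypothetical helper name used for illustration.

```python
def before_image(event: dict) -> dict:
    """Rebuild the pre-update row from a Maxwell update event.

    Overlays the previous values in "old" on top of the new row in "data".
    """
    if event["type"] != "update":
        raise ValueError("a before-image is only defined for update events")
    return {**event["data"], **event.get("old", {})}


maxwell_update = {
    "database": "gmall-2020-04",
    "table": "z_user_info",
    "type": "update",
    "ts": 1589385508,
    "xid": 83206,
    "xoffset": 0,
    "data": {"id": 30, "user_name": "wang55", "tel": "13810001010"},
    "old": {"user_name": "zhang3"},
}

print(before_image(maxwell_update))
# {'id': 30, 'user_name': 'zhang3', 'tel': '13810001010'}
```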
Running a delete operation
DELETE FROM z_user_info WHERE id IN(30,31)

| Canal | Maxwell |
|---|---|
| { "data": [ { "id": "30", "user_name": "wang55", "tel": "13810001010" }, { "id": "31", "user_name": "wang55", "tel": "1389999999" } ], "database": "gmall-2020-04", "es": 1589385644000, "id": 4, "isDdl": false, "mysqlType": { "id": "bigint(20)", "user_name": "varchar(20)", "tel": "varchar(20)" }, "old": null, "pkNames": [ "id" ], "sql": "", "sqlType": { "id": -5, "user_name": 12, "tel": 12 }, "table": "z_user_info", "ts": 1589385644829, "type": "DELETE" } | { "database": "gmall-2020-04", "table": "z_user_info", "type": "delete", "ts": 1589385644, "xid": 83367, "xoffset": 0, "data": { "id": 30, "user_name": "wang55", "tel": "13810001010" } } { "database": "gmall-2020-04", "table": "z_user_info", "type": "delete", "ts": 1589385644, "xid": 83367, "commit": true, "data": { "id": 31, "user_name": "wang55", "tel": "1389999999" } } |
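
To see how the three Maxwell event types fit together on the consumer side, here is a rough sketch that replays them against an in-memory copy of the table. It assumes the primary key column is `id`, as in the samples; `apply_event` is a hypothetical name, and a real consumer would write to a proper store.

```python
def apply_event(table: dict, event: dict, pk: str = "id") -> None:
    """Apply one Maxwell row event to an in-memory replica keyed by primary key."""
    kind = event["type"]
    if kind in ("insert", "update"):
        # "data" carries the full row after the change, so an upsert is enough.
        table[event["data"][pk]] = dict(event["data"])
    elif kind == "delete":
        table.pop(event["data"][pk], None)
    # Any other event type is ignored in this sketch.


replica = {}
events = [
    {"type": "insert", "data": {"id": 30, "user_name": "zhang3", "tel": "13810001010"}},
    {"type": "insert", "data": {"id": 31, "user_name": "li4", "tel": "1389999999"}},
    {"type": "update", "data": {"id": 30, "user_name": "wang55", "tel": "13810001010"},
     "old": {"user_name": "zhang3"}},
    {"type": "delete", "data": {"id": 31, "user_name": "li4", "tel": "1389999999"}},
]
for event in events:
    apply_event(replica, event)

print(replica)
```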
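Summary of data characteristics
Log structure
Canal produces one log entry per SQL statement. If the statement affects multiple rows, they are still collected into that single entry as an array (even a single row is wrapped in an array).
Maxwell produces one log entry per affected row. To tell whether several entries came from the same statement, compare their xid: entries with the same xid belong to the same transaction, which for a single autocommit statement means the same SQL. In the samples above, the last entry of each group also carries "commit": true.
Canal entries also carry the table structure (mysqlType, sqlType, pkNames); Maxwell entries are more compact.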
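
As a sketch of how the xid can be used in practice, the generator below buffers Maxwell records by `xid` and releases a group once a record with `"commit": true` arrives. The name `group_by_transaction` is hypothetical, and the sketch assumes records arrive in order and every row event carries an `xid`, as in the samples.

```python
from typing import Iterable, Iterator


def group_by_transaction(events: Iterable[dict]) -> Iterator[list[dict]]:
    """Yield lists of Maxwell records, one list per committed xid."""
    pending: dict = {}
    for event in events:
        xid = event.get("xid")
        pending.setdefault(xid, []).append(event)
        if event.get("commit"):
            # The commit flag marks the last record of this xid.
            yield pending.pop(xid)


stream = [
    {"type": "insert", "xid": 82982, "xoffset": 0, "data": {"id": 30}},
    {"type": "insert", "xid": 82982, "commit": True, "data": {"id": 31}},
    {"type": "update", "xid": 83206, "xoffset": 0, "data": {"id": 30}},
]

for group in group_by_transaction(stream):
    print(group[0]["xid"], [e["data"]["id"] for e in group])
```

Records whose commit marker has not arrived yet (xid 83206 in this example) stay buffered and are not yielded.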
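Numeric types
When the source column is numeric, Maxwell preserves the type and emits the value as a JSON number, without wrapping it in double quotes. Canal converts every value to a string.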
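
Because Canal delivers every value as a string, a consumer that needs typed values can use the `mysqlType` map included in each Canal message. The sketch below is one possible approach under that assumption; `coerce_row` and the type sets are illustrative, not a complete MySQL type mapping.

```python
import re
from decimal import Decimal

INTEGER_TYPES = {"tinyint", "smallint", "mediumint", "int", "integer", "bigint"}
DECIMAL_TYPES = {"decimal", "numeric"}
FLOAT_TYPES = {"float", "double"}


def coerce_row(row: dict, mysql_types: dict) -> dict:
    """Convert Canal's string values to Python numbers based on mysqlType."""
    result = {}
    for column, value in row.items():
        # "bigint(20)" -> "bigint"; unknown columns get an empty base type.
        match = re.match(r"[a-z]+", mysql_types.get(column, "").lower())
        base = match.group(0) if match else ""
        if value is None:
            result[column] = None
        elif base in INTEGER_TYPES:
            result[column] = int(value)
        elif base in DECIMAL_TYPES:
            result[column] = Decimal(value)
        elif base in FLOAT_TYPES:
            result[column] = float(value)
        else:
            result[column] = value  # leave strings, dates, etc. untouched
    return result


mysql_type = {"id": "bigint(20)", "user_name": "varchar(20)", "tel": "varchar(20)"}
canal_row = {"id": "30", "user_name": "zhang3", "tel": "13810001010"}

print(coerce_row(canal_row, mysql_type))
# {'id': 30, 'user_name': 'zhang3', 'tel': '13810001010'}
```

The `tel` column stays a string because its declared type is `varchar(20)`, even though the value looks numeric.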
