1. Using canal-adapter to sync MySQL to ES
When the data volume is large, avoid driving the sync from child tables; drive it from the main table so that no data is lost during sync.
The sql query cannot be built on a view, because incremental sync is based on the MySQL binlog.
Filter conditions on child tables must appear after the SELECT.
After a canal restart, check the sync yml and review the error logs to track down failures.
etlCondition only applies to full (ETL) sync, for cases where input parameters are passed in (a sketch follows the instance properties below).
canal-admin instance configuration:
#################################################
## mysql serverId , v1.0.26+ will autoGen
# canal.instance.mysql.slaveId=0
# enable gtid use true/false
canal.instance.gtidon=false
# position info
canal.instance.master.address=xxx.xxx.xxx.xxx:3306
canal.instance.master.journal.name=
canal.instance.master.position=
canal.instance.master.timestamp=
canal.instance.master.gtid=
# rds oss binlog
canal.instance.rds.accesskey=
canal.instance.rds.secretkey=
canal.instance.rds.instanceId=
# table meta tsdb info
canal.instance.tsdb.enable=false
#canal.instance.tsdb.url=jdbc:mysql://127.0.0.1:3306/canal_tsdb
#canal.instance.tsdb.dbUsername=canal
#canal.instance.tsdb.dbPassword=canal
#canal.instance.standby.address =
#canal.instance.standby.journal.name =
#canal.instance.standby.position =
#canal.instance.standby.timestamp =
#canal.instance.standby.gtid=
# username/password
canal.instance.dbUsername=xxx
canal.instance.dbPassword=xxx
canal.instance.defaultDatabaseName=xxx
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==
# table regex
canal.instance.filter.regex=xxx
# table black regex
canal.instance.filter.black.regex=xxx.undo_log
# table field filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.field=test1.t_product:id/subject/keywords,test2.t_company:id/name/contact/ch
# table field black filter(format: schema1.tableName1:field1/field2,schema2.tableName2:field1/field2)
#canal.instance.filter.black.field=test1.t_product:subject/product_image,test2.t_company:id/name/contact/ch
# mq config
canal.mq.topic=xxx
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#################################################
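A minimal mapping sketch for the etlCondition note above, assuming the standard canal-adapter esMapping yml layout; the index, destination, table, and column names here are illustrative, not taken from the real config. Per the canal-adapter docs, the full ETL is triggered through the adapter's REST endpoint (e.g. POST /etl/es7/{mapping}.yml with a "params" form field), and {} in etlCondition is replaced by that parameter; incremental binlog sync ignores etlCondition.

dataSourceKey: defaultDS
destination: xxx
groupId: g1
esMapping:
  _index: xxx_index
  _type: _doc
  _id: _id
  upsert: true
  sql: "select t.id as _id, t.request_id, t.creator from xxx t"
  # only used by the full ETL sync; {} is replaced by the input parameter
  etlCondition: "where t.update_time >= {}"
  commitBatch: 3000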
2. Problem: when the data gets too large, the generated JSON is truncated
Reference:
https://www.cnblogs.com/javabg/p/14265558.html
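When the JSON array is built with GROUP_CONCAT, as in the yml example in section 3 below, a likely cause of the truncation is MySQL's group_concat_max_len limit (1024 bytes by default): the concatenated string is silently cut at that length, which breaks the JSON. A sketch of checking and raising the limit (the 1 MB value is only an example; persist it in my.cnf as well so it survives a restart):

-- check the current limit (defaults to 1024 bytes)
SHOW VARIABLES LIKE 'group_concat_max_len';
-- raise it for the running server and for the current session
SET GLOBAL group_concat_max_len = 1048576;
SET SESSION group_concat_max_len = 1048576;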
3. yml example (主表 / 子表 in the SQL below are the author's placeholders for the main table and child table)
LEFT JOIN (
  SELECT request_id,
         GROUP_CONCAT(JSON_OBJECT(
           'requestId', request_id,
           'id', id)) AS json
  FROM xxx
  GROUP BY request_id
) AS x ON 主表.request_id = x.request_id
Optimization (the canal source code was also modified for this): the derived-table join is replaced with a correlated subquery that builds the JSON array per main-table row:
CONCAT('[', (SELECT GROUP_CONCAT(JSON_OBJECT(
    'creator', creator,
    'creatorId', creator_id,
    'createTime', date_format(create_time, '%Y-%m-%dT%H:%i:%S+08:00'),
    'updater', updater,
    'updaterId', updater_id,
    'updateTime', date_format(update_time, '%Y-%m-%dT%H:%i:%S+08:00'),
    'id', id))
  FROM 子表
  WHERE enabled_flag = 1 AND 主表.id = 子表.id), ']') AS xxxList
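For context, a minimal sketch of where this column expression sits: it goes directly into the select list of the mapping sql, so the main table drives the sync and the child rows are folded into one JSON-array column per main-table row (the surrounding columns are illustrative):

SELECT
  主表.id         AS _id,
  主表.request_id AS requestId,
  CONCAT('[', (SELECT GROUP_CONCAT(JSON_OBJECT('id', 子表.id, 'creator', 子表.creator))
               FROM 子表
               WHERE 子表.enabled_flag = 1 AND 主表.id = 子表.id), ']') AS xxxList
FROM 主表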
Problems:
Multiple rows are captured but only one of them gets synced.
Two yml mappings with the same _id end up syncing into the same index.
Solution:
Run a separate sync service per business service instead of sharing one; note that each service consumes roughly 700-800 MB of memory. A sketch of one such dedicated adapter config follows below.
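A minimal sketch of one such dedicated adapter process, assuming the standard canal-adapter application.yml layout with an es7 outer adapter in rest mode; hosts, credentials, ports, and the destination name are placeholders. Each business service gets its own copy of this process with its own destination (or MQ topic) and its own mapping yml, instead of sharing one adapter.

server:
  port: 8081
canal.conf:
  mode: tcp                     # or kafka / rocketMQ, matching the server side
  canalServerHost: 127.0.0.1:11111
  srcDataSources:
    defaultDS:
      url: jdbc:mysql://xxx.xxx.xxx.xxx:3306/xxx?useUnicode=true
      username: xxx
      password: xxx
  canalAdapters:
  - instance: xxx               # destination / topic dedicated to this service
    groups:
    - groupId: g1
      outerAdapters:
      - name: es7
        hosts: http://127.0.0.1:9200
        properties:
          mode: rest
          cluster.name: elasticsearch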