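1. Handling unassigned replica shards

    # List the problem shards and the reason each one is unassigned
    curl -s -XGET 'localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED

    # Re-route a single replica shard (replace index-name, shard-id and node-name)
    curl -XPOST 'localhost:9200/_cluster/reroute?retry_failed=true' -H 'Content-Type: application/json' -d '{"commands": [{"allocate_replica": {"index": "index-name", "shard": shard-id, "node": "node-name"}}]}'
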
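Batch script, for reference:

    #!/bin/bash
    # Handles unassigned replica shards whose allocation failed with
    # ALLOCATION_FAILED, i.e. the maximum number of allocation retries was exceeded.

    NODE="10.116.106.2:9301"   # target node (the reroute API expects a node name here)
    IFS=$'\n'
    for line in $(curl -s '10.116.106.1:9200/_cat/shards' | grep -F UNASSIGNED); do
      echo "$line"
      INDEX=$(echo "$line" | awk '{print $1}')
      SHARD=$(echo "$line" | awk '{print $2}')
      curl -XPOST '10.116.106.1:9200/_cluster/reroute?retry_failed=true' -H 'Content-Type: application/json' -d '{"commands": [{"allocate_replica": {"index": "'"$INDEX"'", "shard": '"$SHARD"', "node": "'"$NODE"'"}}]}'
    done
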
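Before sending reroute commands in bulk, it can help to print the request bodies first and review them. The following is a minimal dry-run sketch, not part of the original script: `build_reroute_bodies` is a hypothetical helper name, and it assumes the default `_cat/shards` column order (index, shard, prirep, state, ...). It only emits bodies for replica rows, since `allocate_replica` does not apply to primaries.

```shell
#!/bin/bash
# Dry-run sketch: read `_cat/shards` output on stdin and print one
# allocate_replica request body per unassigned replica shard.
# build_reroute_bodies is a hypothetical helper; $1 is the target node name.
build_reroute_bodies() {
  local node="$1"
  # Assumed columns: $1=index, $2=shard, $3=prirep (p/r), $4=state
  awk -v node="$node" '$4 == "UNASSIGNED" && $3 == "r" {
    printf "{\"commands\":[{\"allocate_replica\":{\"index\":\"%s\",\"shard\":%s,\"node\":\"%s\"}}]}\n", $1, $2, node
  }'
}
```

For example, `curl -s 'localhost:9200/_cat/shards' | build_reroute_bodies node-name` prints the bodies without changing the cluster; each line can then be passed to `curl -d` once it looks right.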
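2. reindex API
Used to rebuild an index or to migrate data between clusters.
Reindexing from a remote cluster requires whitelisting the remote hosts in elasticsearch.yml on the destination cluster:

    reindex.remote.whitelist: "otherhost:9200, another:9200, 127.0.10.*:9200, localhost:*"

Examples:

    # Reindex within the same cluster; "size" under "source" is the batch size
    curl -XPOST 'http://localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d '{"source": {"size": 10000, "index": "index-name"}, "dest": {"index": "index-name-bak"}}' &

    # Reindex from a remote cluster
    curl -XPOST 'http://localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d '{"source": {"size": 5000, "remote": {"host": "http://remote-cluster-ip:9200"}, "index": "index-name"}, "dest": {"index": "index-name-bak"}}' &
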
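Settings that speed up reindexing:
Temporarily disable refresh on the destination index. refresh_interval is an index-level setting, so it is set through the index _settings endpoint rather than _cluster/settings:

    curl -XPUT 'http://localhost:9200/index-name-bak/_settings' -H 'Content-Type: application/json' -d '{"index": {"refresh_interval": "-1"}}'

Disable replicas on the destination index:

    curl -XPUT 'localhost:9200/index-name-bak/_settings?pretty' -H 'Content-Type: application/json' -d '{"index": {"number_of_replicas": 0}}'
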
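Refresh and replica settings that were relaxed for a reindex should be put back once it finishes. The sketch below builds the two index settings bodies; `reindex_settings_body` is a hypothetical helper, and the restore defaults of `1s` and one replica are assumptions that should be replaced with the values the index actually had.

```shell
#!/bin/bash
# Sketch: print the index settings body to apply before or after a reindex.
# reindex_settings_body is a hypothetical helper name.
#   before                      -> disable refresh, drop replicas
#   after [interval] [replicas] -> restore (defaults 1s and 1 are assumptions)
reindex_settings_body() {
  case "$1" in
    before)
      printf '{"index":{"refresh_interval":"-1","number_of_replicas":0}}\n'
      ;;
    after)
      printf '{"index":{"refresh_interval":"%s","number_of_replicas":%s}}\n' "${2:-1s}" "${3:-1}"
      ;;
    *)
      echo "usage: reindex_settings_body before|after [interval] [replicas]" >&2
      return 1
      ;;
  esac
}
```

A possible use is `curl -XPUT 'localhost:9200/index-name-bak/_settings' -H 'Content-Type: application/json' -d "$(reindex_settings_body after 1s 1)"` after the reindex task has completed.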
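3. ES node maintenance guide
Option 2 is generally the one to use.
- Option 1
```bash
# 1. Change the configuration or carry out the maintenance
# 2. Disable shard allocation
curl -XPUT 'http://10.116.106.35:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.enable": "none"}}'
#    Optionally run a synced flush to speed up recovery
curl -XPOST 'http://10.116.106.35:9200/_flush/synced'
# 3. Restart the node
# 4. Re-enable shard allocation
curl -XPUT 'http://10.116.106.35:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.enable": "all"}}'
# 5. Once the cluster is back to green, repeat steps 2-4 for the next node
```
- Option 2
```bash
# 1. Apply a dynamic exclude setting to move data off the node to be maintained
curl -XPUT 'http://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.exclude._ip": "node-ip"}}'
#    Note: exclude supports _ip and _name, and wildcards are allowed
# 2. Wait for the data migration to finish
# 3. Carry out the maintenance, then restart the node
# 4. Clear the exclusion so that data is rebalanced onto the node
curl -XPUT 'http://localhost:9200/_cluster/settings' -H 'Content-Type: application/json' -d '{"transient": {"cluster.routing.allocation.exclude._ip": ""}}'
```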
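The "wait for the data migration to finish" step can be checked by counting how many shards still sit on the excluded node. This is a rough sketch under assumptions: `shards_on_node` and `wait_for_drain` are hypothetical helper names, the 30-second poll interval is arbitrary, and the `_cat/shards` request asks for the columns index, shard, prirep, state, ip in that order.

```shell
#!/bin/bash
# Sketch: count shards remaining on a node, given `_cat/shards` output
# requested with h=index,shard,prirep,state,ip (ip is the 5th column).
shards_on_node() {
  local ip="$1"
  awk -v ip="$ip" '$5 == ip { n++ } END { print n + 0 }'
}

# Poll until no shard is left on the node. $1 = cluster URL, $2 = node IP.
wait_for_drain() {
  local host="$1" ip="$2" out left
  while :; do
    # Stop on request failure rather than treating empty output as "drained"
    out=$(curl -sf "$host/_cat/shards?h=index,shard,prirep,state,ip") || {
      echo "request to $host failed" >&2
      return 1
    }
    left=$(printf '%s\n' "$out" | shards_on_node "$ip")
    [ "$left" -eq 0 ] && break
    echo "$left shards still on $ip"
    sleep 30   # poll interval is an arbitrary choice
  done
}
```

For example, `wait_for_drain http://localhost:9200 node-ip` returns once the node holds no shards, after which it should be safe to take the node down for maintenance.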
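4. forcemerge
forcemerge plays an important role in optimizing the cluster and keeping it stable over the long term.
Usage:
forcemerge parameters:
- max_num_segments: the number of segments to merge down to; 1 means a full merge
- only_expunge_deletes: only merge segments that contain deleted documents
- flush: run a flush once the merge completes

Example:

    curl -XPOST 'localhost:9200/index/_forcemerge?only_expunge_deletes=true'

Note: use with caution on hot indices.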
5. Closing and opening an index

    POST index-name/_close
    POST index-name/_open

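6. Moving shards
Moving shards manually helps in situations such as unbalanced load across nodes.
Example:

    curl -XPOST 'http://localhost:9200/_cluster/reroute' -H 'Content-Type: application/json' -d '{"commands": [{"move": {"index": "logsfim", "shard": 0, "from_node": "page-node1", "to_node": "page-node2"}}]}'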
