1. Cluster Health
1.1 Query the health of the whole cluster
GET _cluster/health
{
  "cluster_name" : "my-application",
  "status" : "yellow",  # green means the cluster is healthy; yellow or red means there is a problem
  "timed_out" : false,  # whether the request timed out
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 70,
  "active_shards" : 70,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 73,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 48.95104895104895  # percentage of shards that are active; 0 means none are available
}
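Cluster health is reported as one of three statuses: green, yellow, or red.
- green: the primary shards and replica shards of every index are active
- yellow: the primary shards of every index are active, but some replica shards are not active and are therefore unavailable
- red: not all primary shards are active, so part of the data in the affected indices is unavailable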
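The three statuses can be checked programmatically once the JSON returned by GET _cluster/health has been parsed. The following is a minimal sketch in Python, not part of Elasticsearch: the function names `describe_status` and `active_percent` are hypothetical, and the percentage formula is an assumption that reproduces the sample response in 1.1 (70 active shards out of 70 + 73), not a statement of how Elasticsearch computes the value internally.

```python
def describe_status(health):
    """Map the 'status' field of a cluster health response to a short explanation."""
    meanings = {
        "green": "all primary and replica shards are active",
        "yellow": "all primary shards are active, some replicas are not",
        "red": "at least one primary shard is not active",
    }
    status = health["status"]
    return status, meanings.get(status, "unknown status")


def active_percent(health):
    """Share of shards that are active, in percent (assumed formula, see the text above)."""
    total = (health["active_shards"]
             + health["initializing_shards"]
             + health["unassigned_shards"])
    return 100.0 * health["active_shards"] / total if total else 100.0


# Values copied from the sample response in 1.1.
sample = {
    "status": "yellow",
    "active_shards": 70,
    "initializing_shards": 0,
    "unassigned_shards": 73,
}

print(describe_status(sample))
print(round(active_percent(sample), 2))
```

With a single node, replicas cannot be placed on a different node than their primaries, which is consistent with the yellow status and the 73 unassigned shards in the sample.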
1.2 Query the health of specific indices in the cluster
GET /_cluster/health/test1,test2
{
  "cluster_name" : "my-application",
  "status" : "yellow",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 5,
  "active_shards" : 5,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 5,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 48.95104895104895
}
2. Cluster State API
2.1 View the cluster state
GET /_cluster/state
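The response contains the cluster name, the total compressed size of the cluster state, and the cluster state itself. The cluster state can be filtered so that only the parts of interest are retrieved, as described in 2.2.
2.2 Query the cluster state (filtered)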
GET /_cluster/state/{metrics}/{indices}
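{metrics} can be a comma-separated list of the following metrics:
| Metric | Description |
|---|---|
| version | Shows the cluster state version |
| master_node | Shows the master node |
| nodes | Shows the nodes part |
| routing_table | Shows the routing table part |
| metadata | Shows the metadata part |
| blocks | Shows the blocks part |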
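For example:
Return only the version, master_node and nodes parts of the cluster state for the indices test1 and test2:
GET /_cluster/state/version,master_node,nodes/test1,test2
Return all cluster state information for the indices test1 and test2:
GET /_cluster/state/_all/test1,test2
Return only the blocks part of the cluster state:
GET /_cluster/state/blocks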
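The path pattern /_cluster/state/{metrics}/{indices} can be assembled mechanically. The sketch below is only an illustration in Python; `build_state_path` is a hypothetical helper, not a client library function, and it assumes that `_all` is used for the metrics slot whenever indices are given without metrics, as in the second example above.

```python
def build_state_path(metrics=None, indices=None):
    """Build the request path for GET /_cluster/state/{metrics}/{indices}."""
    path = "/_cluster/state"
    if metrics or indices:
        # When only indices are given, the metrics slot still has to be filled.
        path += "/" + (",".join(metrics) if metrics else "_all")
    if indices:
        path += "/" + ",".join(indices)
    return path


print(build_state_path(["version", "master_node", "nodes"], ["test1", "test2"]))
print(build_state_path(indices=["test1", "test2"]))
print(build_state_path(["blocks"]))
```

The three calls reproduce the three request paths shown in the examples.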
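3. Cluster Stats API
The Cluster Stats API retrieves statistics from a cluster-wide perspective. It returns basic index metrics (shard counts, store size, memory usage) and information about the nodes that currently form the cluster (number, roles, operating system, JVM versions, memory usage, CPU and installed plugins).
3.1 Query cluster statistics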
GET /_cluster/stats
{
  "_nodes" : {"total" : 1, "successful" : 1, "failed" : 0},
  "cluster_name" : "my-application",
  "cluster_uuid" : "hO96xsN2RcaMUSrp86K6uQ",
  "timestamp" : 1589438307055,
  "status" : "yellow",
  "indices" : {
    "count" : 16,
    "shards" : {
      "total" : 70,
      "primaries" : 70,
      "replication" : 0.0,
      "index" : {
        "shards" : {"min" : 1, "max" : 5, "avg" : 4.375},
        "primaries" : {"min" : 1, "max" : 5, "avg" : 4.375},
        "replication" : {"min" : 0.0, "max" : 0.0, "avg" : 0.0}
      }
    },
    "docs" : {"count" : 13316, "deleted" : 2773},
    "store" : {"size_in_bytes" : 4761981},
    "fielddata" : {"memory_size_in_bytes" : 736, "evictions" : 0},
    "query_cache" : {
      "memory_size_in_bytes" : 0,
      "total_count" : 0,
      "hit_count" : 0,
      "miss_count" : 0,
      "cache_size" : 0,
      "cache_count" : 0,
      "evictions" : 0
    },
    "completion" : {"size_in_bytes" : 0},
    "segments" : {
      "count" : 34,
      "memory_in_bytes" : 212662,
      "terms_memory_in_bytes" : 146210,
      "stored_fields_memory_in_bytes" : 11152,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 18944,
      "points_memory_in_bytes" : 1364,
      "doc_values_memory_in_bytes" : 34992,
      "index_writer_memory_in_bytes" : 0,
      "version_map_memory_in_bytes" : 0,
      "fixed_bit_set_memory_in_bytes" : 0,
      "max_unsafe_auto_id_timestamp" : -1,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {"total" : 1, "data" : 1, "coordinating_only" : 0, "master" : 1, "ingest" : 1},
    "versions" : ["6.6.0"],
    "os" : {
      "available_processors" : 6,
      "allocated_processors" : 6,
      "names" : [{"name" : "Windows 10", "count" : 1}],
      "pretty_names" : [{"pretty_name" : "Windows 10", "count" : 1}],
      "mem" : {"total_in_bytes" : 15243575296, "free_in_bytes" : 7408414720, "used_in_bytes" : 7835160576, "free_percent" : 49, "used_percent" : 51}
    },
    "process" : {
      "cpu" : {"percent" : 0},
      "open_file_descriptors" : {"min" : -1, "max" : -1, "avg" : 0}
    },
    "jvm" : {
      "max_uptime_in_millis" : 16215710,
      "versions" : [{"version" : "1.8.0_231", "vm_name" : "Java HotSpot(TM) 64-Bit Server VM", "vm_version" : "25.231-b11", "vm_vendor" : "Oracle Corporation", "count" : 1}],
      "mem" : {"heap_used_in_bytes" : 314424136, "heap_max_in_bytes" : 709558272},
      "threads" : 73
    },
    "fs" : {"total_in_bytes" : 339303460864, "free_in_bytes" : 306691145728, "available_in_bytes" : 306691145728},
    "plugins" : [{
      "name" : "analysis-ik",
      "version" : "6.6.0",
      "elasticsearch_version" : "6.6.0",
      "java_version" : "1.8",
      "description" : "IK Analyzer for Elasticsearch",
      "classname" : "org.elasticsearch.plugin.analysis.ik.AnalysisIkPlugin",
      "extended_plugins" : [ ],
      "has_native_controller" : false
    }],
    "network_types" : {"transport_types" : {"security4" : 1}, "http_types" : {"security4" : 1}}
  }
}
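The byte counters in the nodes section are easier to read as percentages. The following Python sketch derives three usage figures from values copied out of the sample response above; the helper `percent` and the variable names are hypothetical, and the snippet only illustrates how the fields relate to each other.

```python
def percent(used, total):
    """Return used/total as a percentage, or 0.0 when total is 0."""
    return 100.0 * used / total if total else 0.0


# Values copied from the "nodes" section of the sample response.
nodes = {
    "os": {"mem": {"total_in_bytes": 15243575296, "used_in_bytes": 7835160576}},
    "jvm": {"mem": {"heap_used_in_bytes": 314424136, "heap_max_in_bytes": 709558272}},
    "fs": {"total_in_bytes": 339303460864, "free_in_bytes": 306691145728},
}

os_mem_used = percent(nodes["os"]["mem"]["used_in_bytes"],
                      nodes["os"]["mem"]["total_in_bytes"])
heap_used = percent(nodes["jvm"]["mem"]["heap_used_in_bytes"],
                    nodes["jvm"]["mem"]["heap_max_in_bytes"])
# The response reports free disk space, so used space is total minus free.
fs_used = percent(nodes["fs"]["total_in_bytes"] - nodes["fs"]["free_in_bytes"],
                  nodes["fs"]["total_in_bytes"])

print(f"OS memory used: {os_mem_used:.1f}%")
print(f"JVM heap used:  {heap_used:.1f}%")
print(f"Disk used:      {fs_used:.1f}%")
```

The integer part of the OS memory figure agrees with the used_percent value of 51 reported in the sample response.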
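4. Pending Cluster Tasks API
The API returns a list of cluster-level changes (for example create index, update mapping, allocate shard or fail shard) that have not yet been executed.
4.1 Query the list of pending cluster tasks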
GET /_cluster/pending_tasks
This usually returns an empty list, because cluster-level changes are normally fast. If tasks are queued, however, the output looks like this:
{
  "tasks": [
    {
      "insert_order": 101,
      "priority": "URGENT",
      "source": "create-index [foo_9], cause [api]",
      "time_in_queue_millis": 86,
      "time_in_queue": "86ms"
    },
    {
      "insert_order": 46,
      "priority": "HIGH",
      "source": "shard-started ([foo_2][1], node[tMTocMvQQgGCkj7QDHl3OA], [P], s[INITIALIZING]), reason [after recovery from shard_store]",
      "time_in_queue_millis": 842,
      "time_in_queue": "842ms"
    },
    {
      "insert_order": 45,
      "priority": "HIGH",
      "source": "shard-started ([foo_2][0], node[tMTocMvQQgGCkj7QDHl3OA], [P], s[INITIALIZING]), reason [after recovery from shard_store]",
      "time_in_queue_millis": 858,
      "time_in_queue": "858ms"
    }
  ]
}
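When the list is not empty, the tasks that have waited longest are usually the interesting ones. The sketch below is a small Python illustration that filters and sorts a parsed pending-tasks response by time_in_queue_millis; `slowest_tasks` and its `min_millis` threshold are hypothetical names, and the sample data is abbreviated from the output above.

```python
def slowest_tasks(response, min_millis=0):
    """Return pending tasks that waited at least min_millis, longest wait first."""
    tasks = [task for task in response.get("tasks", [])
             if task["time_in_queue_millis"] >= min_millis]
    return sorted(tasks, key=lambda task: task["time_in_queue_millis"], reverse=True)


# Abbreviated copy of the sample output (the "source" fields are omitted).
sample = {"tasks": [
    {"insert_order": 101, "priority": "URGENT", "time_in_queue_millis": 86},
    {"insert_order": 46, "priority": "HIGH", "time_in_queue_millis": 842},
    {"insert_order": 45, "priority": "HIGH", "time_in_queue_millis": 858},
]}

for task in slowest_tasks(sample, min_millis=500):
    print(task["insert_order"], task["priority"], task["time_in_queue_millis"])
```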
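5. Cluster Reroute
The API allows manual changes to the allocation of individual shards in the cluster. For example, a shard can be moved explicitly from one node to another, an allocation can be cancelled, and an unassigned shard can be assigned explicitly to a specific node.
5.1 Modify cluster routing
POST /_cluster/reroute
{
  "commands" : [
    {"move" : {"index" : "test", "shard" : 0, "from_node" : "node1", "to_node" : "node2"}},
    {"allocate_replica" : {"index" : "test", "shard" : 1, "node" : "node3"}}
  ]
}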
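The request body is a list of command objects, so it can be generated rather than written by hand. The Python sketch below builds the same body as the request above; the helper functions `move_command`, `allocate_replica_command` and `reroute_body` are hypothetical and only wrap the JSON structure shown in this section.

```python
import json


def move_command(index, shard, from_node, to_node):
    """Command that moves a started shard from one node to another."""
    return {"move": {"index": index, "shard": shard,
                     "from_node": from_node, "to_node": to_node}}


def allocate_replica_command(index, shard, node):
    """Command that allocates an unassigned replica shard to a node."""
    return {"allocate_replica": {"index": index, "shard": shard, "node": node}}


def reroute_body(*commands):
    """Serialize the commands into the JSON body for POST /_cluster/reroute."""
    return json.dumps({"commands": list(commands)})


body = reroute_body(
    move_command("test", 0, "node1", "node2"),
    allocate_replica_command("test", 1, "node3"),
)
print(body)
```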
6. Cluster Update Settings API
6.1 View cluster settings
By default, this API call returns only settings that have been explicitly defined.
GET /_cluster/settings
If no cluster settings have been configured, the result is:
{"persistent" : { },"transient" : { }}
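6.2 Update cluster settings
Updates to settings can be persistent, meaning they still apply after a full cluster restart, or transient, meaning they no longer apply after a full cluster restart.
6.2.1 Update cluster settings (persistent)
PUT /_cluster/settings
{"persistent" : {"indices.recovery.max_bytes_per_sec" : "50mb"}}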
Response:
{
  "acknowledged" : true,
  "persistent" : {"indices" : {"recovery" : {"max_bytes_per_sec" : "50mb"}}},
  "transient" : { }
}
6.2.2 Update cluster settings (transient)
PUT /_cluster/settings?flat_settings=true
{"transient" : {"indices.recovery.max_bytes_per_sec" : "20mb"}}
Response:
{
  "acknowledged" : true,
  "persistent" : { },
  "transient" : {"indices.recovery.max_bytes_per_sec" : "20mb"}
}
6.2.3 Reset cluster settings
Query the current cluster settings:
GET /_cluster/settings
{
  "persistent" : {"indices" : {"recovery" : {"max_bytes_per_sec" : "50mb"}}},
  "transient" : {"indices" : {"recovery" : {"max_bytes_per_sec" : "20mb"}}}
}
To reset a setting, assign null to it. Wildcards are supported in the setting name:
PUT /_cluster/settings
{"persistent" : {"indices.recovery.max_bytes_per_sec" : null}}
# Using a wildcard
PUT /_cluster/settings
{"transient" : {"indices.recovery.*" : null}}
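6.3 Precedence of cluster settings
Cluster settings are applied in the following order of precedence, from highest to lowest:
- transient cluster settings
- persistent cluster settings
- settings in the elasticsearch.yml configuration file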
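The precedence order can be expressed as a lookup that tries each source in turn. The Python sketch below is only a model of the rule, not how Elasticsearch resolves settings internally; `flatten` and `effective_setting` are hypothetical helpers, and the `config_file` dictionary stands in for values read from elasticsearch.yml.

```python
def flatten(settings, prefix=""):
    """Turn nested settings into dotted keys, e.g. {'a': {'b': 1}} -> {'a.b': 1}."""
    flat = {}
    for key, value in settings.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "."))
        else:
            flat[name] = value
    return flat


def effective_setting(name, transient, persistent, config_file, default=None):
    """Return the first non-null value in precedence order: transient, persistent, file."""
    for source in (transient, persistent, config_file):
        value = flatten(source).get(name)
        if value is not None:
            return value
    return default


# Settings as returned by GET /_cluster/settings in 6.2.3.
persistent = {"indices": {"recovery": {"max_bytes_per_sec": "50mb"}}}
transient = {"indices": {"recovery": {"max_bytes_per_sec": "20mb"}}}
name = "indices.recovery.max_bytes_per_sec"

print(effective_setting(name, transient, persistent, {}))
# After the transient value has been reset, the persistent one applies.
print(effective_setting(name, {}, persistent, {}))
```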
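6.4 Recommended practice for cluster settings
It is best to set all cluster-wide settings with the cluster settings API and to use the elasticsearch.yml file only for local, node-level configuration. This ensures that the settings are the same on all nodes. If, on the other hand, different settings are accidentally defined on different nodes through the configuration file, the discrepancies are hard to notice.
In short: set cluster-wide settings through the API, and set node-level settings in elasticsearch.yml.
7. Get Cluster Settings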
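7.1 Get cluster settings (explicitly defined settings only)
The result does not include the default cluster settings that ship with Elasticsearch.
GET /_cluster/settings
Response:
{"persistent" : { },"transient" : { }}
7.2 Get cluster settings (including defaults)
GET /_cluster/settings?include_defaults=true
Note: the include_defaults parameter makes the API also return settings that have not been set explicitly. By default, include_defaults is false.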
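8. Nodes Stats
By default, all statistics are returned for every node. The request can also be limited to specific nodes and to specific metrics.
8.1 Available metrics

| Metric | Description |
|---|---|
| indices | Index statistics: size, document count, indexing and deletion times, search times, field cache size, merges and flushes |
| fs | File system information, data path, free disk space, read/write stats |
| http | HTTP connection information |
| jvm | JVM statistics, memory pool information, garbage collection, buffer pools, number of loaded/unloaded classes |
| os | Operating system statistics, load average, memory, swap |
| process | Process statistics, memory consumption, CPU usage, open file descriptors |
| thread_pool | Statistics for each thread pool, including current size, queue and rejected tasks |
| transport | Transport statistics about bytes sent and received in cluster communication |
| breaker | Statistics about the field data circuit breaker |
| discovery | Statistics about node discovery |
| ingest | Statistics about ingest preprocessing |
| adaptive_selection | Statistics about adaptive replica selection |
For the meaning of the fields in each metric object, see the official documentation.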
8.2 Query statistics for all nodes
GET /_nodes/stats
8.3 Query statistics for specific nodes (by nodeId)
GET /_nodes/nodeId1,nodeId2/stats
8.4 Query node statistics (filtered)
# Return only indices
GET /_nodes/stats/indices
# Return only os and process
GET /_nodes/stats/os,process
# Return only process for the node with IP address 10.0.0.1
GET /_nodes/10.0.0.1/stats/process
9. Nodes Info
9.1 Node information details
9.2 Query information for all nodes
GET /_nodes
9.3 Query information for specific nodes
GET /_nodes/nodeId1,nodeId2
# return just process
GET /_nodes/process
# same as above
GET /_nodes/_all/process
# return just jvm and process of only nodeId1 and nodeId2
GET /_nodes/nodeId1,nodeId2/jvm,process
# same as above
GET /_nodes/nodeId1,nodeId2/info/jvm,process
# return all the information of only nodeId1 and nodeId2
GET /_nodes/nodeId1,nodeId2/_all
