I. Index Management APIs
- 1. Open / Close Index API
- 2. Shrink Index API
- 3. Split Index API
- 4. Rollover Index API
- 5. Rollup Index API (X-Pack)
II. ILM: Managing the Index Lifecycle
III. Index Alias

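I. Index Management APIs

1. Open / Close Index API

POST /my_index/_close
POST /my_index/_open

Once an index is closed, its runtime overhead on the cluster drops to almost zero.
The index data is not deleted, but it can no longer be read or searched.
The index can be reopened whenever it is needed.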

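2. Shrink Index API

Shrink Index: shrinks an index to a smaller number of primary shards.
It was introduced in ES 5.x. Typical use cases:
- The index holds relatively little data, and the number of primary shards needs to be reset;
- After an index moves from Hot to Warm, the number of primary shards needs to be reduced.
Shrinking creates a new index with the same configuration as the source index and only lowers the number of primary shards:
- The number of source shards must be a multiple of the number of target shards. If the source shard count is prime, the target can only have 1 shard;
- If the file system supports hard links, the segments are hard-linked into the target index, so the operation is fast.
Once the shrink completes, the source index can be deleted.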
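The factor rule above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Elasticsearch code; the helper name `shrink_targets` is made up for this example.

```python
def shrink_targets(source_shards: int) -> list[int]:
    """Return every primary-shard count that a shrink could target.

    A target is valid when it is smaller than the source count and
    divides it evenly (the factor rule described in the text).
    """
    if source_shards < 1:
        raise ValueError("an index has at least one primary shard")
    return [n for n in range(1, source_shards) if source_shards % n == 0]


print(shrink_targets(4))  # [1, 2]
print(shrink_targets(8))  # [1, 2, 4]
print(shrink_targets(7))  # [1]  (prime source count: only 1 is possible)
```

With 4 source shards, a target of 3 is rejected because 3 does not divide 4.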

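:::tips

Prerequisites for shrinking an index:
- The source index must be read-only:
  - "index.blocks.write": true
- All shards of the source index must be on the same node:
  - Allocate to a node by name: "index.routing.allocation.require._name": "shrink_node_name"
  - Allocate to Hot nodes (the cluster must have exactly one node of this type): "index.routing.allocation.include.box_type": "hot"
- The cluster health status must be Green;
- The target index must not exist;
- If the target index is shrunk to 1 primary shard, the source index must not contain more than 2,147,483,519 documents, because that is the maximum number of documents a single shard can hold;
- There must be enough disk space.
:::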
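As a rough sketch, the checklist above can be expressed as a pre-flight check. The function `shrink_blockers` and the shape of the `source` dictionary are assumptions made for this illustration and are not part of any Elasticsearch client. The disk-space check is left out because it depends on the environment.

```python
MAX_DOCS_PER_SHARD = 2_147_483_519  # limit quoted in the checklist above


def shrink_blockers(source: dict, cluster_health: str,
                    target_exists: bool, target_shards: int) -> list[str]:
    """Return the unmet prerequisites; an empty list means 'ready to shrink'.

    `source` is a hypothetical summary dict, not an Elasticsearch response:
      write_blocked (bool), shard_nodes (list of node names), doc_count (int)
    """
    problems = []
    if not source["write_blocked"]:
        problems.append("source index is not read-only (index.blocks.write)")
    if len(set(source["shard_nodes"])) != 1:
        problems.append("source shards are not all on the same node")
    if cluster_health != "green":
        problems.append("cluster health is not green")
    if target_exists:
        problems.append("target index already exists")
    if target_shards == 1 and source["doc_count"] > MAX_DOCS_PER_SHARD:
        problems.append("too many documents for a single shard")
    return problems


source = {"write_blocked": False,
          "shard_nodes": ["node-1", "node-2", "node-1", "node-2"],
          "doc_count": 1}
for problem in shrink_blockers(source, "green", False, 2):
    print(problem)
```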
```
# Prerequisite for shrinking: all shards of the source index sit on the same node and writes are blocked
PUT /my_source_index/_settings
{
  "settings": {
    "index.routing.allocation.require._name": "shrink_node_name",
    "index.blocks.write": true
  }
}
```
```
# Clear the allocation requirement copied from the source index
# Clear the read-only setting copied from the source index
POST /my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.routing.allocation.require._name": null,
    "index.blocks.write": null
  }
}
```
Example:
```json
# Test on a cluster with only 2 nodes
GET _cat/nodes

# Result:
# 127.0.0.1 29 45 3 dilm - node-2
# 127.0.0.1 12 45 3 dilm * node-1

# Create the source index with 4 primary shards
PUT my_source_index
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 0
  }
}

# Insert data
PUT my_source_index/_doc/1
{
  "key": "value"
}

# Check how the shards are distributed
GET _cat/shards/my_source_index

# Result:
# my_source_index 2 p STARTED 0 0b 127.0.0.1 node-2
# my_source_index 3 p STARTED 0 0b 127.0.0.1 node-1
# my_source_index 1 p STARTED 0 230b 127.0.0.1 node-1
# my_source_index 0 p STARTED 1 3.3kb 127.0.0.1 node-2

# Attempt 1: shrink my_source_index
# The target shard count is 3, which is not a factor of 4, so this fails
POST my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 3,
    "index.codec": "best_compression"
  },
  "aliases": {
    "my_search_indices": {}
  }
}

# Attempt 2: the target shard count is changed to 2
# Fails, because the source index has not been set to read-only
POST my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 2,
    "index.codec": "best_compression"
  },
  "aliases": {
    "my_search_indices": {}
  }
}

# Set my_source_index to read-only
PUT /my_source_index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

# Attempt 3:
# Fails, because all shards of my_source_index must be on the same node

# Delete the source index and recreate it with all shards allocated to node-1
DELETE my_source_index

# Make sure all shards are on node-1
PUT my_source_index
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 0,
    "index.routing.allocation.require._name": "node-1"
  }
}

# Insert data
PUT my_source_index/_doc/1
{
  "key": "value"
}

# Check how the shards are distributed
GET _cat/shards/my_source_index

# Result:
# my_source_index 2 p STARTED 0 230b 127.0.0.1 node-1
# my_source_index 3 p STARTED 0 230b 127.0.0.1 node-1
# my_source_index 1 p STARTED 0 230b 127.0.0.1 node-1
# my_source_index 0 p STARTED 1 3.3kb 127.0.0.1 node-1

# Set the source index to read-only
PUT /my_source_index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

# Final attempt:
# Clear the read-only setting copied from the source index
# Clear the shard allocation requirement copied from the source index
POST my_source_index/_shrink/my_target_index
{
  "settings": {
    "index.number_of_replicas": 0,
    "index.number_of_shards": 2,
    "index.codec": "best_compression",
    "index.blocks.write": null,
    "index.routing.allocation.require._name": null
  },
  "aliases": {
    "my_search_indices": {}
  }
}

# Check the shard count and distribution of the target index
GET _cat/shards/my_target_index

# Result:
# my_target_index 1 p STARTED 0 230b 127.0.0.1 node-1
# my_target_index 0 p STARTED 1 3.4kb 127.0.0.1 node-1

# The target index is readable and accepts new documents
PUT my_target_index/_doc/2
{
  "key": "value2"
}
```

## 3. Split Index API

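- Split Index: splits each primary shard of the source index into two or more primary shards;
:::tips

- Prerequisites for splitting an index:
   - The source index must be read-only;
   - The cluster health status must be Green;
   - The target index must not exist;
   - The number of primary shards in the target index must be a multiple of the number in the source index;
:::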
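The multiple rule can be checked with a one-line predicate. This is a simplified sketch of the rule as stated above; a real cluster also limits the possible targets through the index's `index.number_of_routing_shards` setting. The name `can_split` is hypothetical.

```python
def can_split(source_shards: int, target_shards: int) -> bool:
    """True when the target count is a larger multiple of the source count."""
    return target_shards > source_shards and target_shards % source_shards == 0


print(can_split(4, 10))  # False: 10 is not a multiple of 4
print(can_split(4, 8))   # True
```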
```json
# Create the source index
PUT my_source_index
{
  "settings": {
    "number_of_shards": 4,
    "number_of_replicas": 0
  }
}

PUT my_source_index/_doc/1
{
  "key": "value"
}

GET _cat/shards/my_source_index

# Attempt 1:
# Fails, because the target shard count must be a multiple of the source shard count
POST my_source_index/_split/my_target
{
  "settings": {
    "index.number_of_shards": 10
  }
}

# Attempt 2:
# Fails, because the source index must be read-only
POST my_source_index/_split/my_target
{
  "settings": {
    "index.number_of_shards": 8
  }
}

# Set the source index to read-only
PUT /my_source_index/_settings
{
  "settings": {
    "index.blocks.write": true
  }
}

# Final attempt:
# Clear the read-only setting copied from the source index
POST my_source_index/_split/my_target_index
{
  "settings": {
    "index.number_of_shards": 8,
    "index.number_of_replicas": 0,
    "index.blocks.write": null
  }
}

GET _cat/shards/my_target_index

# The write succeeds
PUT my_target_index/_doc/1
{
  "key": "value"
}
```

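4. Rollover Index API

Rollover Index: similar to how Log4J rotates log files, a new index is created once the current one exceeds a given size or age.
When a set of conditions is met, the Rollover API points an alias at a new index.
- Conditions: index age / maximum number of documents / maximum index size
Use case
- An index has grown too large
Rollover is usually combined with Index Lifecycle Management policies.
- The conditions are checked only when the Rollover API is called. ES does not monitor these indices on its own.
Rollover is also commonly combined with index templates, so that new indices are created automatically once the conditions are met.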
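The condition check can be sketched as follows. This is a toy model, not the Elasticsearch implementation: the limits are plain numbers (days, documents, GB) instead of the API's string values such as "7d" or "5gb", and the function names are invented for this illustration. What it shows is that each condition is evaluated on its own and that one satisfied condition is enough to roll over.

```python
def met_conditions(age_days: float, doc_count: int, size_gb: float,
                   conditions: dict) -> dict:
    """Evaluate each rollover condition; keys mirror the API's condition names."""
    checks = {
        "max_age": lambda limit: age_days >= limit,
        "max_docs": lambda limit: doc_count >= limit,
        "max_size": lambda limit: size_gb >= limit,
    }
    return {name: checks[name](limit) for name, limit in conditions.items()}


def should_roll_over(results: dict) -> bool:
    """A rollover happens as soon as any one condition is met."""
    return any(results.values())


conditions = {"max_age": 7, "max_docs": 1000, "max_size": 5}  # days, docs, GB
results = met_conditions(age_days=2, doc_count=1500, size_gb=0.3,
                         conditions=conditions)
print(results)                    # only max_docs is met
print(should_roll_over(results))  # True
```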

Example:

When the existing index is considered too large or too old, the Rollover API rolls the alias over to a new index.
```json
# Create the index logs-000001 with the alias logs_write
PUT /logs-000001
{
  "aliases": {
    "logs_write": {}
  }
}

# Add > 1000 documents to logs-000001

POST /logs_write/_rollover
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  }
}

# If the index that logs_write points to was created 7 or more days ago, contains 1000 or more documents,
# or has reached 5gb, the logs-000002 index is created and the logs_write alias is updated to point to logs-000002.

# Response
{
  "acknowledged": true,
  "shards_acknowledged": true,
  "old_index": "logs-000001",
  "new_index": "logs-000002",
  "rolled_over": true,
  "dry_run": false,
  "conditions": {
    "[max_age: 7d]": false,
    "[max_docs: 1000]": true
  }
}
```


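- If the name of the existing index ends with - and a number, e.g. logs-000001, the name of the new index follows the same pattern with the number incremented (logs-000002). The number is zero-padded to a length of 6, regardless of the old index name.
- If the old name does not match this pattern, the name of the new index must be specified as follows: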
```json
# POST /<alias>/_rollover/<target-index>
POST /my_alias/_rollover/my_new_index_name
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  }
}
```
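The naming rule for rolled-over indices can be sketched like this. It mirrors the behaviour described above (increment the trailing number, zero-pad to 6 digits), and the helper `next_rollover_name` is a made-up name for illustration only.

```python
import re


def next_rollover_name(current: str) -> str:
    """Derive the next index name from one that ends with '-<number>'."""
    match = re.fullmatch(r"(.*)-(\d+)", current)
    if match is None:
        # No trailing number: the new name has to be given explicitly.
        raise ValueError(f"{current!r} does not end with '-<number>'")
    prefix, number = match.groups()
    return f"{prefix}-{int(number) + 1:06d}"


print(next_rollover_name("logs-000001"))  # logs-000002
print(next_rollover_name("logs-7"))       # logs-000008
```

An index name such as `my_logs` raises `ValueError` here, which corresponds to the case where the target index name must be passed in the request path.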

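Using date math: naming rolled-over indices after the date on which they were rolled over is a useful technique, e.g. logstash-2016.02.03. The Rollover API supports date math, but requires the index name to end with a dash followed by a number, e.g. logstash-2016.02.03-1, which is incremented on every rollover. For example:
```json
# PUT /<logs-{now/d}-1> with URI encoding:
PUT /%3Clogs-%7Bnow%2Fd%7D-1%3E
{
  "aliases": {
    "logs_write": {}
  }
}

PUT logs_write/_doc/1
{
  "message": "a dummy log"
}

POST logs_write/_refresh

# Wait for a day to pass

POST /logs_write/_rollover
{
  "conditions": {
    "max_docs": "1"
  }
}
```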


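- Date math support in index names:
   - Date math format: `<static_name{date_math_expr{date_format|time_zone}}>`
   - Where:
| **Part** | **Description** |
| :---: | :---: |
| static_name | The static text part of the name |
| date_math_expr | A dynamic date math expression that computes the date |
| date_format | The optional format in which the computed date is rendered; defaults to YYYY.MM.dd |
| time_zone | The optional time zone; defaults to utc |

   - A **date math** index name expression must be enclosed in angle brackets, and all special characters must be URI-encoded. For example: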
```json
# GET /<logstash-{now/d}>/_search
GET /%3Clogstash-%7Bnow%2Fd%7D%3E/_search
{
  "query": {
    "match": {
      "test": "data"
    }
  }
}
```

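The special characters used in date math index names must be URI-encoded as follows: `<` as `%3C`, `>` as `%3E`, `/` as `%2F`, `{` as `%7B`, `}` as `%7D`, `|` as `%7C`, `+` as `%2B`, `:` as `%3A`, `,` as `%2C`.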
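The encoding does not need to be done by hand. As a small sketch, Python's standard `urllib.parse.quote` with an empty `safe` set produces the same escapes; the wrapper name `encode_index_expression` is just an illustrative choice.

```python
from urllib.parse import quote


def encode_index_expression(expression: str) -> str:
    """Percent-encode a date math index name for use in a request path."""
    # safe="" makes quote() escape "/" as well, which it keeps by default.
    return quote(expression, safe="")


print(encode_index_expression("<logstash-{now/d}>"))
# %3Clogstash-%7Bnow%2Fd%7D%3E
print(encode_index_expression("<logs-{now/d}-1>"))
# %3Clogs-%7Bnow%2Fd%7D-1%3E
```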
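The following examples show different forms of index name expressions and the final index names they resolve to, given that the current time is noon on 22 March 2024 UTC:

| Expression | Resolves to |
| --- | --- |
| `<logstash-{now/d}>` | logstash-2024.03.22 |
| `<logstash-{now/M}>` | logstash-2024.03.01 |
| `<logstash-{now/M{YYYY.MM}}>` | logstash-2024.03 |
| `<logstash-{now/M-1M{YYYY.MM}}>` | logstash-2024.02 |
| `<logstash-{now/d{YYYY.MM.dd\|+12:00}}>` | logstash-2024.03.23 |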

If the index name itself needs to contain { or }, they must be escaped:

`<elastic\\{ON\\}-{now/M}>` resolves to `elastic{ON}-2024.03.01`

The Rollover API supports a dry_run mode, which checks the request conditions without performing the actual rollover:
```json
PUT /logs-000001
{
  "aliases": {
    "logs_write": {}
  }
}

POST /logs_write/_rollover?dry_run
{
  "conditions": {
    "max_age": "7d",
    "max_docs": 1000,
    "max_size": "5gb"
  }
}
```

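:::tips

- **Roll over a write index**
   - Versions 6.5 and later support is_write_index. It defaults to null, in which case the alias points only to the newest index after a rollover;
   - With is_write_index: true, after a rollover the alias logs sends writes to the newest index, while searches cover all indices created by the rollover.
:::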
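The difference between the write target and the search targets can be pictured with a toy in-memory model. This is not the Elasticsearch client, and the class `RolloverAlias` is invented for this sketch. It only shows which index is the write index after each rollover.

```python
class RolloverAlias:
    """Toy model of an alias that keeps every rolled-over index for searches."""

    def __init__(self, first_index: str):
        self.indices = [first_index]  # oldest first

    @property
    def write_index(self) -> str:
        # Writes always go to the most recently created index.
        return self.indices[-1]

    @property
    def search_indices(self) -> list:
        # Searches fan out to every index behind the alias.
        return list(self.indices)

    def rollover(self, new_index: str) -> None:
        self.indices.append(new_index)

    def describe(self) -> dict:
        return {name: {"is_write_index": name == self.write_index}
                for name in self.indices}


logs = RolloverAlias("my_logs_index-000001")
logs.rollover("my_logs_index-000002")
print(logs.write_index)     # my_logs_index-000002
print(logs.search_indices)  # both indices
print(logs.describe())
```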
```json
PUT my_logs_index-000001
{
  "aliases": {
    "logs": { "is_write_index": true }
  }
}

PUT logs/_doc/1
{
  "message": "a dummy log"
}

POST logs/_refresh

POST /logs/_rollover
{
  "conditions": {
    "max_docs": "1"
  }
}

PUT logs/_doc/2
{
  "message": "a newer log"
}
# doc 2 is indexed into the new index my_logs_index-000002

# Inspect the alias logs
GET /logs
# Partial response
{
  "my_logs_index-000002": {
    "aliases": {
      "logs": { "is_write_index": true }
    }
  },
  "my_logs_index-000001": {
    "aliases": {
      "logs": { "is_write_index": false }
    }
  }
}
```

5. Rollup Index API (X-Pack)

Rollup Index: summarizes the data and writes the result to a new index in order to reduce the data volume.

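II. ILM: Managing the Index Lifecycle

You can configure index lifecycle management (ILM) policies according to your requirements for index performance, resilience, and retention, and ES then manages the indices automatically according to those policies.
For example, you can use index lifecycle management to:
- Create a new index once an index reaches a given size;
- Create a new index every day, week, or month, and archive the previous ones;
- Define data retention rules that delete stale indices.
When index lifecycle management is enabled for Beats or the Logstash Elasticsearch output plugin, ILM is configured automatically. The default policies can be modified through Kibana Management or the ILM APIs.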
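To make the goals above concrete, here is a minimal sketch of a policy body that rolls over on size or age and deletes old indices. The policy name `my_logs_policy` and the thresholds (50gb, 30d, 90d) are placeholder values chosen for this example, not recommendations. The script only builds and prints the request and does not contact a cluster.

```python
import json

# Hypothetical policy: roll over in the hot phase, delete after 90 days.
policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    "rollover": {"max_size": "50gb", "max_age": "30d"}
                }
            },
            "delete": {
                "min_age": "90d",
                "actions": {"delete": {}}
            }
        }
    }
}

print("PUT _ilm/policy/my_logs_policy")
print(json.dumps(policy, indent=2))
```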
See "ILM: Manage the Index Lifecycle".

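III. Index Alias

Index aliases work somewhat like a name mapping: one alias can map to multiple concrete indices, and an alias can also be defined with a filter, which gives different views of the same index.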
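A short sketch of both ideas, building the body of a `POST /_aliases` request: one alias spanning two indices, and a second alias that exposes a filtered view of one index. The index names, alias names, and the `level` field are assumptions made for this illustration, and the helper `add_alias_action` is hypothetical.

```python
import json
from typing import Optional


def add_alias_action(index: str, alias: str,
                     filter_query: Optional[dict] = None) -> dict:
    """Build one 'add' action for the aliases API body."""
    action = {"index": index, "alias": alias}
    if filter_query is not None:
        action["filter"] = filter_query  # restricts what the alias can see
    return {"add": action}


body = {
    "actions": [
        add_alias_action("logs-000001", "logs_all"),
        add_alias_action("logs-000002", "logs_all"),
        add_alias_action("logs-000002", "logs_errors",
                         {"term": {"level": "error"}}),
    ]
}

print("POST /_aliases")
print(json.dumps(body, indent=2))
```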
See "Index Alias in Detail".
