1: Define the ingest pipeline (_ingest/pipeline)

PUT _ingest/pipeline/test_filter
{
  "processors": [
    {
      "gsub": {
        "field": "name",
        "pattern": """(^(lesson)\s*\d+$)|(^(Module)\s*\d+$)""",
        "replacement": ""
      }
    }
  ]
}
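
To make the effect of the gsub pattern easier to see, here is a small Python sketch that applies the intended regular expression (whitespace `\s`, digits `\d`) to a few sample names. It only approximates the processor: Elasticsearch evaluates the pattern with Java's regex engine, and the helper `clean_name` and the sample values are hypothetical, not part of the original setup.

```python
import re

# Intended pattern of the gsub processor, written as a Python raw string.
# Assumption: Python's `re` behaves like Java's regex engine for this simple pattern.
PATTERN = re.compile(r"(^(lesson)\s*\d+$)|(^(Module)\s*\d+$)")


def clean_name(name: str) -> str:
    """Mimic gsub with an empty replacement: remove every match of PATTERN."""
    return PATTERN.sub("", name)


# Hypothetical sample values for the `name` field.
samples = ["lesson 12", "Module3", "lesson 12 intro", "module 3"]
for name in samples:
    print(repr(name), "->", repr(clean_name(name)))
```

Because the pattern is anchored with `^` and `$` and is case-sensitive, only names that consist entirely of "lesson" or "Module" followed by a number are emptied; names such as "lesson 12 intro" or "module 3" pass through unchanged.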

2: Define the ES mapping

PUT test
{
  "aliases": {
    "test_a": {}
  },
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "keyword"
      }
    }
  },
  "settings": {
    "index.default_pipeline": "test_filter", # the default pipeline is specified here
    "index": {
      "number_of_shards": "2",
      "max_result_window": "100000000",
      "analysis": {
        "analyzer": {
          "ik": {
            "tokenizer": "ik_max_word"
          }
        }
      },
      "number_of_replicas": "1"
    }
  }
}

3: Process the existing data by reindexing

POST /_reindex
{
  "source": {
    "index": "test"
  },
  "dest": {
    "index": "test2",
    "pipeline": "test_filter"
  }
}

For other usages, see the ingest pipelines section of the official documentation.