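If you need to retrieve a large result set in one request, say 100,000 documents, performance will be very poor. The usual approach is a scroll search, which fetches the data batch by batch until everything has been retrieved and processed.
    A scroll search returns one batch of results, then the next batch on the following request, and so on until all matching documents have been returned.
    On the first request, scroll saves a snapshot of the index as it was at that moment. Every later request is served from that snapshot, so changes made to the data in the meantime are not visible to the caller.
    Sorting by _doc gives the best performance.
    Every scroll request must also carry a scroll parameter that specifies a time window. scroll=10m tells Elasticsearch how long to keep the "search context" alive. Each scroll request only has to complete within this window, so the window needs to cover one batch, not the whole data set.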

    GET /test_index/_search?scroll=10m
    {
      "query": {
        "match_all": {}
      },
      "sort": [ "_doc" ],
      "size": 3
    }

    Response:

    {
      "_scroll_id": "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3",
      "took": 5,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 10,
        "max_score": null,
        "hits": [
          {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "8",
            "_score": null,
            "_source": {
              "test_field": "test client 2"
            },
            "sort": [
              0
            ]
          },
          {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "6",
            "_score": null,
            "_source": {
              "test_field": "tes test"
            },
            "sort": [
              0
            ]
          },
          {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "AVp4RN0bhjxldOOnBxaE",
            "_score": null,
            "_source": {
              "test_content": "my test"
            },
            "sort": [
              0
            ]
          }
        ]
      }
    }
    

    The response contains a _scroll_id. The next scroll request must send this _scroll_id back:

    GET /_search/scroll
    {
        "scroll": "10m", 
        "scroll_id" : "DnF1ZXJ5VGhlbkZldGNoBQAAAAAAACxeFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYBY0b25zVFlWWlRqR3ZJajlfc3BXejJ3AAAAAAAALF8WNG9uc1RZVlpUakd2SWo5X3NwV3oydwAAAAAAACxhFjRvbnNUWVZaVGpHdklqOV9zcFd6MncAAAAAAAAsYhY0b25zVFlWWlRqR3ZJajlfc3BXejJ3"
    }
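
The two requests above can be chained into a loop that keeps asking for the next batch until an empty batch comes back. The following is a minimal sketch, not an official client: `send` is a hypothetical transport function (method, path, body) that returns the parsed JSON response, and `scroll_all` is an illustrative name. The index name, batch size and keep-alive value are the ones used in the example requests.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical transport: (method, path, body) -> parsed JSON response.
# Plug in whatever HTTP client you actually use.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def scroll_all(
    send: Send,
    index: str,
    batch_size: int = 3,
    keep_alive: str = "10m",
) -> Iterator[Dict[str, Any]]:
    """Yield every hit in `index`, fetching one batch per request."""
    # The first request opens the search context and returns the first batch.
    response = send(
        "GET",
        f"/{index}/_search?scroll={keep_alive}",
        {"query": {"match_all": {}}, "sort": ["_doc"], "size": batch_size},
    )
    while True:
        hits = response["hits"]["hits"]
        if not hits:
            # An empty batch means every document has been returned.
            break
        yield from hits
        # Always send back the most recent _scroll_id; it can change
        # from one response to the next.
        response = send(
            "GET",
            "/_search/scroll",
            {"scroll": keep_alive, "scroll_id": response["_scroll_id"]},
        )
```

Because `scroll_all` is a generator, the caller processes each hit as it arrives, and the next batch is only requested once the current one has been consumed.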
    

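    Scroll looks a lot like pagination, but the use cases differ. Pagination serves results page by page for a user to look at; scroll retrieves data batch by batch for a system to process.
    If a scroll request arrives after the keep-alive time has passed, Elasticsearch returns an error: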

    {
      "error" : {
        "root_cause" : [
          {
            "type" : "search_context_missing_exception",
            "reason" : "No search context found for id [3386]"
          }
        ],
        "type" : "search_phase_execution_exception",
        "reason" : "all shards failed",
        "phase" : "query",
        "grouped" : true,
        "failed_shards" : [
          {
            "shard" : -1,
            "index" : null,
            "reason" : {
              "type" : "search_context_missing_exception",
              "reason" : "No search context found for id [3386]"
            }
          }
        ],
        "caused_by" : {
          "type" : "search_context_missing_exception",
          "reason" : "No search context found for id [3386]"
        }
      },
      "status" : 404
    }
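
A client can recognise this failure from the HTTP status and the error type. The following is a minimal sketch, assuming the response body has already been parsed into a dict shaped like the error above; `scroll_context_expired` is an illustrative name, not part of any client library.

```python
from typing import Any, Dict


def scroll_context_expired(status: int, body: Dict[str, Any]) -> bool:
    """Return True if the response says the scroll's search context is gone."""
    if status != 404:
        return False
    error = body.get("error")
    if not isinstance(error, dict):
        # Some error responses carry a plain string instead of an object.
        return False
    causes = error.get("root_cause", [])
    return any(
        cause.get("type") == "search_context_missing_exception"
        for cause in causes
    )
```

When this returns True, the old _scroll_id cannot be resumed, so the job has to start again with a new initial search request.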
    

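    Deleting a scroll (this releases its search context as soon as you are done, instead of waiting for the keep-alive time to run out):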

    DELETE /_search/scroll
    {
      "scroll_id":"FGluY2x1ZGVfY29udGV4dF91dWlkDXF1ZXJ5QW5kRmV0Y2gBFGtoNk9tSE1CdmpuV2NWMzhtRDRkAAAAAAAADOcWbHB6bWlIZ1dUT21yYnczRVpmb0FDUQ=="
    }
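
In a batch job it is easy to forget this request when processing fails halfway. One way to make the cleanup reliable is a try/finally block. The sketch below reuses the same assumption of a hypothetical `send(method, path, body)` transport; `clear_scroll` and `consume_and_clear` are illustrative names.

```python
from typing import Any, Callable, Dict, Iterable

# Hypothetical transport: (method, path, body) -> parsed JSON response.
Send = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def clear_scroll(send: Send, scroll_id: str) -> Dict[str, Any]:
    """Issue the DELETE /_search/scroll request for one scroll id."""
    return send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})


def consume_and_clear(
    send: Send,
    scroll_id: str,
    hits: Iterable[Dict[str, Any]],
    handle: Callable[[Dict[str, Any]], None],
) -> None:
    """Process hits, then delete the scroll even if `handle` raises."""
    try:
        for hit in hits:
            handle(hit)
    finally:
        # Runs on success and on failure, so the context is always released.
        clear_scroll(send, scroll_id)
```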