Elasticsearch compared with MySQL: an index corresponds to a table, each document corresponds to a row, and each column name corresponds to a field.

Common query DSL and the corresponding RestClient code:

    // For tests: create the client before each test and close it afterwards
    private RestHighLevelClient client;

    @BeforeEach
    void setUp() {
        this.client = new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://121.36.164.132:9200")
        ));
    }

    @AfterEach
    void tearDown() throws IOException {
        this.client.close();
    }

    // Alternatively, register the client as a Bean in the application's startup class
    @Bean
    public RestHighLevelClient client() {
        return new RestHighLevelClient(RestClient.builder(
                HttpHost.create("http://121.36.164.132:9200")
        ));
    }

Basic queries:

Query all: match_all, a query with no conditions.
Corresponding Java code:

    @Test
    void testMatchAll() throws IOException {
        // 1. Create the request
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the query conditions
        request.source().query(QueryBuilders.matchAllQuery());
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the response
        handleResponse(response);
    }

Single-field query: match, which searches exactly one field.

    @Test
    void testMatch() throws IOException {
        // 1. Create the request
        SearchRequest request = new SearchRequest("hotel");
        // 2. Match "外滩" against the "all" field
        request.source().query(QueryBuilders.matchQuery("all", "外滩"));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the response
        handleResponse(response);
    }

Multi-field query: multi_match

    @Test
    void testMultiMatch() throws IOException {
        // 1. Create the request
        SearchRequest request = new SearchRequest("hotel");
        // 2. Match "外滩如家" against both the "name" and "business" fields
        request.source().query(QueryBuilders.multiMatchQuery(
                "外滩如家", "name", "business"
        ));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the response
        handleResponse(response);
    }

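Exact queries:

A term query matches the search value exactly and does not analyze it, so the target field should be a non-analyzed type such as keyword (numeric, date, and boolean fields also work).
A range query uses the operators gte (greater than or equal), gt (greater than), lte (less than or equal), and lt (less than).

    @Test
    void testJingZhun() throws IOException {
        // 1. Create the request
        SearchRequest request = new SearchRequest("hotel");
        // 2. Build the query conditions
        // term: exact match on the city field
        // request.source().query(QueryBuilders.termQuery
        //        ("city","上海"));
        // range: 1000 <= price <= 2000
        request.source().query(QueryBuilders.rangeQuery("price")
                .gte(1000).lte(2000));
        // 3. Send the request
        SearchResponse response = client.search(request, RequestOptions.DEFAULT);
        // 4. Parse the response
        handleResponse(response);
    }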

Geo queries:

// geo_bounding_box query
GET /indexName/_search
{
  "query": {
    "geo_bounding_box": {
      "FIELD": {
        "top_left": { // top-left corner
          "lat": 31.1,
          "lon": 121.5
        },
        "bottom_right": { // bottom-right corner
          "lat": 30.9,
          "lon": 121.7
        }
      }
    }
  }
}
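
To make the matching rule concrete, here is a minimal Java sketch of the containment test that a geo_bounding_box query expresses. It is only an illustration, not how Elasticsearch implements it: the class and method names are hypothetical, and it assumes the box does not cross the 180° meridian.

```java
// Hypothetical helper: illustrates the rule only, not Elasticsearch internals.
class BoundingBoxSketch {

    /**
     * Returns true when the point (lat, lon) lies inside the rectangle
     * defined by its top-left and bottom-right corners.
     * Assumption: the box does not cross the 180° meridian.
     */
    static boolean contains(double topLat, double leftLon,
                            double bottomLat, double rightLon,
                            double lat, double lon) {
        boolean latInside = lat <= topLat && lat >= bottomLat;
        boolean lonInside = lon >= leftLon && lon <= rightLon;
        return latInside && lonInside;
    }
}
```

With the corners from the DSL above, a document located at (31.0, 121.6) would be inside the box, while one at (31.2, 121.6) would not.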

Nearby search: geo_distance searches within a circle centered on a point. This is the commonly used one.

            // Sort by distance from the user's location (a "lat,lon" string)
            String location = params.getLocation();
            if (location != null && !location.isEmpty()) {
                request.source().sort(SortBuilders
                        .geoDistanceSort("location", new GeoPoint(location))
                        .order(SortOrder.ASC)
                        .unit(DistanceUnit.KILOMETERS)
                );
            }
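
As a rough illustration of what "within a circle" and "sorted by distance" mean, the sketch below computes great-circle distance with the haversine formula. Elasticsearch does this internally; the formula, the Earth radius constant, and all names here are assumptions of this sketch, not the library's code.

```java
// Hypothetical helper: approximates great-circle distance for illustration.
class GeoDistanceSketch {

    // Mean Earth radius in kilometers (assumed value for this sketch).
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Haversine distance in kilometers between two (lat, lon) points in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** The geo_distance idea: keep a point only if it is within radiusKm of the center. */
    static boolean withinRadius(double centerLat, double centerLon, double radiusKm,
                                double lat, double lon) {
        return haversineKm(centerLat, centerLon, lat, lon) <= radiusKm;
    }
}
```

Sorting by distance ascending, as the geoDistanceSort call does, then amounts to ordering documents by this value from nearest to farthest.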

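Compound queries (function_score / bool):

function_score: the condition inside each function can only be a filter.

  • Original query: the query part. Documents are searched with this condition and scored with the BM25 algorithm, which gives the original score (query score).
  • Filter: the filter part. Only documents that match it are rescored.
  • Score function: documents that match the filter are run through this function to produce the function score. Four commonly used functions:
    • weight: the function result is a constant
    • field_value_factor: uses the value of a field in the document as the function result
    • random_score: uses a random number as the function result
    • script_score: a custom scoring script
  • Boost mode: how the function score and the original relevance score are combined, including:
    • multiply: multiply the two
    • replace: replace the query score with the function score
    • others, such as sum, avg, max, min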

# function_score: score function
GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {
        "term": {
          "name": {
            "value": "如家"
          }
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "city": "上海"
            }
          },
          "weight": 10
        }
      ],
      "boost_mode": "multiply"
    }
  }
}
// function_score in Java
            FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
                    // Original query, the one used for relevance scoring
                    boolQuery,
                    // Array of function score entries
                    new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
                            // One function score entry
                            new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                                    // Filter condition
                                    QueryBuilders.termQuery("isAD", true),
                                    // Score function
                                    ScoreFunctionBuilders.weightFactorFunction(10)
                            )
                    }
            );
// boolQuery already has the other conditions added to it before this point.

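function_score execution flow:
1. Search documents with the original query and compute a relevance score for each; this is the original score (query score).
2. Check each document against the filter condition.
3. For documents that match the filter, apply the score function to get the function score.
4. Combine the original score and the function score according to the boost mode to get the final score; documents with a higher final score rank first.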
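
The flow above can be sketched in plain Java. This is a simplified model for a single weight function, not Elasticsearch's implementation: the class, enum, and method names are hypothetical, and documents that do not match the filter simply keep their query score here.

```java
// Hypothetical model of the function_score flow with one weight function.
class FunctionScoreSketch {

    enum BoostMode { MULTIPLY, REPLACE, SUM, AVG, MAX, MIN }

    /** Step 4: combine the query score and the function score. */
    static double combine(double queryScore, double functionScore, BoostMode mode) {
        switch (mode) {
            case MULTIPLY: return queryScore * functionScore;
            case REPLACE:  return functionScore;
            case SUM:      return queryScore + functionScore;
            case AVG:      return (queryScore + functionScore) / 2.0;
            case MAX:      return Math.max(queryScore, functionScore);
            case MIN:      return Math.min(queryScore, functionScore);
            default:       throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    /**
     * Steps 2 to 4 for one document.
     * Simplification of this sketch: a document that does not match the
     * filter is not rescored and keeps its query score.
     */
    static double finalScore(double queryScore, boolean matchesFilter,
                             double weight, BoostMode mode) {
        if (!matchesFilter) {
            return queryScore;
        }
        double functionScore = weight; // the weight function returns a constant
        return combine(queryScore, functionScore, mode);
    }
}
```

For example, with a query score of 2.0, a weight of 10, and boost_mode multiply, a document matching the filter ends up with 20.0, while a non-matching document stays at 2.0 and therefore ranks lower.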

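Bool query: combines multiple conditions

Clause types:

  • must: every subquery must match, like AND
  • should: subqueries match optionally, like OR
  • must_not: must not match; does not contribute to the score, like NOT
  • filter: must match; does not contribute to the score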
    GET /hotel/_search
    {
    "query": {
      "bool": {
        "must": [
          {"term": {"city": "上海" }}
        ],
        "should": [
          {"term": {"brand": "皇冠假日" }},
          {"term": {"brand": "华美达" }}
        ],
        "must_not": [
          { "range": { "price": { "lte": 500 } }}
        ],
        "filter": [
          { "range": {"score": { "gte": 45 } }}
        ]
      }
    }
    }
    
    ```java
    @Test
    void testBool() throws IOException {
      // 1. Create the request
      SearchRequest request = new SearchRequest("hotel");
      // 2. Build the bool query (these conditions differ from the DSL example above)
      request.source().query(QueryBuilders.boolQuery().must(
                      QueryBuilders.termQuery("name", "如家")
              ).mustNot(
                      QueryBuilders.termQuery("city", "上海")
              ).should(
                      QueryBuilders.termQuery("brand", "皇冠假日")
              ).should(
                      QueryBuilders.termQuery("brand", "华美达")
              )
      );
      // 3. Send the request and parse the response
      SearchResponse response = client.search(request, RequestOptions.DEFAULT);
      handleResponse(response);
    }
    ```
<a name="UHEZf"></a>
### Sorting / Pagination
<a name="MvpS0"></a>
#### Sorting
desc: descending, asc: ascending<br />![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652060923032-d2731b67-40da-4dc3-99c5-1526e80ec617.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=228&id=u59d52356&margin=%5Bobject%20Object%5D&name=image.png&originHeight=313&originWidth=837&originalType=binary&ratio=1&rotation=0&showTitle=false&size=24796&status=done&style=none&taskId=u6e2f33cb-d031-4d4c-adb0-702eb1086ff&title=&width=608.7272727272727)<br />Sorting by geo coordinates<br />![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061014227-d410c104-1bf5-47ed-a72a-57460b7ff5df.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=321&id=u36c91781&margin=%5Bobject%20Object%5D&name=image.png&originHeight=441&originWidth=970&originalType=binary&ratio=1&rotation=0&showTitle=false&size=74646&status=done&style=none&taskId=u40edd744-a990-43f1-9199-d954a3d5533&title=&width=705.4545454545455)
<a name="yavAk"></a>
#### Pagination
![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061038258-fdd1e6bf-40b6-4a6e-be92-fabc49afccca.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=228&id=u42c1f48c&margin=%5Bobject%20Object%5D&name=image.png&originHeight=314&originWidth=688&originalType=binary&ratio=1&rotation=0&showTitle=false&size=26079&status=done&style=none&taskId=u0ac8bdae-aed0-46f7-9c78-ef72ed32ee1&title=&width=500.3636363636364)<br />Deep pagination (for awareness):<br />Common pagination approaches and their trade-offs:

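- from + size:
   - Pros: supports jumping to an arbitrary page
   - Cons: deep pagination problem; the default limit on from + size is 10000 (the index.max_result_window setting)
   - Use case: searches with random page access, such as Baidu, JD.com, Google, and Taobao
- search_after:
   - Pros: no overall limit (the size of a single query must not exceed 10000)
   - Cons: can only page forward one page at a time; no random page access
   - Use case: searches that do not need random page access, such as scrolling down on a phone
- scroll:
   - Pros: no overall limit (the size of a single query must not exceed 10000)
   - Cons: extra memory consumption, and the results are not real-time
   - Use case: fetching and migrating large volumes of data. No longer recommended since ES 7.1; use search_after instead.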
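
For the from + size approach, the arithmetic is worth spelling out. The sketch below is a minimal, hypothetical helper (the names are not from any library) that converts a 1-based page number into `from` and checks it against the default 10000 window.

```java
// Hypothetical helper for from + size pagination arithmetic.
class PagingSketch {

    // Default upper bound on from + size (assumed to be left at its default).
    static final int MAX_RESULT_WINDOW = 10_000;

    /** Offset of the first hit for a 1-based page number. */
    static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        return (page - 1) * size;
    }

    /** True when the requested page still fits inside the result window. */
    static boolean withinWindow(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long lastHit = (long) (page - 1) * size + size; // long avoids int overflow
        return lastHit <= MAX_RESULT_WINDOW;
    }
}
```

With RestHighLevelClient, the two values would then be passed as `request.source().from(from).size(size)`. With size 10, page 1000 is the last page that fits; page 1001 would need search_after or a different approach.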
<a name="d2Or0"></a>
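### Highlighting

- Highlighting marks keywords, so **the search condition must contain a keyword**; it cannot be something like a range query.
- By default, **the highlighted field must be the same as the field being searched**, otherwise nothing is highlighted.
- To highlight a field other than the searched one, add the property require_field_match=false.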

![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061135052-fa4a1e48-c079-400d-a4e7-9e8c7b31f562.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=332&id=u5d3f8018&margin=%5Bobject%20Object%5D&name=image.png&originHeight=457&originWidth=1027&originalType=binary&ratio=1&rotation=0&showTitle=false&size=42186&status=done&style=none&taskId=u831d9f56-882a-4f4d-b9bb-856fd112d9d&title=&width=746.9090909090909)
<a name="PZMti"></a>
### Handling the response in code: parse it layer by layer
Each level of the DSL response maps one-to-one to the code<br />![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061279437-296572ea-deb7-498d-a251-eeb394bfc8e7.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=455&id=ud0161a42&margin=%5Bobject%20Object%5D&name=image.png&originHeight=625&originWidth=440&originalType=binary&ratio=1&rotation=0&showTitle=false&size=43373&status=done&style=none&taskId=u909160fe-82b7-47e9-8c48-110e2510a2b&title=&width=320)
```java
private PageResult handleResponse(SearchResponse response) {
    // Parse the result
    SearchHits searchHits = response.getHits();
    // Total number of hits
    long total = searchHits.getTotalHits().value;
    // Array of hits
    SearchHit[] hits = searchHits.getHits();
    List<HotelDoc> hotelDocList = new ArrayList<>();
    for (SearchHit hit : hits) {
        // Hotel document as a JSON string
        String hitSourceAsString = hit.getSourceAsString();
        // Deserialize into a HotelDoc
        HotelDoc hotelDoc = JSON.parseObject(hitSourceAsString, HotelDoc.class);
        // Distance: the first sort value, present when sorting by geo distance
        Object[] sortValues = hit.getSortValues();
        if (sortValues != null && sortValues.length > 0) {
            hotelDoc.setDistance(sortValues[0]);
        }
        hotelDocList.add(hotelDoc);
    }
    return new PageResult(total, hotelDocList);
}
```

Highlight DSL:
highlight sits at the same level as _source in each hit.

    // Parse the highlight result
    Map<String, HighlightField> highMap = hit.getHighlightFields();
    if (!CollectionUtils.isEmpty(highMap)) {
        HighlightField name = highMap.get("name");
        if (name != null) {
            // Replace the plain name with the first highlighted fragment
            hotelDoc.setName(name.getFragments()[0].string());
        }
    }