Elasticsearch compared with MySQL: an index corresponds to a table, each document to a row, and each field to a column.
Common query DSL and the corresponding RestClient code:
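// Imports used by the two snippets below
import java.io.IOException;
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.junit.jupiter.api.AfterEach;
import org.junit.jupiter.api.BeforeEach;
import org.springframework.context.annotation.Bean;

// Option 1: create and close the client inside a test class
private RestHighLevelClient client;

@BeforeEach
void setUp() {
    this.client = new RestHighLevelClient(RestClient.builder(
            HttpHost.create("http://121.36.164.132:9200")
    ));
}

@AfterEach
void tearDown() throws IOException {
    this.client.close();
}

// Option 2: register the client as a Bean in the application's startup class
@Bean
public RestHighLevelClient client() {
    return new RestHighLevelClient(RestClient.builder(
            HttpHost.create("http://121.36.164.132:9200")
    ));
}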
Basic queries:
Match all: match_all, a query with no conditions.
Corresponding Java code:
@Test
void testMatchAll() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Build the query
    request.source().query(QueryBuilders.matchAllQuery());
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response
    handleResponse(response);
}
Single-field query: match, which searches a single field.
@Test
void testMatch() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Build the query: search the "all" field for "外滩"
    request.source().query(QueryBuilders.matchQuery("all", "外滩"));
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response
    handleResponse(response);
}
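Multi-field query: multi_match, which searches several fields at once.
@Test
void testMultiMatch() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Build the query: search "name" and "business" for "外滩如家"
    request.source().query(QueryBuilders.multiMatchQuery(
            "外滩如家", "name", "business"
    ));
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response
    handleResponse(response);
}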
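Exact queries:
A term query should target a keyword field (or another non-analyzed type), because the search value is matched exactly and is not analyzed.
A range query uses these operators: gte (greater than or equal), gt (greater than), lte (less than or equal), lt (less than).
@Test
void testJingZhun() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Build the query
    // term: exact match on a keyword field
    // request.source().query(QueryBuilders.termQuery("city", "上海"));
    // range: 1000 <= price <= 2000
    request.source().query(QueryBuilders.rangeQuery("price")
            .gte(1000).lte(2000));
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response
    handleResponse(response);
}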
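To make the four bound operators concrete, here is a minimal plain-Java sketch of the comparison each one implies. `RangeSketch` is a hypothetical illustration class, not part of the Elasticsearch client; its method names merely mirror the builder calls used above.

```java
/** Hypothetical stand-in for the bounds of a range query; not an Elasticsearch class. */
final class RangeSketch {
    private Double lower;            // null means "no lower bound"
    private boolean lowerInclusive;
    private Double upper;            // null means "no upper bound"
    private boolean upperInclusive;

    RangeSketch gte(double v) { lower = v; lowerInclusive = true;  return this; }
    RangeSketch gt(double v)  { lower = v; lowerInclusive = false; return this; }
    RangeSketch lte(double v) { upper = v; upperInclusive = true;  return this; }
    RangeSketch lt(double v)  { upper = v; upperInclusive = false; return this; }

    /** Returns true when the value satisfies every bound that was set. */
    boolean matches(double value) {
        if (lower != null && (lowerInclusive ? value < lower : value <= lower)) {
            return false;
        }
        if (upper != null && (upperInclusive ? value > upper : value >= upper)) {
            return false;
        }
        return true;
    }
}
```

With `new RangeSketch().gte(1000).lte(2000)`, both 1000 and 2000 match; swapping in `gt` and `lt` excludes the two endpoints.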
Geo queries:
// geo_bounding_box query: matches points that fall inside a rectangle
GET /indexName/_search
{
  "query": {
    "geo_bounding_box": {
      "FIELD": {
        "top_left": { // top-left corner
          "lat": 31.1,
          "lon": 121.5
        },
        "bottom_right": { // bottom-right corner
          "lat": 30.9,
          "lon": 121.7
        }
      }
    }
  }
}
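The rectangle test behind geo_bounding_box can be sketched in a few lines of plain Java. `BoundingBoxSketch` is a hypothetical helper for illustration only, and it assumes the box does not cross the 180° meridian.

```java
/** Hypothetical illustration of the geo_bounding_box test; not an Elasticsearch class. */
final class BoundingBoxSketch {
    /**
     * Returns true when (lat, lon) lies inside the rectangle given by its
     * top-left and bottom-right corners. Assumes the box does not wrap
     * around the 180-degree meridian.
     */
    static boolean contains(double topLeftLat, double topLeftLon,
                            double bottomRightLat, double bottomRightLon,
                            double lat, double lon) {
        // Latitude decreases from the top edge to the bottom edge
        boolean latInside = lat <= topLeftLat && lat >= bottomRightLat;
        // Longitude increases from the left edge to the right edge
        boolean lonInside = lon >= topLeftLon && lon <= bottomRightLon;
        return latInside && lonInside;
    }
}
```

Using the corners from the DSL above, the point (31.0, 121.6) is inside the box, while (31.2, 121.6) lies north of it.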
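Nearby search: geo_distance matches documents within a given distance of a center point; this is the more commonly used geo query.
// Sort by distance from the user's location
String location = params.getLocation();
if (location != null && !location.isEmpty()) {
    request.source().sort(SortBuilders
            .geoDistanceSort("location", new GeoPoint(location))
            .order(SortOrder.ASC)
            .unit(DistanceUnit.KILOMETERS)
    );
}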
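Both the geo_distance query and the distance sort depend on the distance between two coordinates. As a rough sketch of that calculation, the haversine formula below gives the great-circle distance in kilometres. `GeoDistanceSketch` is hypothetical, the Earth radius is an assumed mean value, and Elasticsearch's own calculation may differ slightly.

```java
/** Hypothetical illustration of a great-circle distance check; not an Elasticsearch class. */
final class GeoDistanceSketch {
    // Assumed mean Earth radius in kilometres
    static final double EARTH_RADIUS_KM = 6371.0;

    /** Haversine distance in kilometres between two points given in degrees. */
    static double haversineKm(double lat1, double lon1, double lat2, double lon2) {
        double dLat = Math.toRadians(lat2 - lat1);
        double dLon = Math.toRadians(lon2 - lon1);
        double a = Math.sin(dLat / 2) * Math.sin(dLat / 2)
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2))
                * Math.sin(dLon / 2) * Math.sin(dLon / 2);
        return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a));
    }

    /** Returns true when the point lies within radiusKm of the center. */
    static boolean withinKm(double centerLat, double centerLon,
                            double lat, double lon, double radiusKm) {
        return haversineKm(centerLat, centerLon, lat, lon) <= radiusKm;
    }
}
```

One degree of latitude comes out at roughly 111 km with this radius, which is a handy sanity check when choosing a search radius.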
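Compound queries (function_score / bool):
function_score: the condition attached to each scoring function must be a filter.
- Original query: the query part. Documents are searched by this condition and scored with the BM25 algorithm, which gives the original score (query score).
- Filter: the filter part. Only documents that match it are rescored.
- Scoring function: documents that match the filter are run through this function to produce the function score. Four functions are covered here:
  - weight: the result is a constant
  - field_value_factor: the result is the value of a field in the document
  - random_score: the result is a random number
  - script_score: a custom scoring script
- Boost mode: how the function score and the original relevance score are combined, including:
  - multiply: multiply the two
  - replace: replace the query score with the function score
  - others, such as sum, avg, max, min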
# function_score: scoring function example
GET /hotel/_search
{
  "query": {
    "function_score": {
      "query": {
        "term": {
          "name": {
            "value": "如家"
          }
        }
      },
      "functions": [
        {
          "filter": {
            "term": {
              "city": "上海"
            }
          },
          "weight": 10
        }
      ],
      "boost_mode": "multiply"
    }
  }
}
// function_score
FunctionScoreQueryBuilder functionScoreQuery = QueryBuilders.functionScoreQuery(
        // Original query, which produces the relevance score
        boolQuery,
        // Array of scoring functions
        new FunctionScoreQueryBuilder.FilterFunctionBuilder[]{
                // One scoring function element
                new FunctionScoreQueryBuilder.FilterFunctionBuilder(
                        // Filter condition
                        QueryBuilders.termQuery("isAD", true),
                        // Scoring function
                        ScoreFunctionBuilders.weightFactorFunction(10)
                )
        }
);
// boolQuery already holds the other conditions that were added earlier.
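function_score execution flow:
1. Search documents with the original query and compute a relevance score for each one; this is the original score (query score).
2. Check each document against the filter condition.
3. For documents that match the filter, apply the scoring function to get the function score.
4. Combine the original score and the function score according to the boost mode to get the final score; documents with higher final scores rank first.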
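The arithmetic in steps 3 and 4 can be sketched in plain Java for a single weight function. `FunctionScoreSketch` is a hypothetical illustration, not Elasticsearch code. As a simplification, a document that does not match the filter simply keeps its query score here.

```java
/** Hypothetical illustration of how boost_mode combines the two scores. */
final class FunctionScoreSketch {
    /** Combines the query score and the function score according to the boost mode. */
    static double combine(double queryScore, double functionScore, String boostMode) {
        switch (boostMode) {
            case "multiply": return queryScore * functionScore;
            case "replace":  return functionScore;
            case "sum":      return queryScore + functionScore;
            case "avg":      return (queryScore + functionScore) / 2;
            case "max":      return Math.max(queryScore, functionScore);
            case "min":      return Math.min(queryScore, functionScore);
            default:
                throw new IllegalArgumentException("Unknown boost_mode: " + boostMode);
        }
    }

    /**
     * Final score for a single weight function.
     * Simplification: a document that misses the filter keeps its query score.
     */
    static double finalScore(double queryScore, boolean matchesFilter,
                             double weight, String boostMode) {
        if (!matchesFilter) {
            return queryScore;
        }
        return combine(queryScore, weight, boostMode);
    }
}
```

For example, a document with a query score of 2.0 that matches the filter ends up at 20.0 with weight 10 and boost mode multiply, which is why promoted documents rise to the top of the results.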
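Bool query: combining multiple conditions
Clause types:
- must: every sub-query must match, similar to AND
- should: sub-queries are optional matches, similar to OR
- must_not: must not match; does not contribute to the score, similar to NOT
- filter: must match; does not contribute to the score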
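The matching logic of the four clause types can be sketched with plain Java predicates. `BoolSketch` and its `term` helper are hypothetical, and the sketch models only whether a document matches, not its score. It also reflects one detail of the default behaviour: should clauses are optional when a must or filter clause is present, but at least one must match when they stand alone.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical illustration of bool-query matching; not an Elasticsearch class. */
final class BoolSketch {
    private final List<Predicate<Map<String, Object>>> must;
    private final List<Predicate<Map<String, Object>>> should;
    private final List<Predicate<Map<String, Object>>> mustNot;
    private final List<Predicate<Map<String, Object>>> filter;

    BoolSketch(List<Predicate<Map<String, Object>>> must,
               List<Predicate<Map<String, Object>>> should,
               List<Predicate<Map<String, Object>>> mustNot,
               List<Predicate<Map<String, Object>>> filter) {
        this.must = must;
        this.should = should;
        this.mustNot = mustNot;
        this.filter = filter;
    }

    /** Exact-match predicate on one field, loosely mirroring a term query. */
    static Predicate<Map<String, Object>> term(String field, Object value) {
        return doc -> value.equals(doc.get(field));
    }

    boolean matches(Map<String, Object> doc) {
        // must and filter: every clause has to match
        for (Predicate<Map<String, Object>> p : must) {
            if (!p.test(doc)) return false;
        }
        for (Predicate<Map<String, Object>> p : filter) {
            if (!p.test(doc)) return false;
        }
        // must_not: no clause may match
        for (Predicate<Map<String, Object>> p : mustNot) {
            if (p.test(doc)) return false;
        }
        // should: required only when there is no must or filter clause
        if (must.isEmpty() && filter.isEmpty() && !should.isEmpty()) {
            return should.stream().anyMatch(p -> p.test(doc));
        }
        return true;
    }
}
```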
DSL:
GET /hotel/_search
{
  "query": {
    "bool": {
      "must": [
        { "term": { "city": "上海" } }
      ],
      "should": [
        { "term": { "brand": "皇冠假日" } },
        { "term": { "brand": "华美达" } }
      ],
      "must_not": [
        { "range": { "price": { "lte": 500 } } }
      ],
      "filter": [
        { "range": { "score": { "gte": 45 } } }
      ]
    }
  }
}
Java (this test uses a different set of conditions from the DSL above):
```java
@Test
void testBool() throws IOException {
    // 1. Create the request
    SearchRequest request = new SearchRequest("hotel");
    // 2. Build the bool query
    request.source().query(QueryBuilders.boolQuery()
            .must(QueryBuilders.termQuery("name", "如家"))
            .mustNot(QueryBuilders.termQuery("city", "上海"))
            .should(QueryBuilders.termQuery("brand", "皇冠假日"))
            .should(QueryBuilders.termQuery("brand", "华美达"))
    );
    // 3. Send the request
    SearchResponse response = client.search(request, RequestOptions.DEFAULT);
    // 4. Parse the response
    handleResponse(response);
}
```
### Sorting / pagination
#### Sorting
desc: descending; asc: ascending.

![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652060923032-d2731b67-40da-4dc3-99c5-1526e80ec617.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=228&id=u59d52356&margin=%5Bobject%20Object%5D&name=image.png&originHeight=313&originWidth=837&originalType=binary&ratio=1&rotation=0&showTitle=false&size=24796&status=done&style=none&taskId=u6e2f33cb-d031-4d4c-adb0-702eb1086ff&title=&width=608.7272727272727)

Sorting by geo distance:

![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061014227-d410c104-1bf5-47ed-a72a-57460b7ff5df.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=321&id=u36c91781&margin=%5Bobject%20Object%5D&name=image.png&originHeight=441&originWidth=970&originalType=binary&ratio=1&rotation=0&showTitle=false&size=74646&status=done&style=none&taskId=u40edd744-a990-43f1-9199-d954a3d5533&title=&width=705.4545454545455)
#### Pagination
![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061038258-fdd1e6bf-40b6-4a6e-be92-fabc49afccca.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=228&id=u42c1f48c&margin=%5Bobject%20Object%5D&name=image.png&originHeight=314&originWidth=688&originalType=binary&ratio=1&rotation=0&showTitle=false&size=26079&status=done&style=none&taskId=u0ac8bdae-aed0-46f7-9c78-ef72ed32ee1&title=&width=500.3636363636364)

Deep pagination (for awareness):

Common pagination approaches and their trade-offs:
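- from + size:
  - Pros: supports jumping to an arbitrary page
  - Cons: deep pagination problem; by default from + size cannot exceed 10000
  - Use case: search with random page access, such as Baidu, JD.com, Google, and Taobao
- search_after:
  - Pros: no upper limit on total results (the size of a single query must not exceed 10000)
  - Cons: can only page forward one page at a time; no random page access
  - Use case: search without random page access, such as scrolling down on a phone
- scroll:
  - Pros: no upper limit on total results (the size of a single query must not exceed 10000)
  - Cons: extra memory overhead, and the results are not real-time
  - Use case: fetching or migrating large volumes of data. Not recommended since ES 7.1; search_after is the suggested alternative.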
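For the from + size approach, the offset arithmetic and the default 10000 limit can be sketched as follows. `PagingSketch` is a hypothetical helper; the constant reflects the default limit mentioned above and would need adjusting if the index setting has been changed.

```java
/** Hypothetical helper for from + size pagination; not an Elasticsearch class. */
final class PagingSketch {
    // Default upper bound on from + size, as described above
    static final int MAX_RESULT_WINDOW = 10_000;

    /** Converts a 1-based page number into the "from" offset. */
    static int fromOffset(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        return Math.multiplyExact(page - 1, size);
    }

    /** Returns true when the requested page stays within the result window. */
    static boolean withinWindow(int page, int size) {
        return (long) fromOffset(page, size) + size <= MAX_RESULT_WINDOW;
    }
}
```

The computed offset would then be passed to the search source, for example `request.source().from(PagingSketch.fromOffset(page, size)).size(size)`. With a page size of 10, page 1000 is the last page that fits within the window.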
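### Highlighting
- Highlighting marks up the search keywords, so **the query must contain keywords**; it cannot be something like a range query.
- By default, **the highlighted field must be the same as the field being searched**; otherwise nothing is highlighted.
  - To highlight a field other than the searched one, add the option require_field_match=false.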
![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061135052-fa4a1e48-c079-400d-a4e7-9e8c7b31f562.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=332&id=u5d3f8018&margin=%5Bobject%20Object%5D&name=image.png&originHeight=457&originWidth=1027&originalType=binary&ratio=1&rotation=0&showTitle=false&size=42186&status=done&style=none&taskId=u831d9f56-882a-4f4d-b9bb-856fd112d9d&title=&width=746.9090909090909)
### Handling the response in code: parse it layer by layer
Each level of the DSL response maps to a corresponding call in the code.

![image.png](https://cdn.nlark.com/yuque/0/2022/png/27094793/1652061279437-296572ea-deb7-498d-a251-eeb394bfc8e7.png#clientId=u2d11328c-52a7-4&crop=0&crop=0&crop=1&crop=1&from=paste&height=455&id=ud0161a42&margin=%5Bobject%20Object%5D&name=image.png&originHeight=625&originWidth=440&originalType=binary&ratio=1&rotation=0&showTitle=false&size=43373&status=done&style=none&taskId=u909160fe-82b7-47e9-8c48-110e2510a2b&title=&width=320)
```java
private PageResult handleResponse(SearchResponse response) {
    // Parse the hits
    SearchHits searchHits = response.getHits();
    // Total number of matching documents
    long total = searchHits.getTotalHits().value;
    // Array of hits on this page
    SearchHit[] hits = searchHits.getHits();
    List<HotelDoc> hotelDocList = new ArrayList<>();
    for (SearchHit hit : hits) {
        // Hotel document as a JSON string
        String hitSourceAsString = hit.getSourceAsString();
        // Deserialize into a HotelDoc
        HotelDoc hotelDoc = JSON.parseObject(hitSourceAsString, HotelDoc.class);
        // Distance: the first sort value (assumes the geo-distance sort comes first)
        Object[] sortValues = hit.getSortValues();
        if (sortValues != null && sortValues.length > 0) {
            hotelDoc.setDistance(sortValues[0]);
        }
        hotelDocList.add(hotelDoc);
    }
    return new PageResult(total, hotelDocList);
}
```
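Highlight in the DSL response:
Within each hit, highlight sits at the same level as _source.
// Parse the highlight result (inside the loop over hits, after hotelDoc is created)
Map<String, HighlightField> highMap = hit.getHighlightFields();
if (!CollectionUtils.isEmpty(highMap)) {
    HighlightField name = highMap.get("name");
    if (name != null) {
        // Replace the name with the first highlighted fragment
        hotelDoc.setName(name.getFragments()[0].string());
    }
}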
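The fallback rule used above (take the highlighted fragment when there is one, otherwise keep the original value) can be isolated into a small null-safe helper. This is a plain-Java sketch: `HighlightSketch` is hypothetical, and the highlight data is modelled as a simple map of field name to fragments rather than the client's `HighlightField` type.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the highlight fallback; not an Elasticsearch class. */
final class HighlightSketch {
    /**
     * Returns the first highlighted fragment for the field,
     * or the original value when no fragment is available.
     */
    static String preferHighlight(Map<String, List<String>> highlights,
                                  String field, String original) {
        if (highlights == null) {
            return original;
        }
        List<String> fragments = highlights.get(field);
        if (fragments == null || fragments.isEmpty()) {
            return original;
        }
        return fragments.get(0);
    }
}
```

Checking for an empty fragment list as well as a missing field avoids an index-out-of-bounds error when a hit carries a highlight entry without any fragments.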