1. _cat

No.  Command                                   Description
1    GET http://localhost:9200/_cat            List every command available under _cat
2    GET http://localhost:9200/_cat/nodes      List all nodes
3    GET http://localhost:9200/_cat/health     Show cluster health
4    GET http://localhost:9200/_cat/master     Show the elected master node
5    GET http://localhost:9200/_cat/indices    List all indices

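2. PUT/POST: adding data

POST: if no id is given, Elasticsearch generates one automatically. If an id is given, the document with that id is created or overwritten, and its version number is incremented on each overwrite.
PUT: an id is required, so PUT is normally used for updates; a request without an id returns an error.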


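3. Optimistic locking for updates

_seq_no: concurrency-control field; a document receives a new, higher value on every write.
_primary_term: also used for concurrency control; it changes when the primary shard is reassigned, for example after a restart.
Send both values with an update to get optimistic locking: ?if_seq_no=0&if_primary_term=1
Read the document before updating it, then send its current _seq_no and _primary_term with the update. If two operations try to modify the same document at the same time, the first one to complete changes the document's _seq_no, so the values sent by the second one no longer match and it fails with status 409.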
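
The read-then-conditionally-write flow described above can be sketched with a small in-memory model. This is only an illustration of the compare-and-fail logic, not Elasticsearch code: the class OptimisticLockSketch and its methods are hypothetical names, the status codes are simplified to 200 and 409, and the primary term is fixed at 1.

```java
import java.util.HashMap;
import java.util.Map;

// In-memory model of the if_seq_no / if_primary_term check; not Elasticsearch code.
class OptimisticLockSketch {
    /** A stored document together with its concurrency-control values. */
    static final class Doc {
        final String source;
        final long seqNo;
        final long primaryTerm;

        Doc(String source, long seqNo, long primaryTerm) {
            this.source = source;
            this.seqNo = seqNo;
            this.primaryTerm = primaryTerm;
        }
    }

    private final Map<String, Doc> docs = new HashMap<>();
    private long nextSeqNo = 0;          // every write gets a new, higher value
    private final long primaryTerm = 1;  // fixed: primary reassignment is not modeled

    /** Unconditional write: always succeeds. */
    int index(String id, String source) {
        docs.put(id, new Doc(source, nextSeqNo++, primaryTerm));
        return 200;
    }

    Doc get(String id) {
        return docs.get(id);
    }

    /** Conditional write: 409 unless both expected values match the stored document. */
    int indexIf(String id, String source, long ifSeqNo, long ifPrimaryTerm) {
        Doc current = docs.get(id);
        if (current == null || current.seqNo != ifSeqNo || current.primaryTerm != ifPrimaryTerm) {
            return 409;
        }
        return index(id, source);
    }

    /** Query string that a real conditional request would carry. */
    static String conditionalParams(long seqNo, long primaryTerm) {
        return "?if_seq_no=" + seqNo + "&if_primary_term=" + primaryTerm;
    }

    public static void main(String[] args) {
        OptimisticLockSketch store = new OptimisticLockSketch();
        store.index("1", "{\"name\":\"a\"}");
        Doc read = store.get("1"); // two writers both read this state
        System.out.println(store.indexIf("1", "{\"name\":\"b\"}", read.seqNo, read.primaryTerm));
        System.out.println(store.indexIf("1", "{\"name\":\"c\"}", read.seqNo, read.primaryTerm));
        System.out.println(conditionalParams(read.seqNo, read.primaryTerm));
    }
}
```

The second writer reuses the values it read earlier, so its write is rejected; a real client would re-read the document and retry.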


4. Query DSL

4.1 _bulk: bulk import

Sample data is provided by Elastic: https://raw.githubusercontent.com/elastic/elasticsearch/master/docs/src/test/resources/accounts.json
Each document takes two lines in the request body: an action line followed by the document source. Only one action/source pair is shown here; the full file continues in the same pattern.

    curl -X POST "localhost:9200/bank/_bulk?pretty&refresh" -H 'Content-Type: application/json' -d'
    {"index":{"_id":"6"}}
    {"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
    '
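
The two-lines-per-document layout of a _bulk body can be sketched as a small builder. This is a minimal sketch under stated assumptions: BulkBodySketch and bulkBody are hypothetical names, ids are assumed to need no JSON escaping, and each source string is assumed to be a single-line JSON object.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BulkBodySketch {
    /**
     * Builds a _bulk request body: one action line and one source line per
     * document, each terminated by a newline (including the last one).
     */
    static String bulkBody(Map<String, String> sourceById) {
        StringBuilder body = new StringBuilder();
        for (Map.Entry<String, String> entry : sourceById.entrySet()) {
            body.append("{\"index\":{\"_id\":\"").append(entry.getKey()).append("\"}}\n");
            body.append(entry.getValue()).append('\n');
        }
        return body.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("6", "{\"account_number\":6,\"balance\":5686}");
        System.out.print(bulkBody(docs));
    }
}
```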

4.2 match_all: match every document

The simplest query: it matches all documents and gives each of them a score of 1.0.
_source: selects which fields are returned.

    curl -XGET "http://localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "match_all": {}
      },
      "sort": [
        {
          "balance": {
            "order": "desc"
          }
        }
      ],
      "from": 0,
      "size": 20,
      "_source": ["balance", "firstname"]
    }'


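4.3 match: full-text search

On text fields, match performs a full-text search: the query string is analyzed into terms, and a document matches when it contains any of those terms as a whole token. It is not a substring search, so a query for mill does not return documents that only contain miller. On numeric fields, match is an exact match.

    curl -XGET "http://localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
    {
      "query": {
        "match": {
          "address": "Miller"
        }
      }
    }'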


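4.4 match_phrase: phrase matching

The query string is still analyzed, but the resulting terms must all appear in the field, adjacent and in the same order, so the query is treated as one phrase rather than as independent words.

Phrase match on "address": "Miller Place":

    GET bank/_search
    {
      "query": {
        "match_phrase": {
          "address": "Miller Place"
        }
      }
    }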


To match a value with no analysis at all, query the field's .keyword sub-field instead:

    GET bank/_search
    {
      "query": {
        "match": {
          "address.keyword": "666 Miller Place"
        }
      }
    }


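Difference between match_phrase and *.keyword
match_phrase: the terms of the query must occur as a consecutive phrase somewhere in the field, roughly like a %value% search at the word level.
.keyword: the whole field value must equal the query value, like = value.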
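
The difference between the three behaviours can be illustrated with a toy model. This is a deliberately simplified sketch, not how Elasticsearch is implemented: it assumes an analyzer that only lowercases and splits on whitespace, ignores scoring, and uses hypothetical names (MatchSemanticsSketch, match, matchPhrase, keyword).

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class MatchSemanticsSketch {
    /** Toy analyzer: lowercase, then split on whitespace. */
    static List<String> tokens(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        if (trimmed.isEmpty()) {
            return Collections.<String>emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** match: at least one query token equals a whole field token. */
    static boolean match(String field, String query) {
        List<String> fieldTokens = tokens(field);
        for (String queryToken : tokens(query)) {
            if (fieldTokens.contains(queryToken)) {
                return true;
            }
        }
        return false;
    }

    /** match_phrase: all query tokens appear adjacent and in order. */
    static boolean matchPhrase(String field, String query) {
        List<String> queryTokens = tokens(query);
        return !queryTokens.isEmpty()
                && Collections.indexOfSubList(tokens(field), queryTokens) >= 0;
    }

    /** .keyword: the raw field value equals the raw query value. */
    static boolean keyword(String field, String query) {
        return field.equals(query);
    }

    public static void main(String[] args) {
        String address = "666 Miller Place";
        System.out.println(match(address, "mill"));               // no whole-token match
        System.out.println(matchPhrase(address, "Miller Place")); // consecutive tokens
        System.out.println(keyword(address, "Miller Place"));     // not the whole value
    }
}
```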

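4.5 multi_match: match across multiple fields

The query string is analyzed, and the resulting terms are matched against each of the listed fields.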
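
As a rough illustration, a multi_match request body could be assembled as below. The field names address and city come from the sample account data, but querying exactly these two fields is an assumption made for the example; MultiMatchSketch and multiMatchBody are hypothetical names, and the inputs are assumed to need no JSON escaping.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MultiMatchSketch {
    /** Builds a search body that runs one query string against several fields. */
    static String multiMatchBody(String query, List<String> fields) {
        String fieldList = fields.stream()
                .map(field -> "\"" + field + "\"")
                .collect(Collectors.joining(","));
        return "{\"query\":{\"multi_match\":{\"query\":\"" + query
                + "\",\"fields\":[" + fieldList + "]}}}";
    }

    public static void main(String[] args) {
        // Body for: GET bank/_search
        System.out.println(multiMatchBody("mill", Arrays.asList("address", "city")));
    }
}
```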

4.6 bool: compound queries

Combines several query clauses.
must: the clause must match; matching contributes to the score.
must_not: the clause must not match; it does not affect the score.
should: the clause is optional; documents that match it score higher.

    # Compound query
    GET bank/_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "gender": "M" } },
            { "match": { "address": "mill" } }
          ],
          "must_not": [
            { "match": { "age": "18" } }
          ],
          "should": [
            { "match": { "lastname": "Wallace" } }
          ]
        }
      }
    }


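4.7 bool filter: filtered queries

Clauses under filter must match but do not contribute to the score. Use filter for conditions on fields that should not influence relevance, or to narrow down the documents a query returns. filter is a clause of the bool query, at the same level as must, must_not and should.
Hierarchy:
bool -> must, must_not, should, filter
each clause -> match, match_all, multi_match, match_phrase, range, ...

    # Filter query
    GET bank/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "age": {
                "gte": 10,
                "lte": 20
              }
            }
          }
        }
      }
    }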


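4.8 term query

Like match, term matches the value of a field, but the query value is not analyzed. As a convention, use match for full-text (text) fields and term for non-text fields such as numbers and keywords.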
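
A term query on a non-text field could look like the body built below. Using the numeric age field is an assumption chosen to fit the sample data; TermQuerySketch and termBody are hypothetical names.

```java
class TermQuerySketch {
    /** Builds a search body with a term query on a numeric field. */
    static String termBody(String field, long value) {
        return "{\"query\":{\"term\":{\"" + field + "\":{\"value\":" + value + "}}}}";
    }

    public static void main(String[] args) {
        // Body for: GET bank/_search
        System.out.println(termBody("age", 28));
    }
}
```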

4.9 aggregations

Aggregations group data, much like GROUP BY. One request can contain several aggregations, such as buckets and averages.
To return only the aggregation results and no matching documents, add "size": 0.

    GET bank/_search
    {
      "query": {
        "match": {
          "address": "Miller"
        }
      },
      "aggs": {
        "ageAgg": {
          "terms": {
            "field": "age",
            "size": 10
          }
        },
        "ageAvg": {
          "avg": {
            "field": "age"
          }
        },
        "balanceAvg": {
          "avg": {
            "field": "balance"
          }
        }
      },
      "size": 0
    }


Nested aggregations

    # Bucket by age and compute the average balance within each age bucket
    GET bank/_search
    {
      "query": {
        "match_all": {}
      },
      "aggs": {
        "ageAgg": {
          "terms": {
            "field": "age"
          },
          "aggs": {
            "ageAvg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      },
      "size": 0
    }


    # Bucket by age; within each age bucket compute the average balance for M,
    # the average balance for F, and the overall average balance
    GET bank/_search
    {
      "query": {
        "match_all": {}
      },
      "aggs": {
        "ageAgg": {
          "terms": {
            "field": "age"
          },
          "aggs": {
            "genderAgg": {
              "terms": {
                "field": "gender.keyword"
              },
              "aggs": {
                "balanceAvg": {
                  "avg": {
                    "field": "balance"
                  }
                }
              }
            },
            "totalAgg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      },
      "size": 0
    }



5. Mapping

When data is indexed, Elasticsearch automatically assigns a type to each field based on the data, roughly:
text -> text
whole numbers -> long
Commonly used field types: https://www.elastic.co/guide/en/elasticsearch/reference/7.4/mapping-types.html

String:   text and keyword
Numeric:  long, integer, short, byte, double, float, half_float, scaled_float
Date:     date
Boolean:  boolean
Object:   object, for single JSON objects
Nested:   nested, for arrays of JSON objects
IP:       ip, for IPv4 and IPv6 addresses

The keyword type means the field is not analyzed. lastname has type text, so it is analyzed, but its fields section defines a sub-field named keyword of type keyword, so queries against lastname.keyword are not analyzed.

    PUT /my-index
    {
      "mappings": {
        "properties": {
          "age": { "type": "integer" },
          "email": { "type": "keyword" },
          "name": { "type": "text" },
          "lastname": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }

View the mapping of an index:
GET my-index/_mapping


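6. Adding a new field to a mapping

The index option controls whether field values are indexed. It accepts true or false and defaults to true. Fields that are not indexed cannot be queried.

    PUT /my-index/_mapping
    {
      "properties": {
        "employee-id": {
          "type": "text",
          "index": false
        }
      }
    }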



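7. IK analyzer

Installation
https://github.com/medcl/elasticsearch-analysis-ik/releases
Download the release that matches your Elasticsearch version, unzip it under elasticsearch-7.4.2/plugins (the directory can be renamed to ik), and start Elasticsearch.
Verify that the plugin is installed:
http://localhost:9200/_cat/plugins
Test the analyzer:

    GET _analyze
    {
      "analyzer": "ik_smart",
      "text": "我是中国人"
    }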

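Hot-updating the IK dictionary
The plugin supports hot updates of its dictionary through the following entries in the IK configuration file:

    <!-- Remote extension dictionary -->
    <entry key="remote_ext_dict">location</entry>
    <!-- Remote extension stop-word dictionary -->
    <entry key="remote_ext_stopwords">location</entry>

Here location is a URL, for example http://yoursite.com/getCustomDict. The request only has to satisfy two requirements for hot updates to work:
1. The HTTP response must carry two headers, Last-Modified and ETag. Both are strings; whenever either one changes, the plugin fetches the word list again and updates its dictionary.
2. The response body must contain one word per line, separated by \n.
With both requirements met, the dictionary is hot-updated without restarting the ES instance.
You can keep the words to be updated in a UTF-8 encoded .txt file served by nginx or any other simple HTTP server. When the .txt file is modified, the HTTP server automatically returns the corresponding Last-Modified and ETag when a client requests the file. A separate tool can extract relevant vocabulary from your business systems and update this .txt file.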
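
The two requirements can be pictured from the polling side with the sketch below. It is not the plugin's actual implementation: RemoteDictWatcherSketch, shouldReload and parseWords are hypothetical names, and the HTTP request itself is left out so that only the change-detection and parsing logic is shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class RemoteDictWatcherSketch {
    private String lastModified; // header values seen on the previous poll
    private String eTag;

    /** Returns true, and remembers the new values, when either header changed. */
    boolean shouldReload(String newLastModified, String newETag) {
        boolean changed = !Objects.equals(lastModified, newLastModified)
                || !Objects.equals(eTag, newETag);
        if (changed) {
            lastModified = newLastModified;
            eTag = newETag;
        }
        return changed;
    }

    /** Parses a response body that holds one word per line, separated by \n. */
    static List<String> parseWords(String body) {
        List<String> words = new ArrayList<>();
        for (String line : body.split("\n")) {
            String word = line.trim();
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        RemoteDictWatcherSketch watcher = new RemoteDictWatcherSketch();
        System.out.println(watcher.shouldReload("Mon, 01 Jan 2024 00:00:00 GMT", "\"v1\""));
        System.out.println(watcher.shouldReload("Mon, 01 Jan 2024 00:00:00 GMT", "\"v1\""));
        System.out.println(watcher.shouldReload("Mon, 01 Jan 2024 00:00:00 GMT", "\"v2\""));
        System.out.println(parseWords("foo\nbar\n"));
    }
}
```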

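FAQ:

Q: Why does my custom dictionary not take effect?
A: Make sure the extension dictionary file is UTF-8 encoded.

Q: What is the difference between ik_max_word and ik_smart?
A: ik_max_word splits text at the finest granularity. For example, "中华人民共和国国歌" is split into "中华人民共和国,中华人民,中华,华人,人民共和国,人民,人,民,共和国,共和,和,国国,国歌". It exhausts every possible combination and suits Term Query.
ik_smart splits text at the coarsest granularity. For example, "中华人民共和国国歌" is split into "中华人民共和国,国歌". It suits Phrase queries.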


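8. Integrating the high-level REST client with Spring Boot

  • Maven dependency

    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.4.2</version>
    </dependency>

  • Create the client and register it in the Spring container

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class ElasticSearchClient {
        // Single shared client pointing at the local node
        @Bean
        public RestHighLevelClient esRestClient() {
            return new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")));
        }
    }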

9. Saving data

1. Add RequestOptions to the configuration class

    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.springframework.context.annotation.Bean;
    import org.springframework.context.annotation.Configuration;

    @Configuration
    public class ElasticSearchClient {
        // Request options shared by all calls
        public static final RequestOptions COMMON_OPTIONS;

        static {
            RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
            // builder.addHeader("Authorization", "Bearer " + TOKEN);
            // builder.setHttpAsyncResponseConsumerFactory(
            //     new HttpAsyncResponseConsumerFactory
            //         .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));
            COMMON_OPTIONS = builder.build();
        }

        @Bean
        public RestHighLevelClient esRestClient() {
            return new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")));
        }
    }

2. Index a document
https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.4/java-rest-high-document-index.html

    @Autowired
    private RestHighLevelClient restClient;

    @Test
    public void index() throws IOException {
        // Target index "users", document id "1"
        IndexRequest request = new IndexRequest("users");
        request.id("1");
        User user = new User();
        user.setAge(18);
        user.setName("wanter");
        user.setGender("男");
        // Serialize the object to JSON (the JSON helper looks like fastjson; this is an assumption)
        String jsonString = JSON.toJSONString(user);
        request.source(jsonString, XContentType.JSON);
        // Execute synchronously
        IndexResponse indexResponse = restClient.index(request, RequestOptions.DEFAULT);
        System.out.println(indexResponse);
    }

Output: IndexResponse[index=users,type=_doc,id=1,version=1,result=created,seqNo=0,primaryTerm=1,shards={"total":2,"successful":1,"failed":0}]


10. Searching and parsing the response

https://www.elastic.co/guide/en/elasticsearch/client/java-rest/7.4/java-rest-high-search.html
Goal: match address against mill, bucket the results by age, and for each age bucket report the gender distribution and the average balance.

DSL:

    {
      "size": 10,
      "query": {
        "match": {
          "address": {
            "query": "mill",
            "operator": "OR",
            "prefix_length": 0,
            "max_expansions": 50,
            "fuzzy_transpositions": true,
            "lenient": false,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": true,
            "boost": 1
          }
        }
      },
      "aggregations": {
        "ageAgg": {
          "terms": {
            "field": "age",
            "size": 10,
            "min_doc_count": 1,
            "shard_min_doc_count": 0,
            "show_term_doc_count_error": false,
            "order": [
              { "_count": "desc" },
              { "_key": "asc" }
            ]
          },
          "aggregations": {
            "genderAgg": {
              "terms": {
                "field": "gender.keyword",
                "size": 10,
                "min_doc_count": 1,
                "shard_min_doc_count": 0,
                "show_term_doc_count_error": false,
                "order": [
                  { "_count": "desc" },
                  { "_key": "asc" }
                ]
              }
            },
            "totalAgg": {
              "avg": {
                "field": "balance"
              }
            }
          }
        }
      }
    }

Result:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": {
          "value": 4,
          "relation": "eq"
        },
        "max_score": 5.4032025,
        "hits": [
          {
            "_index": "bank",
            "_type": "_doc",
            "_id": "970",
            "_score": 5.4032025,
            "_source": {
              "account_number": 970,
              "balance": 19648,
              "firstname": "Forbes",
              "lastname": "Wallace",
              "age": 28,
              "gender": "M",
              "address": "990 Mill Road",
              "employer": "Pheast",
              "email": "forbeswallace@pheast.com",
              "city": "Lopezo",
              "state": "AK"
            }
          },
          {
            "_index": "bank",
            "_type": "_doc",
            "_id": "136",
            "_score": 5.4032025,
            "_source": {
              "account_number": 136,
              "balance": 45801,
              "firstname": "Winnie",
              "lastname": "Holland",
              "age": 38,
              "gender": "M",
              "address": "198 Mill Lane",
              "employer": "Neteria",
              "email": "winnieholland@neteria.com",
              "city": "Urie",
              "state": "IL"
            }
          },
          {
            "_index": "bank",
            "_type": "_doc",
            "_id": "345",
            "_score": 5.4032025,
            "_source": {
              "account_number": 345,
              "balance": 9812,
              "firstname": "Parker",
              "lastname": "Hines",
              "age": 38,
              "gender": "M",
              "address": "715 Mill Avenue",
              "employer": "Baluba",
              "email": "parkerhines@baluba.com",
              "city": "Blackgum",
              "state": "KY"
            }
          },
          {
            "_index": "bank",
            "_type": "_doc",
            "_id": "472",
            "_score": 5.4032025,
            "_source": {
              "account_number": 472,
              "balance": 25571,
              "firstname": "Lee",
              "lastname": "Long",
              "age": 32,
              "gender": "F",
              "address": "288 Mill Street",
              "employer": "Comverges",
              "email": "leelong@comverges.com",
              "city": "Movico",
              "state": "MT"
            }
          }
        ]
      },
      "aggregations": {
        "lterms#ageAgg": {
          "doc_count_error_upper_bound": 0,
          "sum_other_doc_count": 0,
          "buckets": [
            {
              "key": 38,
              "doc_count": 2,
              "sterms#genderAgg": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                  {
                    "key": "M",
                    "doc_count": 2
                  }
                ]
              },
              "avg#totalAgg": {
                "value": 27806.5
              }
            },
            {
              "key": 28,
              "doc_count": 1,
              "sterms#genderAgg": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                  {
                    "key": "M",
                    "doc_count": 1
                  }
                ]
              },
              "avg#totalAgg": {
                "value": 19648
              }
            },
            {
              "key": 32,
              "doc_count": 1,
              "sterms#genderAgg": {
                "doc_count_error_upper_bound": 0,
                "sum_other_doc_count": 0,
                "buckets": [
                  {
                    "key": "F",
                    "doc_count": 1
                  }
                ]
              },
              "avg#totalAgg": {
                "value": 25571
              }
            }
          ]
        }
      }
    }

Java code for the same search:

    @Autowired
    private RestHighLevelClient restClient;

    @Test
    public void search() throws IOException {
        SearchRequest searchRequest = new SearchRequest();
        // Target index
        searchRequest.indices("bank");
        SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
        // Build the query and the aggregations
        sourceBuilder.query(QueryBuilders.matchQuery("address", "mill"));
        TermsAggregationBuilder ageAgg = AggregationBuilders.terms("ageAgg").field("age");
        ageAgg.subAggregation(AggregationBuilders.terms("genderAgg").field("gender.keyword"));
        ageAgg.subAggregation(AggregationBuilders.avg("totalAgg").field("balance"));
        sourceBuilder.aggregation(ageAgg);
        sourceBuilder.size(10);
        searchRequest.source(sourceBuilder);
        System.out.println(sourceBuilder.toString());
        // Run the search
        SearchResponse searchResponse = restClient.search(searchRequest, RequestOptions.DEFAULT);
        System.out.println(searchResponse.toString());
        // Map each hit's _source onto an Account object
        SearchHits hits = searchResponse.getHits();
        SearchHit[] searchHits = hits.getHits();
        for (SearchHit searchHit : searchHits) {
            String index = searchHit.getIndex();
            String sourceString = searchHit.getSourceAsString();
            Account account = JSON.parseObject(sourceString, Account.class);
            System.out.println(account);
        }
        // Walk the aggregation results
        Aggregations aggregations = searchResponse.getAggregations();
        Terms ageAgg1 = aggregations.get("ageAgg");
        for (Terms.Bucket bucket : ageAgg1.getBuckets()) {
            System.out.println("Age: " + bucket.getKey() + " ==> count: " + bucket.getDocCount());
            Aggregations aggregations1 = bucket.getAggregations();
            Terms genderAgg = aggregations1.get("genderAgg");
            for (Terms.Bucket genderAggBucket : genderAgg.getBuckets()) {
                System.out.println("Gender: " + genderAggBucket.getKey() + " ==> count: " + genderAggBucket.getDocCount());
            }
            Avg totalAgg = aggregations1.get("totalAgg");
            System.out.println("Average balance for this age: " + totalAgg.getValue());
        }
    }

Output:

Account(account_number=970, balance=19648, firstname=Forbes, lastname=Wallace, age=28, gender=M, address=990 Mill Road, employer=Pheast, email=forbeswallace@pheast.com, city=Lopezo, state=AK)
Account(account_number=136, balance=45801, firstname=Winnie, lastname=Holland, age=38, gender=M, address=198 Mill Lane, employer=Neteria, email=winnieholland@neteria.com, city=Urie, state=IL)
Account(account_number=345, balance=9812, firstname=Parker, lastname=Hines, age=38, gender=M, address=715 Mill Avenue, employer=Baluba, email=parkerhines@baluba.com, city=Blackgum, state=KY)
Account(account_number=472, balance=25571, firstname=Lee, lastname=Long, age=32, gender=F, address=288 Mill Street, employer=Comverges, email=leelong@comverges.com, city=Movico, state=MT)
Age: 38 ==> count: 2
Gender: M ==> count: 2
Average balance for this age: 27806.5
Age: 28 ==> count: 1
Gender: M ==> count: 1
Average balance for this age: 19648.0
Age: 32 ==> count: 1
Gender: F ==> count: 1
Average balance for this age: 25571.0
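
The bucket values in this output can be cross-checked by hand. The sketch below recomputes them with plain Java streams from the age, gender and balance of the four hits shown in the result; it only mirrors what the terms and avg aggregations compute and does not use the Elasticsearch client. AgeBucketSketch and its methods are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AgeBucketSketch {
    /** The three fields of a hit that the aggregations use. */
    static final class Account {
        final int age;
        final String gender;
        final long balance;

        Account(int age, String gender, long balance) {
            this.age = age;
            this.gender = gender;
            this.balance = balance;
        }
    }

    /** The four hits from the search result (ids 970, 136, 345, 472). */
    static List<Account> sampleHits() {
        return Arrays.asList(
                new Account(28, "M", 19648),
                new Account(38, "M", 45801),
                new Account(38, "M", 9812),
                new Account(32, "F", 25571));
    }

    /** Like the avg sub-aggregation: average balance per age bucket. */
    static Map<Integer, Double> avgBalanceByAge(List<Account> accounts) {
        return accounts.stream().collect(Collectors.groupingBy(
                (Account a) -> a.age,
                TreeMap::new,
                Collectors.averagingLong((Account a) -> a.balance)));
    }

    /** Like the nested terms aggregation: document count per gender within each age. */
    static Map<Integer, Map<String, Long>> genderCountByAge(List<Account> accounts) {
        return accounts.stream().collect(Collectors.groupingBy(
                (Account a) -> a.age,
                TreeMap::new,
                Collectors.groupingBy((Account a) -> a.gender, TreeMap::new, Collectors.counting())));
    }

    public static void main(String[] args) {
        System.out.println(avgBalanceByAge(sampleHits()));
        System.out.println(genderCountByAge(sampleHits()));
    }
}
```

For age 38 the average is (45801 + 9812) / 2 = 27806.5, which agrees with the avg#totalAgg value in the response.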

