Overview

There are two main ways to search:

  • URI Search: the query is passed in the URL. Example:

        curl -XGET "http://localhost:9200/test_index/_search?q=customer"

  • Request Body Search: the query is sent as a JSON request body. Example:

        curl -XGET "http://localhost:9200/test_index/_search" -H 'Content-Type: application/json' -d '{
          "query": {
            "match_all": {}
          }
        }'

URI Search

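Example:

    GET /movies/_search?q=2012&df=title&sort=year:desc&from=0&size=10&timeout=1s
    {
      "profile": true
    }

  • q specifies the query, written in Query String Syntax
  • df is the default field; if it is omitted, all fields are searched. q=2012&df=title is equivalent to q=title:2012
  • sort sets the sort order; from and size control pagination
  • profile shows how the query was executed

The specific operations depend on the query type.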
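As a rough illustration of how the URI Search parameters combine into one request, here is a minimal Python sketch. The helper name `build_uri_search` is hypothetical; the sketch only assembles the path and query string with the standard library and does not contact a cluster.

```python
from urllib.parse import urlencode


def build_uri_search(index, params):
    """Return the path and query string for a URI search on `index`."""
    # urlencode percent-encodes reserved characters, so ":" becomes "%3A".
    return f"/{index}/_search?{urlencode(params)}"


# Same parameters as the example request: query, default field, sort, paging, timeout.
url = build_uri_search(
    "movies",
    {"q": "2012", "df": "title", "sort": "year:desc", "from": 0, "size": 10, "timeout": "1s"},
)
print(url)

# Naming the field inside q instead of passing df.
print(build_uri_search("movies", {"q": "title:2012"}))
```

Building the query string from a dict keeps the escaping in one place, which matters once q contains spaces or colons.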

Request Body Search

Example (script_fields adds a script-computed field to each hit; _source lists the fields to return):

    POST /movies/_search
    {
      "script_fields": {
        "new_field": {
          "script": {
            "lang": "painless",
            "source": "doc['order_date'].value+'hello'"
          }
        }
      },
      "_source": ["order_date", "category.keyword"],
      "sort": [{"order_date": "desc"}],
      "from": 10,
      "size": 20,
      "query": {
        "match_all": {}
      }
    }

Using query expressions. A match query in which every term must appear (operator and):

    POST movies/_search
    {
      "query": {
        "match": {
          "title": {
            "query": "last christmas",
            "operator": "and"
          }
        }
      }
    }

A match_phrase query that tolerates one position of slack between the terms (slop 1):

    POST movies/_search
    {
      "query": {
        "match_phrase": {
          "title": {
            "query": "one love",
            "slop": 1
          }
        }
      }
    }
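To make the effect of the operator setting on a match query concrete, here is a toy Python model. It is only a sketch under simplifying assumptions: the "analyzer" just lowercases and splits on whitespace, the function names and sample titles are made up, and no scoring is done.

```python
def tokens(text):
    """Toy analyzer: lowercase the text and split it on whitespace."""
    return text.lower().split()


def match(title, query, operator="or"):
    """Return True if `title` matches `query` under the given operator."""
    title_terms = set(tokens(title))
    hits = [term in title_terms for term in tokens(query)]
    # "and" requires every query term; "or" (the default) requires at least one.
    return all(hits) if operator == "and" else any(hits)


titles = ["Last Christmas", "The Last Samurai", "White Christmas"]  # made-up data
or_hits = [t for t in titles if match(t, "last christmas")]
and_hits = [t for t in titles if match(t, "last christmas", "and")]
print(or_hits)
print(and_hits)
```

With the default operator every title containing either term matches; with and, only the title containing both terms remains.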

Query String & Simple Query String

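In short, Simple Query String disables some of the advanced query syntax and ignores certain syntax errors instead of failing the request.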
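For comparison, the two query types could be written as request bodies like the ones below. This is a sketch rather than a complete reference: the title field and the search terms are assumptions chosen for illustration, and the bodies are only built and printed, not sent.

```python
import json

# query_string understands boolean operators such as AND inside the query text.
query_string_body = {
    "query": {
        "query_string": {
            "default_field": "title",
            "query": "last AND christmas",
        }
    }
}

# simple_query_string does not treat AND as an operator, so the requirement
# that both terms match is expressed through default_operator instead.
simple_query_string_body = {
    "query": {
        "simple_query_string": {
            "query": "last christmas",
            "fields": ["title"],
            "default_operator": "AND",
        }
    }
}

print(json.dumps(query_string_body, indent=2))
print(json.dumps(simple_query_string_body, indent=2))
```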

Mapping

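A mapping is similar to a schema definition in a database. It is used to:

  • Define the names of the fields in an index
  • Define the data type of each field
  • Configure how each field's inverted index is built, such as whether the field is analyzed and which analyzer it uses

Field data types:

  • Simple types
    • Text/Keyword
    • Date
    • Integer/Floating
    • Boolean
    • IPv4/IPv6
  • Complex types: objects and nested objects
  • Special types
    • geo_point & geo_shape/percolator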

Dynamic Mapping

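When a document is written to an index that does not exist yet, the index is created automatically, and Elasticsearch infers the field types from the document's contents.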

Example:

    # Dynamic mapping: infer the field types
    PUT mapping_test/_doc/1
    {
      "uid": "123",
      "isVip": false,
      "isAdmin": "true",
      "age": 19,
      "heigh": 180
    }

    # View the dynamic mapping
    GET mapping_test/_mapping

The result below shows that uid and isAdmin are recognized only as strings: each is mapped as text with a keyword sub-field.

    {
      "mapping_test": {
        "mappings": {
          "properties": {
            "age": {
              "type": "long"
            },
            "heigh": {
              "type": "long"
            },
            "isAdmin": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "isVip": {
              "type": "boolean"
            },
            "uid": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }
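The inference seen in this mapping can be approximated in a few lines of Python. This is a simplified sketch, not Elasticsearch's actual implementation: the function name is hypothetical, and date detection, numeric detection, objects, and arrays are deliberately left out.

```python
def infer_field_mapping(value):
    """Guess the mapping dynamic mapping would create for one JSON value."""
    # bool must be tested before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # Strings become text with a keyword sub-field, even if they look like
        # numbers or booleans ("123", "true").
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    raise TypeError(f"value type not covered by this sketch: {type(value).__name__}")


doc = {"uid": "123", "isVip": False, "isAdmin": "true", "age": 19}
mapping = {name: infer_field_mapping(value) for name, value in doc.items()}
for name, field_mapping in mapping.items():
    print(name, field_mapping["type"])
```

The point to notice is that the JSON type of the value decides the result, not how the value looks: "true" in quotes is a string, so isAdmin ends up as text.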

Dynamic Mapping Settings

    # Set dynamic to false
    PUT dynamic_mapping_test/_mapping
    {
      "dynamic": "false"
    }
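The dynamic setting accepts true, false, and strict. The toy model below sketches how each value treats a document containing a field that is not yet in the mapping. The function name and the use of plain sets are assumptions made for illustration; in all non-strict cases the full document is still stored in _source.

```python
def apply_dynamic(known_fields, doc, dynamic):
    """Return (mapping fields after indexing, fields of `doc` that are searchable)."""
    new_fields = [name for name in doc if name not in known_fields]
    if dynamic == "strict" and new_fields:
        # strict: the whole document is rejected.
        raise ValueError(f"strict mapping rejects new fields: {new_fields}")
    if dynamic == "true":
        # true: new fields are added to the mapping and indexed.
        known_fields = known_fields | set(new_fields)
    # false: the mapping is left unchanged, so new fields are not searchable.
    searchable = {name for name in doc if name in known_fields}
    return known_fields, searchable


doc = {"uid": "1", "age": 19}
for mode in ("true", "false"):
    print(mode, apply_dynamic({"uid"}, doc, mode))

try:
    apply_dynamic({"uid"}, doc, "strict")
except ValueError as error:
    print("strict", error)
```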

Defining a Mapping Explicitly

Example:

    PUT movies
    {
      "mappings": {...}
    }

When creating a mapping for the first time, you can start from the dynamically generated mapping and modify it.

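Common parameters:

  • index: controls whether the field is indexed; defaults to true
  • index_options: controls what the inverted index records; text fields default to positions, all other fields default to docs
    • docs: records the doc id
    • freqs: records the doc id and term frequencies
    • positions: records the doc id, term frequencies, and term positions
    • offsets: records the doc id, term frequencies, term positions, and character offsets
  • null_value: makes null values searchable by indexing a substitute value; it can be set on keyword fields but not on text fields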
  • copy_to: copies the field's value into another field

```json
# Set index to false
PUT users
{
  "mappings": {
    "properties": {
      "firstName": {
        "type": "text"
      },
      "lastName": {
        "type": "text"
      },
      "mobile": {
        "type": "text",
        "index": false
      }
    }
  }
}

# Set null_value (delete the index first so it can be recreated)
DELETE users
PUT users
{
  "mappings": {
    "properties": {
      "firstName": {
        "type": "text"
      },
      "lastName": {
        "type": "text"
      },
      "mobile": {
        "type": "keyword",
        "null_value": "NULL"
      }
    }
  }
}

# Set copy_to
DELETE users
PUT users
{
  "mappings": {
    "properties": {
      "firstName": {
        "type": "text",
        "copy_to": "fullName"
      },
      "lastName": {
        "type": "text",
        "copy_to": "fullName"
      }
    }
  }
}

PUT users/_doc/1
{
  "firstName": "Ruan",
  "lastName": "Yiming"
}

GET users/_search?q=fullName:(Ruan Yiming)

POST users/_search
{
  "query": {
    "match": {
      "fullName": {
        "query": "Ruan Yiming",
        "operator": "and"
      }
    }
  }
}
```
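The index_options levels listed among the common parameters can be pictured with a toy inverted index. The sketch below is only a mental model with a hypothetical function name: tokenization is a lowercase whitespace split, and the offsets level is omitted for brevity.

```python
from collections import defaultdict


def build_postings(docs, index_options="positions"):
    """docs maps doc id to text; return term -> data kept at the given level."""
    postings = defaultdict(dict)
    for doc_id, text in docs.items():
        for position, term in enumerate(text.lower().split()):
            postings[term].setdefault(doc_id, []).append(position)

    if index_options == "docs":
        # Only which documents contain the term.
        return {term: sorted(by_doc) for term, by_doc in postings.items()}
    if index_options == "freqs":
        # Doc ids plus how often the term occurs in each document.
        return {
            term: {doc_id: len(pos) for doc_id, pos in by_doc.items()}
            for term, by_doc in postings.items()
        }
    if index_options == "positions":
        # Doc ids, frequencies (the list lengths), and the term positions.
        return {term: dict(by_doc) for term, by_doc in postings.items()}
    raise ValueError(f"level not covered by this sketch: {index_options}")


docs = {1: "to be or not to be", 2: "to do"}
print(build_postings(docs, "docs")["to"])
print(build_postings(docs, "freqs")["to"])
print(build_postings(docs, "positions")["be"])
```

Each level stores strictly more than the previous one, which is why phrase queries need at least positions while a plain existence check only needs docs.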

Multi-fields

  • In a mapping, keyword means an exact value that is not tokenized, while text is tokenized.
  • When a mapping is generated dynamically for a text field, a keyword sub-field is added automatically for exact-match search.
  • You can also add custom sub-fields that use custom analyzers, to support searching under different conditions.

Notes on custom analysis:

  1. Character Filters preprocess the text before tokenization, for example by adding, removing, or replacing characters. Several can be configured. Built-in Character Filters:
    • HTML strip: removes HTML tags
    • Mapping: string replacement
    • Pattern replace: regular-expression replacement
  2. The Tokenizer splits the original text into terms according to a set of rules. Built-in Tokenizers: whitespace/standard/uax_url_email/pattern/keyword/path hierarchy
  3. Token Filters add, modify, or remove the terms emitted by the Tokenizer. Built-in Token Filters: Lowercase/stop/synonym (adds synonyms)

```json
PUT logs/_doc/1
{"level":"DEBUG"}

GET /logs/_mapping

# keyword tokenizer with the html_strip char filter
POST _analyze
{
  "tokenizer": "keyword",
  "char_filter": ["html_strip"],
  "text": "<b>hello world</b>"
}

# path_hierarchy tokenizer
POST _analyze
{
  "tokenizer": "path_hierarchy",
  "text": "/user/ymruan/a/b/c/d/e"
}

# Use a mapping char filter to replace characters
POST _analyze
{
  "tokenizer": "standard",
  "char_filter": [
    {
      "type": "mapping",
      "mappings": ["- => _"]
    }
  ],
  "text": "123-456, I-test! test-990 650-555-1234"
}

# Use a mapping char filter to replace emoticons
POST _analyze
{
  "tokenizer": "standard",
  "char_filter": [
    {
      "type": "mapping",
      "mappings": [":) => happy", ":( => sad"]
    }
  ],
  "text": ["I am feeling :)", "Feeling :( today"]
}

# whitespace tokenizer with the stop and snowball filters
GET _analyze
{
  "tokenizer": "whitespace",
  "filter": ["stop", "snowball"],
  "text": ["The girls in China are playing this game!"]
}

# whitespace tokenizer with the stop and snowball filters, second sentence
GET _analyze
{
  "tokenizer": "whitespace",
  "filter": ["stop", "snowball"],
  "text": ["The rain in Spain falls mainly on the plain."]
}

# With lowercase added before stop, "The" is lowercased and removed as a stopword
GET _analyze
{
  "tokenizer": "whitespace",
  "filter": ["lowercase", "stop", "snowball"],
  "text": ["The girls in China are playing this game!"]
}

# Regular-expression replacement with pattern_replace
GET _analyze
{
  "tokenizer": "standard",
  "char_filter": [
    {
      "type": "pattern_replace",
      "pattern": "http://(.*)",
      "replacement": "$1"
    }
  ],
  "text": "http://www.elastic.co"
}
```
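The three analysis stages (character filters, then a tokenizer, then token filters) can be mimicked with ordinary functions. The following is a toy sketch, not how Elasticsearch is implemented: all function names are hypothetical, the stop list is a small subset of the English defaults, and stemming (snowball) is left out.

```python
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "on", "the", "this", "to"}


def mapping_char_filter(text, mappings):
    """Character filter: replace each source string with its target."""
    for source, target in mappings.items():
        text = text.replace(source, target)
    return text


def whitespace_tokenizer(text):
    """Tokenizer: split on whitespace only, keeping case and punctuation."""
    return text.split()


def lowercase_filter(tokens):
    return [token.lower() for token in tokens]


def stop_filter(tokens):
    # Case-sensitive comparison, so "The" survives unless it was lowercased first.
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text, char_filters=(), tokenizer=whitespace_tokenizer, token_filters=()):
    """Run the stages in order: char filters, tokenizer, token filters."""
    for char_filter in char_filters:
        text = char_filter(text)
    tokens = tokenizer(text)
    for token_filter in token_filters:
        tokens = token_filter(tokens)
    return tokens


sentence = "The girls in China are playing this game!"
print(analyze(sentence, token_filters=[stop_filter]))
print(analyze(sentence, token_filters=[lowercase_filter, stop_filter]))

emoticons = {":)": "happy", ":(": "sad"}
print(analyze("I am feeling :)", char_filters=[lambda t: mapping_char_filter(t, emoticons)]))
```

The first two calls show why filter order matters: the stop filter only removes "The" when the lowercase filter has already run.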

Index Template & Dynamic Template

Index Template

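An index template fixes a set of mappings and settings, which are then applied automatically to newly created indices whose names match its pattern.

  • A template only takes effect when an index is created; changing a template does not affect existing indices
  • Multiple index templates can be defined, each with an order; the settings of all matching templates are merged according to the rules below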
  • Precedence: settings specified by the user > Index Template with a higher order > Index Template with a lower order

```json
# Create a default template
PUT _template/template_default
{
  "index_patterns": ["*"],
  "order": 0,
  "version": 1,
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

PUT /_template/template_test
{
  "index_patterns": ["test*"],
  "order": 1,
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 2
  },
  "mappings": {
    "date_detection": false,
    "numeric_detection": true
  }
}

# View template information
GET /_template/template_default
GET /_template/temp*

DELETE /_template/template_default
DELETE /_template/template_test
```
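The precedence rule can be sketched as a small merge routine. This is an illustrative model only: `resolve_settings` is a hypothetical name, pattern matching is approximated with `fnmatchcase`, and settings are merged as a flat dict rather than the nested merge a real cluster performs.

```python
from fnmatch import fnmatchcase

TEMPLATES = [
    {"index_patterns": ["*"], "order": 0,
     "settings": {"number_of_shards": 1, "number_of_replicas": 1}},
    {"index_patterns": ["test*"], "order": 1,
     "settings": {"number_of_shards": 1, "number_of_replicas": 2}},
]


def resolve_settings(index_name, templates, request_settings=None):
    """Merge the settings of all matching templates, then the request's own."""
    matching = [
        template for template in templates
        if any(fnmatchcase(index_name, pattern) for pattern in template["index_patterns"])
    ]
    settings = {}
    # Apply lower order first so that higher-order templates override it.
    for template in sorted(matching, key=lambda template: template["order"]):
        settings.update(template["settings"])
    # Settings given in the create-index request win over every template.
    settings.update(request_settings or {})
    return settings


print(resolve_settings("logs", TEMPLATES))
print(resolve_settings("test_index", TEMPLATES))
print(resolve_settings("test_index", TEMPLATES, {"number_of_replicas": 5}))
```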

Dynamic Template

A dynamic template applies to one specific index and lets you define custom rules for inferring field types, for example:

  • Map every field whose name starts with is to boolean

```json
# Dynamic mapping: inferred from the value's type and the field name
DELETE my_index

PUT my_index/_doc/1
{
  "firstName": "Ruan",
  "isVIP": "true"
}

GET my_index/_mapping

DELETE my_index

# Example: map strings named is* to boolean, and all other strings to keyword
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_boolean": {
          "match_mapping_type": "string",
          "match": "is*",
          "mapping": {
            "type": "boolean"
          }
        }
      },
      {
        "strings_as_keywords": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "keyword"
          }
        }
      }
    ]
  }
}

DELETE my_index

# Example: match on the field path to combine the parts of a name
PUT my_index
{
  "mappings": {
    "dynamic_templates": [
      {
        "full_name": {
          "path_match": "name.*",
          "path_unmatch": "*.middle",
          "mapping": {
            "type": "text",
            "copy_to": "full_name"
          }
        }
      }
    ]
  }
}

PUT my_index/_doc/1
{
  "name": {
    "first": "John",
    "middle": "Winston",
    "last": "Lennon"
  }
}

GET my_index/_search?q=full_name:John
```
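How a field gets paired with a dynamic template can be sketched as follows. This is a simplified model with a hypothetical function name: only match_mapping_type, match, path_match, and path_unmatch are handled, wildcards are approximated with `fnmatchcase`, and the first template whose conditions all hold is chosen.

```python
from fnmatch import fnmatchcase


def pick_template(templates, field_path, json_type):
    """Return the name of the first dynamic template matching the field, or None."""
    field_name = field_path.rsplit(".", 1)[-1]  # match tests the last path segment
    for template in templates:
        name, rule = next(iter(template.items()))
        if "match_mapping_type" in rule and rule["match_mapping_type"] != json_type:
            continue
        if "match" in rule and not fnmatchcase(field_name, rule["match"]):
            continue
        if "path_match" in rule and not fnmatchcase(field_path, rule["path_match"]):
            continue
        if "path_unmatch" in rule and fnmatchcase(field_path, rule["path_unmatch"]):
            continue
        return name
    return None


type_rules = [
    {"strings_as_boolean": {"match_mapping_type": "string", "match": "is*"}},
    {"strings_as_keywords": {"match_mapping_type": "string"}},
]
path_rules = [
    {"full_name": {"path_match": "name.*", "path_unmatch": "*.middle"}},
]

print(pick_template(type_rules, "isVIP", "string"))
print(pick_template(type_rules, "firstName", "string"))
print(pick_template(path_rules, "name.first", "string"))
print(pick_template(path_rules, "name.middle", "string"))
```

Because the first match wins, the narrower is* rule has to be listed before the catch-all string rule; with the order reversed, isVIP would be mapped as keyword.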

Aggregations

Aggregations are the Elasticsearch feature for statistical analysis of data.

Introduction to Elasticsearch aggregations

Course demo. It requires the sample flight data, imported through Kibana's Sample Data. For details see "Section 2.2: Installing Kibana and a Quick Tour of the Interface".

    # Bucket the flights by destination country
    GET kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "flight_dest": {
          "terms": {
            "field": "DestCountry"
          }
        }
      }
    }

    # Per-destination statistics: add the average, maximum, and minimum price
    GET kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "flight_dest": {
          "terms": {
            "field": "DestCountry"
          },
          "aggs": {
            "avg_price": {
              "avg": {
                "field": "AvgTicketPrice"
              }
            },
            "max_price": {
              "max": {
                "field": "AvgTicketPrice"
              }
            },
            "min_price": {
              "min": {
                "field": "AvgTicketPrice"
              }
            }
          }
        }
      }
    }

    # Price statistics plus weather information
    GET kibana_sample_data_flights/_search
    {
      "size": 0,
      "aggs": {
        "flight_dest": {
          "terms": {
            "field": "DestCountry"
          },
          "aggs": {
            "stats_price": {
              "stats": {
                "field": "AvgTicketPrice"
              }
            },
            "weather": {
              "terms": {
                "field": "DestWeather",
                "size": 5
              }
            }
          }
        }
      }
    }
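What the nested aggregation computes can be reproduced on a handful of rows in plain Python. The sketch below is a conceptual model only: the three flight rows are made-up sample values, not taken from the Kibana data set, and `terms_with_stats` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up rows that reuse the field names of the sample flight data.
flights = [
    {"DestCountry": "IT", "AvgTicketPrice": 500.0},
    {"DestCountry": "IT", "AvgTicketPrice": 700.0},
    {"DestCountry": "US", "AvgTicketPrice": 300.0},
]


def terms_with_stats(rows, bucket_field, metric_field):
    """Group rows into buckets (like terms), then compute metrics per bucket."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[bucket_field]].append(row[metric_field])
    return {
        key: {
            "doc_count": len(values),
            "avg": sum(values) / len(values),
            "max": max(values),
            "min": min(values),
        }
        for key, values in buckets.items()
    }


result = terms_with_stats(flights, "DestCountry", "AvgTicketPrice")
for country, stats in result.items():
    print(country, stats)
```

The outer grouping corresponds to the terms bucket aggregation and the per-group numbers to the avg, max, and min metric aggregations nested inside it.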
Related reading:
https://www.elastic.co/guide/en/elasticsearch/reference/7.1/search-aggregations.html