Basic Operations

Index Operations

1. Create an index

curl -XPUT localhost:9200/index01

2. View index settings

curl -XGET http://192.168.168.101:9200/index01/_settings
curl -XGET http://192.168.168.101:9200/index01,blog/_settings

3. Delete an index

curl -XDELETE http://192.168.168.101:9200/index02

4. Close and open an index

curl -XPOST http://192.168.168.101:9200/index01/_close
curl -XPOST http://192.168.168.101:9200/index01/_open

Document Management

1. Create a document

curl -XPUT -d '{"id":1,"title":"es简介"}' http://localhost:9200/index01/article/1

2. Get a document

curl -XGET http://192.168.168.101:9200/index01/article/1

3. Delete a document

curl -XDELETE http://192.168.168.101:9200/index01/article/1

Query Operations

Lucene-style queries

_exists_:execution_completed_time
__type:company_extended_business
weibo_type:18 OR weibo_type:24 OR weibo_type:25
NOT company_id:442966
first_consume_time:{"2019-01-03 00:00:00" TO "2019-01-03 00:00:00"}

Note: in a range expression, curly braces exclude the bounds and square brackets include them.

Basic queries

Specify the request header

--header "Content-Type: application/json"

Prepare data

curl -XPUT -d '{"id":1,"title":"es简介","content":"es好用好用真好用"}' http://192.168.168.101:9200/index01/article/1
curl -XPUT -d '{"id":2,"title":"java编程思想","content":"这就是个工具书"}' http://192.168.168.101:9200/index01/article/2
curl -XPUT -d '{"id":3,"title":"大数据简介","content":"你知道什么是大数据吗,就是大数据"}' http://192.168.168.101:9200/index01/article/3

term query

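curl -XGET http://192.168.168.101:9200/index01/_search -d '{"query":{"term":{"title":"你好"}}}'

When you match a field against a single value, use term rather than terms. Use terms only when matching against several values; its syntax requires the values to be given as a JSON array.
match analyzes (tokenizes) the search text and then matches on the resulting tokens, whereas term looks up the search text exactly as given, without analysis. As a rule, use match for full-text (fuzzy) search and term for exact lookups.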
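To make the term/match distinction concrete, here is a minimal Python sketch. It only builds the request bodies and imitates analysis with a toy lowercase-and-split tokenizer. The helper names (term_body, terms_body, match_body, toy_analyze, toy_term_hit, toy_match_hit) are hypothetical, and the tokenizer is an assumption for illustration; a real analyzer such as IK splits text differently.

```python
def term_body(field, value):
    """Exact lookup of one value; the value is sent as-is, not analyzed."""
    return {"query": {"term": {field: value}}}


def terms_body(field, values):
    """Exact lookup of several values; they must form a list (JSON array)."""
    if not isinstance(values, (list, tuple)):
        raise TypeError("terms expects a list of values")
    return {"query": {"terms": {field: list(values)}}}


def match_body(field, text):
    """Full-text lookup; the server analyzes the text before matching."""
    return {"query": {"match": {field: text}}}


def toy_analyze(text):
    """Toy stand-in for an analyzer: lowercase, then split on whitespace."""
    return text.lower().split()


def toy_term_hit(indexed_text, value):
    """term: the raw value must equal one of the indexed tokens."""
    return value in toy_analyze(indexed_text)


def toy_match_hit(indexed_text, text):
    """match (default OR): any analyzed query token may hit."""
    indexed = set(toy_analyze(indexed_text))
    return any(token in indexed for token in toy_analyze(text))
```

With this toy analyzer, toy_term_hit("Quick Brown Fox", "Quick") is False because the indexed tokens were lowercased, while toy_match_hit("Quick Brown Fox", "Quick dog") is True because the query text is analyzed the same way as the indexed text.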

terms query

{
  "query": {
    "terms": {
      "tag": ["search", "nosql", "hello"]
    }
  }
}

match query

{"query": {"match": {"title": "你好"}}}

{
  "query": {
    "match": {
      "__type": "info"
    }
  },
  "sort": [
    {
      "campaign_end_time": {
        "order": "desc"
      }
    }
  ]
}

match_all

{"query": {"match_all": {}}}

match_all matches every document and takes no field condition.

multi match

Matches one query string against several fields.

{
  "query": {
    "multi_match": {
      "query": "运动 上衣",
      "fields": [
        "brandName^100",
        "brandName.brandName_pinyin^100",
        "brandName.brandName_keyword^100",
        "sortName^80",
        "sortName.sortName_pinyin^80",
        "productName^60",
        "productKeyword^20"
      ],
      "type": <multi-match-type>,
      "operator": "AND"
    }
  }
}

Replace <multi-match-type> with a concrete type such as best_fields, most_fields, cross_fields, phrase, or phrase_prefix.

Bool query

A bool query has four clause types: must, filter, should, and must_not.

{
  "query": {
    "bool": {
      "must": [
        {
          "term": {
            "_type": {
              "value": "age"
            }
          }
        },
        {
          "term": {
            "account_grade": {
              "value": "23"
            }
          }
        }
      ]
    }
  }
}

{
  "query": {
    "bool": {
      "must": {
        "term": {"user": "lucy"}
      },
      "filter": {
        "term": {"tag": "teach"}
      },
      "should": [
        {"term": {"tag": "wow"}},
        {"term": {"tag": "elasticsearch"}}
      ],
      "minimum_should_match": 1,
      "boost": 1.0
    }
  }
}
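How the four clause types combine can be sketched in plain Python. This is a toy evaluator over in-memory dictionaries and only illustrates the matching logic. It supports term clauses only and ignores scoring and boost. The function names (as_list, term_matches, bool_matches) are hypothetical.

```python
def as_list(value):
    """Normalize a missing value, a single item, or a list to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def term_matches(doc, clause):
    """Evaluate {"term": {field: value}} or {"term": {field: {"value": v}}}."""
    (field, value), = clause["term"].items()
    if isinstance(value, dict):
        value = value["value"]
    # A field may hold several values; one equal value is enough.
    return value in as_list(doc.get(field))


def bool_matches(doc, bool_query):
    """Return True if doc satisfies the bool clauses (no scoring)."""
    required = as_list(bool_query.get("must")) + as_list(bool_query.get("filter"))
    must_not = as_list(bool_query.get("must_not"))
    should = as_list(bool_query.get("should"))
    if not all(term_matches(doc, c) for c in required):
        return False
    if any(term_matches(doc, c) for c in must_not):
        return False
    # Without must/filter, at least one should clause has to match.
    default_min = 1 if (should and not required) else 0
    minimum = bool_query.get("minimum_should_match", default_min)
    hits = sum(1 for c in should if term_matches(doc, c))
    return hits >= minimum
```

For the second request above, a document with user lucy and tags teach and wow matches, while one tagged only teach does not, because minimum_should_match requires at least one should clause to hit.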

Filter query

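Difference between query and filter: a query evaluates the condition, computes a relevance score for each matching document, and then returns the results. A filter only decides whether each document matches; that yes/no result can be cached, whether the document matched or not.
Filters are faster because their results can be cached and no score is computed.

{
  "query": {
    "bool": {
      "must": [
        {"match_all": {}}
      ],
      "filter": {
        "range": {
          "create_admin_id": {
            "gte": 10,
            "lte": 20
          }
        }
      }
    }
  }
}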

range query

{
  "query": {
    "range": {
      "age": {
        "gte": 20,
        "lte": 30
      }
    }
  }
}

Wildcard query

{
  "query": {
    "wildcard": {
      "title": "cr?me"
    }
  }
}
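The wildcard semantics (? matches exactly one character, * matches zero or more, and the pattern must cover the whole term) can be imitated in Python by translating the pattern to a regular expression. This is a sketch for intuition only. The helper names are hypothetical, and escape handling inside the pattern is deliberately left out.

```python
import re


def wildcard_to_regex(pattern):
    """Translate ? and * into regex equivalents; escape everything else."""
    parts = []
    for ch in pattern:
        if ch == "?":
            parts.append(".")       # exactly one character
        elif ch == "*":
            parts.append(".*")      # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def wildcard_hit(term, pattern):
    """True if the whole term matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(term) is not None
```

With the pattern cr?me, the terms creme and crame match, while crme does not because ? needs exactly one character.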

Regular expression query

{
  "query": {
    "regexp": {
      "title": {
        "value": "cr.m[ae]",
        "boost": 10.0
      }
    }
  }
}
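A regexp pattern has to match an entire term, not a substring of it. The Python sketch below imitates that with re.fullmatch. It assumes the pattern uses only constructs that behave the same in Python and in Lucene regular expressions (literals, the dot, character classes); the function name regexp_hit is hypothetical.

```python
import re


def regexp_hit(term, pattern):
    """True if the pattern matches the whole term (anchored at both ends)."""
    return re.fullmatch(pattern, term) is not None


# Candidate terms checked against the pattern from the request above.
candidates = ["crema", "creme", "cremes", "cremx"]
matching = [t for t in candidates if regexp_hit(t, "cr.m[ae]")]
```

Here only crema and creme end up in matching: cremes is one character too long and cremx ends in a letter outside [ae].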

Prefix query

{
  "query": {
    "match_phrase_prefix": {
      "title": {
        "query": "crime punish",
        "slop": 1
      }
    }
  }
}

query_string

{
  "query": {
    "query_string": {
      "query": "title:crime^10 +title:punishment -otitle:cat +author:(+Fyodor +dostoevsky)"
    }
  }
}

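Aggregation queries

Aggregations let you group documents and compute statistics over them; they are roughly the counterpart of GROUP BY and aggregate functions in SQL.
There are two main kinds: metric aggregations and bucket aggregations.
Metrics: compute a value such as avg or max over the filtered document set; the result is a single number.
Buckets: split the filtered document set into smaller sets by some criterion; metrics can then be applied to each of those sets separately.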
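The relationship between buckets and metrics can be sketched in a few lines of Python over in-memory documents. This is only an analogy to what the server does. The function names (avg_metric, bucket_by, avg_per_bucket) and the sample fields are hypothetical.

```python
from collections import defaultdict


def avg_metric(docs, field):
    """Metric: reduce a document set to a single number (here the mean)."""
    values = [d[field] for d in docs if field in d]
    return sum(values) / len(values) if values else None


def bucket_by(docs, field):
    """Bucket: split a document set into smaller sets by field value."""
    buckets = defaultdict(list)
    for d in docs:
        if field in d:
            buckets[d[field]].append(d)
    return dict(buckets)


def avg_per_bucket(docs, bucket_field, metric_field):
    """Apply the metric to each bucket separately, like GROUP BY with AVG."""
    return {
        key: avg_metric(group, metric_field)
        for key, group in bucket_by(docs, bucket_field).items()
    }


docs = [
    {"gender": "F", "age": 20},
    {"gender": "F", "age": 30},
    {"gender": "M", "age": 40},
]
overall = avg_metric(docs, "age")                  # one number for all docs
per_gender = avg_per_bucket(docs, "gender", "age")  # one number per bucket
```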

max/min/avg/sum/stats

{
  "aggs": {
    "group_sum": {
      "sum": {
        "field": "money"
      }
    }
  }
}

{
  "aggs": {
    "avg_fees": {
      "avg": {
        "field": "fees"
      }
    }
  }
}

terms aggregation

The terms aggregation groups documents by the values of a field. field names the field to group by, size sets how many buckets to return, shard_size sets how many buckets each shard returns, and order sets the sort order. include and exclude filter the values with regular expressions, and missing supplies a default value for documents that lack the field.

{
  "aggs": {
    "group_by_type": {
      "terms": {
        "field": "_type"
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "terms": {
      "terms": {
        "field": "__type",
        "size": 10
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "terms": {
      "terms": {
        "field": "__type",
        "size": 10,
        "order": {
          "_count": "asc"
        }
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "agg_terms": {
      "terms": {
        "field": "cost",
        "order": {
          "_count": "asc"
        }
      },
      "aggs": {
        "max_balance": {
          "max": {
            "field": "cost"
          }
        }
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "agg_terms": {
      "terms": {
        "field": "cost",
        "include": ".*",
        "exclude": ".*"
      }
    }
  }
}
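The effect of size, order, and missing on a terms aggregation can be imitated with a counter over in-memory documents. This sketch treats the data as a single shard, so shard_size is not modeled, and it breaks ties by key as an assumption. The function name terms_agg is hypothetical.

```python
from collections import Counter


def terms_agg(docs, field, size=10, order="desc", missing=None):
    """Count documents per field value and return the top buckets."""
    counts = Counter()
    for d in docs:
        key = d.get(field, missing)   # missing acts as a default value
        if key is None:
            continue                  # no value and no default: skip the doc
        counts[key] += 1

    def sort_key(item):
        key, count = item
        primary = -count if order == "desc" else count
        return (primary, str(key))    # tie-break by key (an assumption)

    ordered = sorted(counts.items(), key=sort_key)
    return [{"key": k, "doc_count": c} for k, c in ordered[:size]]
```

For example, with six documents whose type values are a, b, a, c, a, b, calling terms_agg(docs, "type", size=2) keeps only the two largest buckets (a with 3 documents and b with 2), and order="asc" lists the smallest bucket first.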

cardinality (distinct count)

{
  "size": 0,
  "aggs": {
    "count_type": {
      "cardinality": {
        "field": "__type"
      }
    }
  }
}

percentiles

percentiles sorts the values of the given field (or script) in ascending order, accumulates the share of matching documents up to each value, and returns the value found at each requested percentage. By default it returns the values at the [ 1, 5, 25, 50, 75, 95, 99 ] percentiles.

{
  "size": 0,
  "aggs": {
    "age_percents": {
      "percentiles": {
        "field": "age",
        "percents": [
          1,
          5,
          25,
          50,
          75,
          95,
          99
        ]
      }
    }
  }
}

{
  "size": 0,
  "aggs": {
    "states": {
      "terms": {
        "field": "gender"
      },
      "aggs": {
        "balances": {
          "percentile_ranks": {
            "field": "balance",
            "values": [
              20000,
              40000
            ]
          }
        }
      }
    }
  }
}

percentile_ranks

Reports the percentage of documents whose value is less than or equal to each given value.

{
  "size": 0,
  "aggs": {
    "tests": {
      "percentile_ranks": {
        "field": "age",
        "values": [
          10,
          15
        ]
      }
    }
  }
}
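What a percentile rank means can be shown with an exact computation over a small list. The server works with an approximation algorithm on large data sets, so treat this only as a definition of the quantity. The function name percentile_ranks_exact and the sample ages are hypothetical.

```python
def percentile_ranks_exact(values, thresholds):
    """For each threshold, the percentage of values <= that threshold."""
    if not values:
        return {t: None for t in thresholds}
    total = len(values)
    return {
        t: 100.0 * sum(1 for v in values if v <= t) / total
        for t in thresholds
    }


ages = [5, 10, 12, 15, 20]
ranks = percentile_ranks_exact(ages, [10, 15])
```

Two of the five ages are at most 10 and four are at most 15, so ranks maps 10 to 40.0 and 15 to 80.0.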

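filter aggregation

The filter aggregation computes aggregations only over documents that satisfy a filter: from the documents matched by the query, it selects those that meet the filter condition and aggregates them. In short, filter first, then aggregate.

{
  "size": 0,
  "aggs": {
    "agg_filter": {
      "filter": {
        "match": {"gender": "F"}
      },
      "aggs": {
        "avgs": {
          "avg": {
            "field": "age"
          }
        }
      }
    }
  }
}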

filters aggregation

Computes aggregations over several named filter groups.

{
  "size": 0,
  "aggs": {
    "message": {
      "filters": {
        "filters": {
          "errors": {
            "exists": {
              "field": "__type"
            }
          },
          "warnings": {
            "term": {
              "__type": "info"
            }
          }
        }
      }
    }
  }
}

range aggregation

{
  "aggs": {
    "agg_range": {
      "range": {
        "field": "cost",
        "ranges": [
          {
            "from": 50,
            "to": 70
          },
          {
            "from": 100
          }
        ]
      },
      "aggs": {
        "bmax": {
          "max": {
            "field": "cost"
          }
        }
      }
    }
  }
}

date_range aggregation

{
  "aggs": {
    "date_aggrs": {
      "date_range": {
        "field": "accepted_time",
        "format": "MM-yyy",
        "ranges": [
          {
            "from": "now-10d/d",
            "to": "now"
          }
        ]
      }
    }
  }
}

date_histogram

A date histogram aggregates by time interval, such as by day, month, or year. Supported intervals are year (1y), quarter (1q), month (1M), week (1w), day (1d), hour (1h), minute (1m), second (1s), or a custom time interval.

{
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "accepted_time",
        "interval": "quarter",
        "min_doc_count": 0, // also return intervals that contain no documents
        "extended_bounds": { // force the returned buckets to cover this range
          "min": "2014-01-01",
          "max": "2014-12-31"
        }
      }
    }
  }
}
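How min_doc_count 0 and extended_bounds interact can be sketched with a small quarterly histogram in Python. It works on in-memory dates, ignores time zones, and is not the server's implementation. The function names (quarter_start, next_quarter, quarterly_histogram) and the sample dates are hypothetical.

```python
from datetime import date


def quarter_start(d):
    """First day of the calendar quarter that contains d."""
    month = 3 * ((d.month - 1) // 3) + 1
    return date(d.year, month, 1)


def next_quarter(q):
    """First day of the quarter after q (q must be a quarter start)."""
    if q.month == 10:
        return date(q.year + 1, 1, 1)
    return date(q.year, q.month + 3, 1)


def quarterly_histogram(dates, bounds_min=None, bounds_max=None):
    """Count dates per quarter, emitting empty quarters as zero buckets."""
    counts = {}
    for d in dates:
        q = quarter_start(d)
        counts[q] = counts.get(q, 0) + 1
    keys = list(counts)
    # extended_bounds: stretch the bucket range even where no data exists.
    if bounds_min is not None:
        keys.append(quarter_start(bounds_min))
    if bounds_max is not None:
        keys.append(quarter_start(bounds_max))
    if not keys:
        return []
    q, last = min(keys), max(keys)
    buckets = []
    while q <= last:
        buckets.append((q.isoformat(), counts.get(q, 0)))  # 0 = empty bucket
        q = next_quarter(q)
    return buckets


sales = [date(2014, 2, 3), date(2014, 2, 20), date(2014, 8, 1)]
histogram = quarterly_histogram(sales, date(2014, 1, 1), date(2014, 12, 31))
```

With the bounds set to the whole of 2014, histogram holds four buckets, including the empty second and fourth quarters; without the bounds it would stop at the third quarter.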

missing aggregation

{
  "aggs": {
    "account_missing": {
      "missing": {
        "field": "__type"
      }
    }
  }
}

Logstash Operations

Start Logstash

logstash -e 'input{stdin{}}output{stdout{codec=>rubydebug}}'

IK analyzer

curl -XPOST http://192.168.168.101:9200/_analyze -d '{"analyzer":"ik","text":"JAVA编程思想"}'
http://192.168.168.101:9200/index01/_analyze?analyzer=ik&text=%E4%B8%AD%E5%8D%8E%E4%BA%BA%E6%B0%91%E5%85%B1%E5%92%8C%E5%9B%BD
curl -XPUT -d '{"id":1,"kw":"我们都爱中华人民共和国"}' http://192.168.168.101:9200/haha1/haha/1

Mapping

View the mapping
curl -XGET http://192.168.168.101:9200/jtdb_item/tb_item/_mapping

Examples:

Paginated query

GET test*/_search
{
  "size": 10,
  "from": 0,
  "query": {
    "term": {
      "member_age": 62
    }
  }
}
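In the request above, from is the offset of the first hit and size is the page length, so a page number has to be converted to an offset. A minimal Python sketch of that conversion follows; the function name page_body and the 1-based page numbering are assumptions for illustration.

```python
def page_body(page, page_size, query):
    """Build a search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {
        "from": (page - 1) * page_size,  # offset of the first hit on the page
        "size": page_size,               # number of hits per page
        "query": query,
    }


# Third page of ten hits for the same term query as above.
body = page_body(3, 10, {"term": {"member_age": 62}})
```

Deep pages get expensive with from/size: by default a request is rejected once from plus size exceeds 10000 (the index.max_result_window setting).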

Update a document by ID

POST {index}/_update/{id}
{
  "doc": {
    "key": "value"
  }
}

Update documents that match a query: POST {index}/_update_by_query

Example: find documents whose member_gender is 男 and change it to 女.

POST {index}/_update_by_query
{
  "script": {
    "source": "ctx._source.member_gender = params.member_gender",
    "params": {
      "member_gender": "女"
    }
  },
  "query": {
    "term": {
      "member_gender": "男"
    }
  }
}
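The effect of the request above (select by a term condition, then run the script on each selected document) can be mimicked on in-memory dictionaries. This Python sketch only illustrates the select-then-modify flow; it is not how the server executes the update, and the function name update_by_query_local is hypothetical.

```python
def update_by_query_local(docs, field, old_value, new_value):
    """Set field to new_value on every doc where it equals old_value."""
    updated = 0
    for doc in docs:
        if doc.get(field) == old_value:   # the query part: term match
            doc[field] = new_value        # the script part: assign the param
            updated += 1
    return updated                        # number of modified documents


members = [
    {"name": "a", "member_gender": "男"},
    {"name": "b", "member_gender": "女"},
    {"name": "c", "member_gender": "男"},
]
changed = update_by_query_local(members, "member_gender", "男", "女")
```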

Aggregation query: max/min combined with a query condition

POST wipro-headpic/_search?pretty
{
  # query condition
  "query": {
    "term": {
      "member_age": 3
    }
  },
  # sort by field
  "sort": [
    {
      "member_age": {
        "order": "asc"
      }
    }
  ],
  "size": 2, # number of hits to return
  # aggregation to compute
  "aggs": {
    "maxage": {
      "max": {
        "field": "member_age"
      }
    }
  }
}

Distinct count. Written this way, the cardinality field must be a numeric type such as int; on a text field the aggregation fails.

POST wipro-headpic/_search?size=0
{
  "aggs": {
    "age_count": {
      "value_count": {
        "field": "ordernum"
      }
    },
    "name_count": {
      "cardinality": {
        "field": "ordernum"
      }
    }
  }
}
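The difference between the two aggregations in this request is that value_count counts every value while cardinality counts distinct values. A small Python sketch over in-memory documents shows it. It computes exact numbers, whereas the server's cardinality is an approximation on large data sets. The function names and sample order numbers are hypothetical.

```python
def value_count_local(docs, field):
    """Number of documents that carry a value for the field."""
    return sum(1 for d in docs if d.get(field) is not None)


def cardinality_local(docs, field):
    """Number of distinct values of the field (exact, unlike the server)."""
    return len({d[field] for d in docs if d.get(field) is not None})


orders = [
    {"ordernum": 1001},
    {"ordernum": 1002},
    {"ordernum": 1001},
    {"note": "no order number"},
]
total_values = value_count_local(orders, "ordernum")
distinct_values = cardinality_local(orders, "ordernum")
```

Here total_values is 3 and distinct_values is 2, because order number 1001 appears twice and the last document has no ordernum at all.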

Distinct count. For a text field, append the .keyword suffix to the field name to avoid the illegal_argument_exception error.

POST wipro-headpic/_search?size=1
{
  "aggs": {
    "age_count": {
      "value_count": {
        "field": "ordernum"
      }
    },
    "ordernum_count": {
      "cardinality": {
        "field": "ordernum.keyword"
      }
    }
  }
}

View memory usage

GET /_cat/segments/wipro-headpic?v&h=shard,segments,size,size.memory

Total memory used by all segments:

GET /_cat/nodes?v&h=name,port,sm