1 Scoring Mechanism in Detail
1.1 The TF/IDF Scoring Mechanism
1.1.1 Algorithm Overview
- Elasticsearch scores documents with the term frequency/inverse document frequency algorithm, abbreviated TF/IDF. TF stands for term frequency, IDF for inverse document frequency.
- TF (Term Frequency): how many times each term of the search text occurs in a document's field. The more occurrences, the more relevant the document.
- TF example: index: medicine, search request: amoxicillin.
- doc1: What are amoxicillin capsules... What can amoxicillin capsules do... The composition of amoxicillin capsules.
- doc2: This pharmacy stocks amoxicillin capsules, erythromycin capsules, penicillin capsules...
- Result: doc1 > doc2.
- IDF (Inverse Document Frequency): how many documents in the whole index contain each term of the search text. The more documents a term occurs in, the less relevant it is.
- IDF example: index: medicine, search request: City A amoxicillin capsules.
- doc1: About City A Health Pharmacy. This pharmacy stocks amoxicillin capsules, erythromycin capsules, penicillin capsules...
- doc2: About City B Minsheng Pharmacy. This pharmacy stocks amoxicillin capsules, erythromycin capsules, penicillin capsules...
- doc3: About City C Future Pharmacy. This pharmacy stocks amoxicillin capsules, erythromycin capsules, penicillin capsules...
- Conclusion: terms that appear in few documents across the index carry a higher relevance weight.
- Field-length norm: the longer the field, the weaker the relevance of a match in it.
- Field-length norm example: index: medicine, search request: City A amoxicillin capsules.
- doc1:
{"title":"About City A Health Pharmacy","content":"This pharmacy stocks erythromycin capsules, penicillin capsules... (10,000 characters)"}
- doc2:
{"title":"About City B Minsheng Pharmacy","content":"This pharmacy stocks amoxicillin capsules, erythromycin capsules, penicillin capsules... (10,000 characters)"}
- Conclusion: doc1 > doc2, because doc1 matches "City A" in the short title field, which outweighs doc2's match on "amoxicillin capsules" buried in a very long content field.
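- These three factors (term frequency, inverse document frequency and field length) are exactly what Elasticsearch 7's default similarity, BM25, combines; the explain output in the next section reports each piece. Per matching term, following the formulas printed there:

$$\text{score}(t,d) = \text{boost} \cdot \underbrace{\log\Bigl(1 + \tfrac{N - n + 0.5}{n + 0.5}\Bigr)}_{idf} \cdot \underbrace{\frac{freq}{freq + k_1 \bigl(1 - b + b \cdot \frac{dl}{avgdl}\bigr)}}_{tf}$$

where N is the number of documents with the field, n the number of documents containing the term, freq the term's occurrences in the document, dl the field length, avgdl the average field length, and k1 = 1.2, b = 0.75 by default. A document's final _score is the sum of its per-term scores.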
1.1.2 How is _score computed?
- The user's search keywords are analyzed into individual terms.
- For each term, a tf value and an idf value are computed against every matching document.
- The tf/idf values of all terms are combined by the scoring formula into a total score for each document.
- Example:
- Enable the explain execution plan:
GET /book/_search?explain=true
{
"query": {
"match": {
"description": "java程序员"
}
}
}
- Response:
{
"took" : 655,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 3,
"relation" : "eq"
},
"max_score" : 2.2765667,
"hits" : [
{
"_shard" : "[book][0]",
"_node" : "MTYvUxUWRSqSE2GLsX8_zg",
"_index" : "book",
"_type" : "_doc",
"_id" : "3",
"_score" : 2.2765667,
"_source" : {
"name" : "spring开发基础",
"description" : "spring 在java领域非常流行,java程序员都在用。",
"studymodel" : "201001",
"price" : 88.6,
"timestamp" : "2019-08-24 19:11:35",
"pic" : "group1/M00/00/00/wKhlQFs6RCeAY0pHAAJx5ZjNDEM428.jpg",
"tags" : [
"spring",
"java"
]
},
"_explanation" : {
"value" : 2.2765667,
"description" : "sum of:",
"details" : [
{
"value" : 0.79664993,
"description" : "weight(description:java in 1) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.79664993,
"description" : "score(freq=2.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.47000363,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.7704485,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 2.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 16.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.18407845,
"description" : "weight(description:程 in 1) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.18407845,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.13353139,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 3,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.62660944,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 16.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.6479192,
"description" : "weight(description:序 in 1) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.6479192,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.47000363,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.62660944,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 16.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.6479192,
"description" : "weight(description:员 in 1) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.6479192,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.47000363,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.62660944,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 16.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
}
]
}
},
{
"_shard" : "[book][0]",
"_node" : "MTYvUxUWRSqSE2GLsX8_zg",
"_index" : "book",
"_type" : "_doc",
"_id" : "1",
"_score" : 0.94958127,
"_source" : {
"name" : "Bootstrap开发",
"description" : "Bootstrap是由Twitter推出的一个前台页面开发css框架,是一个非常流行的开发框架,此框架集成了多种页面效果。此开发框架包含了大量的CSS、JS程序代码,可以帮助开发者(尤其是不擅长css页面开发的程序人员)轻松的实现一个css,不受浏览器限制的精美界面css效果。",
"studymodel" : "201002",
"price" : 38.6,
"timestamp" : "2019-08-25 19:11:35",
"pic" : "group1/M00/00/00/wKhlQFs6RCeAY0pHAAJx5ZjNDEM428.jpg",
"tags" : [
"bootstrap",
"dev"
]
},
"_explanation" : {
"value" : 0.94958127,
"description" : "sum of:",
"details" : [
{
"value" : 0.13911864,
"description" : "weight(description:程 in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.13911864,
"description" : "score(freq=2.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.13353139,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 3,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.47356468,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 2.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 104.0,
"description" : "dl, length of field (approximate)",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.48966968,
"description" : "weight(description:序 in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.48966968,
"description" : "score(freq=2.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.47000363,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.47356468,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 2.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 104.0,
"description" : "dl, length of field (approximate)",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.3207929,
"description" : "weight(description:员 in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.3207929,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.47000363,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.31024224,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 104.0,
"description" : "dl, length of field (approximate)",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
}
]
}
},
{
"_shard" : "[book][0]",
"_node" : "MTYvUxUWRSqSE2GLsX8_zg",
"_index" : "book",
"_type" : "_doc",
"_id" : "2",
"_score" : 0.7534219,
"_source" : {
"name" : "java编程思想",
"description" : "java语言是世界第一编程语言,在软件开发领域使用人数最多。",
"studymodel" : "201001",
"price" : 68.6,
"timestamp" : "2019-08-25 19:11:35",
"pic" : "group1/M00/00/00/wKhlQFs6RCeAY0pHAAJx5ZjNDEM428.jpg",
"tags" : [
"java",
"dev"
]
},
"_explanation" : {
"value" : 0.7534219,
"description" : "sum of:",
"details" : [
{
"value" : 0.5867282,
"description" : "weight(description:java in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.5867282,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.47000363,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 2,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.567431,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 25.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.16669367,
"description" : "weight(description:程 in 0) [PerFieldSimilarity], result of:",
"details" : [
{
"value" : 0.16669367,
"description" : "score(freq=1.0), computed as boost * idf * tf from:",
"details" : [
{
"value" : 2.2,
"description" : "boost",
"details" : [ ]
},
{
"value" : 0.13353139,
"description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
"details" : [
{
"value" : 3,
"description" : "n, number of documents containing term",
"details" : [ ]
},
{
"value" : 3,
"description" : "N, total number of documents with field",
"details" : [ ]
}
]
},
{
"value" : 0.567431,
"description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
"details" : [
{
"value" : 1.0,
"description" : "freq, occurrences of term within document",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "k1, term saturation parameter",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "b, length normalization parameter",
"details" : [ ]
},
{
"value" : 25.0,
"description" : "dl, length of field",
"details" : [ ]
},
{
"value" : 48.666668,
"description" : "avgdl, average length of field",
"details" : [ ]
}
]
}
]
}
]
}
]
}
}
]
}
}
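- Reading the explain output for document 3: for the term java, idf = log(1 + (3 - 2 + 0.5) / (2 + 0.5)) = log(1.6) ≈ 0.47000363 and tf = 2.0 / (2.0 + 1.2 × (1 - 0.75 + 0.75 × 16.0 / 48.666668)) ≈ 0.7704485, so its weight is 2.2 × 0.47000363 × 0.7704485 ≈ 0.79664993. The document's total _score is the sum of its four term weights: 0.79664993 + 0.18407845 + 0.6479192 + 0.6479192 ≈ 2.2765667.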
1.1.3 Checking in production whether a specific document matches
- Example:
GET /book/_explain/3
{
"query": {
"match": {
"description": "java程序员"
}
}
}
1.2 Doc values
1.2.1 Overview
- Searching relies on the inverted index; sorting relies on the forward index, which exposes every field of every document so they can be sorted. That forward index is what doc values are.
- When a document is indexed, ES builds the inverted index for searching and, at the same time, the forward index, i.e. doc values, for sorting, aggregations, filtering and similar operations.
- Doc values are stored on disk. If memory is sufficient, ES automatically caches them in memory and performance is very high; if memory is insufficient, they are served from disk by the OS.
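- Doc values are enabled by default for field types that support them (keyword, numeric, date, etc.). If a field will never be sorted, aggregated or filtered on, they can be turned off per field to save disk space; a minimal sketch (the index and field names here are illustrative):
PUT /my_index
{
  "mappings": {
    "properties": {
      "session_id": {
        "type": "keyword",
        "doc_values": false
      }
    }
  }
}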
1.2.2 Inverted index example
- Example:
- doc1:
hello world you and me
- doc2:
hi, world, how are you

| term | doc1 | doc2 |
| --- | --- | --- |
| hello | ✓ | |
| world | ✓ | ✓ |
| you | ✓ | ✓ |
| and | ✓ | |
| me | ✓ | |
| hi | | ✓ |
| how | | ✓ |
| are | | ✓ |

- Search steps:
- ① Searching for "hello you" analyzes the text into the terms hello and you.
- ② hello matches doc1.
- ③ you matches doc1 and doc2.
- This works fine for searching, but sorting on the inverted index alone runs into problems.
1.2.3 Forward index (doc values) example
- Example:
- doc1:
{ "name": "jack", "age": 27 }
- doc2:
{ "name": "tom", "age": 30 }

| document | name | age |
| --- | --- | --- |
| doc1 | jack | 27 |
| doc2 | tom | 30 |

- The forward index stores field values at indexing time much like columns in a database table, without analyzing them, which is why it can be used for sorting, aggregations, filtering and so on.
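- For example, a sort on age is answered from doc values rather than the inverted index. A minimal sketch against a hypothetical index holding the two documents above (the index name my_index is assumed):
GET /my_index/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    { "age": { "order": "asc" } }
  ]
}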
1.3 The Ultimate Solution for ES Pagination
1.3.1 Background
- Suppose there are 3 primary shards, P0, P1 and P2, each holding 100 million documents, and a user queries for documents whose name is "java", starting at index 9999 with a page size of 10 (see the request sketch after this list).
- ES handles this in two phases:
- query phase: each shard sends only lightweight information (id, score, etc.) for its top 10000 hits to the coordinating node, which sorts them and picks the ids and scores of the requested number of documents.
- fetch phase: the coordinating node issues an mget-style request to the shards for those ids, retrieves the full documents, and returns them to the client.
- The reason ES does not simply pull the full top 10000 documents to the coordinating node in one go: if each document is large, the coordinating node could be overwhelmed and crash.
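- A request matching that scenario might look like the following (the index and field names are assumed for illustration):
GET /my_index/_search
{
  "from": 9999,
  "size": 10,
  "query": {
    "match": {
      "name": "java"
    }
  }
}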
1.3.2 query phase
- query phase workflow:
- The search request reaches a coordinate node, which builds a priority queue whose length is determined by the paging parameters from and size (10 by default).
- The coordinate node forwards the request to all shards; each shard searches locally and builds its own priority queue.
- Each shard returns its priority queue to the coordinate node, which merges them into a global priority queue.
- How replica shards raise search throughput: a request only needs to hit one copy (primary or replica) of each shard, so when every shard has several replicas, concurrent search requests can be spread across the other replicas.
1.3.3 fetch phase
- fetch phase workflow:
- Once the coordinate node has built the priority queue, it sends mget requests to the shards to fetch the corresponding documents.
- Each shard returns its documents to the coordinate node.
- The coordinate node merges the documents and returns the result to the client.
- A plain search without from and size returns the top 10 documents by default, sorted by _score.
1.4 Search Parameter Summary
1.4.1 preference
- The bouncing results problem: two documents have the same value in the sort field, but different shard copies may order them differently; if successive requests round-robin across different replica shards, the ordering the user sees changes from page load to page load.
- Solution: set preference to a fixed string, such as the user_id, so that each user's searches always run on the same replica shard and bouncing results disappear (see the sketch after the example below).
- The preference parameter controls which shards are used to execute the search. Possible values include _primary, _primary_first, _local, _only_node:xyz, _prefer_node:xyz and _shards:2,3.
GET /_search?preference=_shards:2,3
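- For example, pinning all of one user's searches to the same shard copies (the user identifier and query field are illustrative):
GET /book/_search?preference=user_123
{
  "query": {
    "match": {
      "name": "java"
    }
  }
}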
1.4.2 timeout
- Limits how long the query may run; whatever data has been collected within the time limit is returned directly, avoiding overly long queries.
GET /_search?timeout=10ms
1.4.3 routing
- Document routing defaults to routing by _id; a custom routing value can be set so that related documents, and the searches for them, go to a single shard (an index-time sketch follows the search example below).
GET /_search?routing=user123
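- The same routing value can also be supplied when indexing, so that a document and the searches for it land on the same shard (the index name, document id and body here are illustrative):
PUT /my_index/_doc/1?routing=user123
{
  "name": "java"
}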
2 Getting Started with Aggregations
2.1 Aggregation Queries
2.1.1 Group-by queries
- SQL version:
SELECT studymodel,count(*) as `count`
FROM book
group by studymodel;
- Query DSL version:
GET /book/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"count": {
"terms": {
"field": "studymodel"
}
}
}
}
2.1.2 Terms group-by query
- Preparation: enable fielddata on the tags text field so it can be used in a terms aggregation:
PUT /book/_mapping
{
"properties": {
"tags": {
"type": "text",
"fielddata": true
}
}
}
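- Note: enabling fielddata loads the text field's terms into heap memory. Where the mapping can still be adjusted, a common alternative is to add a keyword sub-field and aggregate on that instead; a sketch of such a mapping (shown only as an alternative, the rest of this section keeps using fielddata on tags):
PUT /book/_mapping
{
  "properties": {
    "tags": {
      "type": "text",
      "fields": {
        "keyword": {
          "type": "keyword"
        }
      }
    }
  }
}
With that mapping, the aggregations below would target tags.keyword rather than tags.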
- Example: group by tags
GET /book/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"count": {
"terms": {
"field": "tags"
}
}
}
}
2.1.3 Terms group-by with a search condition
- Example: group by tags, restricted to documents whose name contains java
GET /book/_search
{
"size": 0,
"query": {
"match": {
"name": "java"
}
},
"aggs": {
"count": {
"terms": {
"field": "tags"
}
}
}
}
2.1.4 Multi-field aggregation
- Example: group by tags, then compute the average price within each tag
GET /book/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_by_tags": {
"terms": {
"field": "tags"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
2.1.5 Sorting a multi-field aggregation
- Example: group by tags, compute the average price within each tag, and order the buckets by average price descending
GET /book/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_by_tags": {
"terms": {
"field": "tags",
"order": {
"avg_price": "desc"
}
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
2.1.6 Grouping by specified price ranges
- Example:
GET /book/_search
{
"size": 0,
"aggs": {
"group_by_price": {
"range": {
"field": "price",
"ranges": [
{
"from": 0,
"to": 40
},
{
"from": 40,
"to": 60
},
{
"from": 60,
"to": 80
}
]
}
}
}
}
2.2 Core Concepts: bucket and metric
2.2.1 bucket: a group of data

| city | name |
| --- | --- |
| Beijing | Zhang San |
| Beijing | Li Si |
| Tianjin | Wang Wu |
| Tianjin | Zhao Liu |
| Tianjin | Tian Qi |

- Grouping by city is bucketing: select city,count(*) from table group by city.
- This yields two buckets: a Beijing bucket and a Tianjin bucket.
- Beijing bucket: contains 2 people, Zhang San and Li Si.
- Tianjin bucket: contains 3 people, Wang Wu, Zhao Liu and Tian Qi.
2.2.2 metric: a statistic computed over a bucket
- A metric is an aggregate operation applied to a bucket, such as an average, a maximum or a minimum: select city,count(*) from table group by city;
- bucket: group by city means rows with the same city are placed in the same bucket.
- metric: count(*) is the statistic computed over all the rows in each bucket (an ES equivalent is sketched below).
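- An Elasticsearch equivalent of that SQL, assuming a hypothetical index whose city field is mapped as keyword; the terms aggregation builds the buckets, and each bucket's doc_count plays the role of count(*):
GET /people/_search
{
  "size": 0,
  "aggs": {
    "group_by_city": {
      "terms": {
        "field": "city"
      }
    }
  }
}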
3 TV Sales Case Study
3.1 Preparation
- Example: create the index and mapping
PUT /tvs
{
"mappings": {
"properties": {
"price": {
"type": "long"
},
"color": {
"type": "keyword"
},
"brand": {
"type": "keyword"
},
"sold_date": {
"type": "date"
}
}
}
}
- Example: insert data
PUT /tvs/_bulk
{ "index": {}}
{ "price" : 1000, "color" : "红色", "brand" : "长虹", "sold_date" : "2019-10-28" }
{ "index": {}}
{ "price" : 2000, "color" : "红色", "brand" : "长虹", "sold_date" : "2019-11-05" }
{ "index": {}}
{ "price" : 3000, "color" : "绿色", "brand" : "小米", "sold_date" : "2019-05-18" }
{ "index": {}}
{ "price" : 1500, "color" : "蓝色", "brand" : "TCL", "sold_date" : "2019-07-02" }
{ "index": {}}
{ "price" : 1200, "color" : "绿色", "brand" : "TCL", "sold_date" : "2019-08-19" }
{ "index": {}}
{ "price" : 2000, "color" : "红色", "brand" : "长虹", "sold_date" : "2019-11-05" }
{ "index": {}}
{ "price" : 8000, "color" : "红色", "brand" : "三星", "sold_date" : "2020-01-01" }
{ "index": {}}
{ "price" : 2500, "color" : "蓝色", "brand" : "小米", "sold_date" : "2020-02-12" }
3.2 Which color of TV sells best
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"popular_color": {
"terms": {
"field": "color"
}
}
}
}
3.3 Average TV price per color
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"color": {
"terms": {
"field": "color"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
3.4 Average price per color, and per brand within each color
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"avg_color_price": {
"avg": {
"field": "price"
}
},
"group_by_brand": {
"terms": {
"field": "brand"
},
"aggs": {
"avg_brand_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
}
}
3.5 Per-color sales count, average price, maximum price, minimum price and total price
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
},
"max_price":{
"max": {
"field": "price"
}
},
"min_price":{
"min": {
"field": "price"
}
},
"sum_price":{
"sum": {
"field": "price"
}
}
}
}
}
}
3.6 Range bucketing with histogram
- Example: bucket prices into intervals of 20 and compute the sales total per interval
GET /tvs/_search
{
"size": 0,
"aggs": {
"price":{
"histogram": {
"field": "price",
"interval": 20
},
"aggs": {
"sum_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
3.7 Date histogram aggregation
- Example: number of sales per month
GET /tvs/_search
{
"size": 0,
"aggs": {
"sales": {
"date_histogram": {
"field": "sold_date",
"interval": "month",
"format": "yyyy-MM-dd",
"min_doc_count": 0,
"extended_bounds": {
"min": "2019-01-01",
"max": "2019-12-31"
}
}
}
}
}
3.8 Sales total per brand per quarter, plus the overall total per quarter
- Example:
GET /tvs/_search
{
"size": 0,
"aggs": {
"group_by_sold_date": {
"date_histogram": {
"field": "sold_date",
"interval": "quarter",
"format": "yyyy-MM-dd",
"min_doc_count": 0,
"extended_bounds": {
"min": "2019-01-01",
"max": "2019-12-31"
}
},
"aggs": {
"group_by_brand": {
"terms": {
"field": "brand"
},
"aggs": {
"sum_price": {
"sum": {
"field": "price"
}
}
}
},
"total_price":{
"sum": {
"field": "price"
}
}
}
}
}
}
3.9 Sales by color for a single brand
- Example: sales of each color for the 小米 (Xiaomi) brand
GET /tvs/_search
{
"size": 0,
"query": {
"term": {
"brand": {
"value": "小米"
}
}
},
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
}
}
}
}
3.10 Comparing one brand's average price with the average across all brands
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"term": {
"brand": {
"value": "小米"
}
}
},
"aggs": {
"single_brand_avg_price": {
"avg": {
"field": "price"
}
},
"all": {
"global": {},
"aggs": {
"all_brand_avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
3.11 Average price of TVs priced at 1200 or above
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"constant_score": {
"filter": {
"range": {
"price": {
"gte": 1200
}
}
}
}
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
3.12 Average price of a single brand over recent periods
- Example:
GET /tvs/_search
{
"size": 0,
"query": {
"term": {
"brand": {
"value": "小米"
}
}
},
"aggs": {
"recent_150d": {
"filter": {
"range": {
"sold_date": {
"gte": "now-150d"
}
}
},
"aggs": {
"recent_150d_avg_price": {
"avg": {
"field": "price"
}
}
}
},
"recent_140d": {
"filter": {
"range": {
"sold_date": {
"gte": "now-140d"
}
}
},
"aggs": {
"recent_140d_avg_price": {
"avg": {
"field": "price"
}
}
}
},
"recent_130d": {
"filter": {
"range": {
"sold_date": {
"gte": "now-130d"
}
}
},
"aggs": {
"recent_130d_avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
4 Implementing Aggregations with the Java API
4.1 Group by color and count the number sold per color
- Example: Query DSL
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group by color": {
"terms": {
"field": "color"
}
}
}
}
- Example: Java implementation
package com.sunxaiping.elk;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import java.io.IOException;
import java.util.List;
/**
* Query DSL
*/
@SpringBootTest
public class ElkApplicationTests {
@Autowired
private RestHighLevelClient client;
/**
*
<pre>
* GET /tvs/_search
* {
* "size": 0,
* "query": {
* "match_all": {}
* },
* "aggs": {
* "group_by_color": {
* "terms": {
* "field": "color"
* }
* }
* }
* }
* </pre>
*/
@Test
public void test() throws IOException {
// Build the search request
SearchRequest searchRequest = new SearchRequest("tvs");
// Build the request body
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.size(0);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.aggregation(AggregationBuilders.terms("group_by_color").field("color"));
searchRequest.source(searchSourceBuilder);
// Execute the request
SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
// Parse the results
Aggregations aggregations = response.getAggregations();
Terms terms = aggregations.get("group_by_color");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
String key = bucket.getKeyAsString();
System.out.println("key = " + key);
long docCount = bucket.getDocCount();
System.out.println("docCount = " + docCount);
}
}
}
4.2 Group by color, count the number sold per color and compute each color's average selling price
- Example: Query DSL
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
}
}
}
}
}
- Example: Java implementation
package com.sunxaiping.elk;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.metrics.Avg;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import java.io.IOException;
import java.util.List;
/**
* Query DSL
*/
@SpringBootTest
public class ElkApplicationTests {
@Autowired
private RestHighLevelClient client;
/**
*
<pre>
* GET /tvs/_search
* {
* "size": 0,
* "query": {
* "match_all": {}
* },
* "aggs": {
* "group_by_color": {
* "terms": {
* "field": "color"
* },"aggs": {
* "avg_price": {
* "avg": {
* "field": "price"
* }
* }
* }
* }
* }
* }
* </pre>
*/
@Test
public void test() throws IOException {
// Build the search request
SearchRequest searchRequest = new SearchRequest("tvs");
// Build the request body
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.size(0);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
searchSourceBuilder.aggregation(AggregationBuilders.terms("group_by_color").field("color").subAggregation(AggregationBuilders.avg("avg_price").field("price")));
searchRequest.source(searchSourceBuilder);
// Execute the request
SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
// Parse the results
Aggregations aggregations = response.getAggregations();
Terms terms = aggregations.get("group_by_color");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
String key = bucket.getKeyAsString();
System.out.println("key = " + key);
long docCount = bucket.getDocCount();
System.out.println("docCount = " + docCount);
Avg avgPrice = bucket.getAggregations().get("avg_price");
double value = avgPrice.getValue();
System.out.println("value = " + value);
}
}
}
4.3 Group by color and compute, for each color, the number sold and the average, maximum, minimum and total selling price
- Example: Query DSL
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"group_by_color": {
"terms": {
"field": "color"
},
"aggs": {
"avg_price": {
"avg": {
"field": "price"
}
},
"max_price": {
"max": {
"field": "price"
}
},
"min_price": {
"min": {
"field": "price"
}
},
"sum_price":{
"sum": {
"field": "price"
}
}
}
}
}
}
- Example: Java implementation
package com.sunxaiping.elk;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.Avg;
import org.elasticsearch.search.aggregations.metrics.Max;
import org.elasticsearch.search.aggregations.metrics.Min;
import org.elasticsearch.search.aggregations.metrics.Sum;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import java.io.IOException;
import java.util.List;
/**
* Query DSL
*/
@SpringBootTest
public class ElkApplicationTests {
@Autowired
private RestHighLevelClient client;
/**
*
<pre>
* GET /tvs/_search
* {
* "size": 0,
* "query": {
* "match_all": {}
* },
* "aggs": {
* "group_by_color": {
* "terms": {
* "field": "color"
* },
* "aggs": {
* "avg_price": {
* "avg": {
* "field": "price"
* }
* },
* "max_price": {
* "max": {
* "field": "price"
* }
* },
* "min_price": {
* "min": {
* "field": "price"
* }
* },
* "sum_price":{
* "sum": {
* "field": "price"
* }
* }
* }
* }
* }
* }
* </pre>
*/
@Test
public void test() throws IOException {
// Build the search request
SearchRequest searchRequest = new SearchRequest("tvs");
// Build the request body
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.size(0);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
TermsAggregationBuilder termsAggregationBuilder = AggregationBuilders.terms("group_by_color").field("color");
termsAggregationBuilder.subAggregation(AggregationBuilders.avg("avg_price").field("price"));
termsAggregationBuilder.subAggregation(AggregationBuilders.max("max_price").field("price"));
termsAggregationBuilder.subAggregation(AggregationBuilders.min("min_price").field("price"));
termsAggregationBuilder.subAggregation(AggregationBuilders.sum("sum_price").field("price"));
searchSourceBuilder.aggregation(termsAggregationBuilder);
searchRequest.source(searchSourceBuilder);
// Execute the request
SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
// Parse the results
Aggregations aggregations = response.getAggregations();
Terms terms = aggregations.get("group_by_color");
List<? extends Terms.Bucket> buckets = terms.getBuckets();
for (Terms.Bucket bucket : buckets) {
String key = bucket.getKeyAsString();
System.out.println("key = " + key);
long docCount = bucket.getDocCount();
System.out.println("docCount = " + docCount);
Avg avgPrice = bucket.getAggregations().get("avg_price");
System.out.println("avgPrice = " + avgPrice.getValue());
Max maxPrice = bucket.getAggregations().get("max_price");
System.out.println("maxPrice = " + maxPrice.getValue());
Min minPrice = bucket.getAggregations().get("min_price");
System.out.println("minPrice = " + minPrice.getValue());
Sum sumPrice = bucket.getAggregations().get("sum_price");
System.out.println("sumPrice = " + sumPrice.getValue());
}
}
}
4.4 Bucket prices into ranges of 2000 and compute the sales total per range
- Example: Query DSL
GET /tvs/_search
{
"size": 0,
"query": {
"match_all": {}
},
"aggs": {
"price": {
"histogram": {
"field": "price",
"interval": 2000
},
"aggs": {
"sum_price": {
"sum": {
"field": "price"
}
}
}
}
}
}
- Example: Java implementation
package com.sunxaiping.elk;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.Aggregations;
import org.elasticsearch.search.aggregations.bucket.histogram.Histogram;
import org.elasticsearch.search.aggregations.bucket.histogram.HistogramAggregationBuilder;
import org.elasticsearch.search.aggregations.metrics.Sum;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import java.io.IOException;
import java.util.List;
/**
* Query DSL
*/
@SpringBootTest
public class ElkApplicationTests {
@Autowired
private RestHighLevelClient client;
/**
*
<pre>
* GET /tvs/_search
* {
* "size": 0,
* "query": {
* "match_all": {}
* },
* "aggs": {
* "price": {
* "histogram": {
* "field": "price",
* "interval": 2000
* },
* "aggs": {
* "sum_price": {
* "sum": {
* "field": "price"
* }
* }
* }
* }
* }
* }
* </pre>
*/
@Test
public void test() throws IOException {
// Build the search request
SearchRequest searchRequest = new SearchRequest("tvs");
// Build the request body
SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
searchSourceBuilder.size(0);
searchSourceBuilder.query(QueryBuilders.matchAllQuery());
HistogramAggregationBuilder histogramAggregationBuilder = AggregationBuilders.histogram("price").field("price").interval(2000);
histogramAggregationBuilder.subAggregation(AggregationBuilders.sum("sum_price").field("price"));
searchSourceBuilder.aggregation(histogramAggregationBuilder);
searchRequest.source(searchSourceBuilder);
// Execute the request
SearchResponse response = client.search(searchRequest, RequestOptions.DEFAULT);
// Parse the results
Aggregations aggregations = response.getAggregations();
Histogram histogram = aggregations.get("price");
List<? extends Histogram.Bucket> buckets = histogram.getBuckets();
for (Histogram.Bucket bucket : buckets) {
long docCount = bucket.getDocCount();
System.out.println("docCount = " + docCount);
Sum sumPrice = bucket.getAggregations().get("sum_price");
System.out.println("sumPrice = " + sumPrice.getValue());
}
}
}
5 ES7 New Feature: SQL
5.1 Quick start
- Example:
POST /_sql?format=txt
{
"query": "SELECT * FROM tvs"
}
5.2 Ways to retrieve results
- HTTP requests.
- The command-line client: elasticsearch-sql-cli.bat.
- Code, via the JDBC-style driver.
5.3 Response data formats
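- The format URL parameter selects the response format. Besides the txt table shown above, the SQL endpoint also accepts formats such as csv, tsv, json and yaml, for example:
POST /_sql?format=json
{
  "query": "SELECT * FROM tvs"
}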
5.4 SQL translation
- Example:
- Translate a SQL statement into the equivalent Query DSL:
POST /_sql/translate
{
"query": "SELECT * FROM tvs"
}
- Response:
{
"size" : 1000,
"_source" : {
"includes" : [
"price"
],
"excludes" : [ ]
},
"docvalue_fields" : [
{
"field" : "brand"
},
{
"field" : "color"
},
{
"field" : "sold_date",
"format" : "epoch_millis"
}
],
"sort" : [
{
"_doc" : {
"order" : "asc"
}
}
]
}
5.5 Combining SQL with other DSL
- Example:
POST /_sql?format=txt
{
"query": "SELECT * FROM tvs",
"filter":{
"range": {
"price": {
"gte": 1000,
"lte": 2000
}
}
}
}
5.6 Using SQL from Java Code
5.6.1 Prerequisite
- The ES Platinum-level features must be enabled.
Large enterprises can purchase the Platinum license, which adds machine learning and the advanced x-pack security features.
5.6.2 Preparation
- Add the dependency:
<dependency>
<groupId>org.elasticsearch.plugin</groupId>
<artifactId>x-pack-sql-jdbc</artifactId>
<version>7.10.0</version>
</dependency>
5.6.3 Usage example
- Example:
package com.sunxaiping.elk;
import org.junit.jupiter.api.Test;
import org.springframework.boot.test.context.SpringBootTest;
import java.sql.*;
@SpringBootTest
public class ElkApplicationTests {
@Test
public void test() throws Exception {
Class.forName("org.elasticsearch.xpack.sql.jdbc.EsDriver");
Connection connection = DriverManager.getConnection("jdbc:es://http://localhost:9200");
Statement statement = connection.createStatement();
ResultSet resultSet = statement.executeQuery("SELECT brand,color,price,sold_date FROM tvs");
while (resultSet.next()) {
System.out.println(resultSet.getString(1));
System.out.println(resultSet.getString(2));
System.out.println(resultSet.getDouble(3));
System.out.println(resultSet.getDate(4));
}
}
}