Data types - Relationship types - Elasticsearch

Relationships
join vs. nested
nested
join (parent-child documents)
Using the nested type
- Notes
- Example

Relationships

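Four approaches are commonly used to manage relational data in Elasticsearch: application-side joins, data denormalization, nested objects, and parent-child (join) documents. The notes below cover the last two.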

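join vs. nested

With the join type, parent and child documents can be updated independently, which suits cases where child documents change frequently;
The join type has to maintain the join relation, and has_child and has_parent queries perform poorly and add noticeably to query time;
Sorting child documents by a parent-document field is poorly supported with the join type.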

nested

join (parent-child documents)

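Notes
The only case where the join type makes sense is data with a one-to-many relationship in which one entity significantly outnumbers the other; multiple levels are possible;
Child and parent documents are stored in the same index, so an ordinary query returns all documents (parents and children alike);
A parent and its children must be indexed on the same shard, which is done with the routing parameter;
Sorting child documents by a parent field is difficult; only numeric parent fields can be used for sorting;
Use parent-child relations sparingly, only when children far outnumber parents;
Avoid using several parent-child join clauses in a single query;
In has_child queries, use filter context or set score_mode to none to avoid computing document scores;
Keep parent document IDs as short as possible so that they compress better in doc values and take less memory when loaded transiently.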
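
The routing requirement above can be sketched in Python. This is only an illustration: Elasticsearch uses its own hash function internally, so zlib.crc32 is a stand-in, and NUM_PRIMARY_SHARDS, shard_for and routing_for are hypothetical names. The point is that the shard depends only on the routing value, so a child that reuses its parent's routing value lands on the parent's shard.

```python
import zlib

NUM_PRIMARY_SHARDS = 5  # assumed shard count, for illustration only


def shard_for(routing, num_shards=NUM_PRIMARY_SHARDS):
    """Map a routing value to a shard number.

    crc32 is a stand-in for Elasticsearch's internal hash (an assumption);
    what matters is that the result depends on the routing value alone.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def routing_for(doc_id, parent_id=None):
    """A parent routes by its own ID, a child by its parent's ID."""
    return parent_id if parent_id is not None else doc_id


parent_shard = shard_for(routing_for("1"))
child_shard = shard_for(routing_for("11", parent_id="1"))
print(parent_shard == child_shard)
```

Any scheme works as long as a child is indexed with the same routing value as its parent.
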
Multi-generation documents

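The more join levels there are, the worse the performance;

Each generation of parent documents has to keep its string-typed _id field in memory, which can consume a lot of memory.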
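
For reference, a multi-level relation is declared by chaining entries in the join field's relations. The sketch below writes this as Python dictionaries and assumes a hypothetical third level named revision under version; the helper grandchild_request is illustrative and not part of any client library. A grandchild has to live on the same shard as its parent and grandparent, so it reuses the routing value of the top-level ancestor.

```python
import json

# "revision" is a hypothetical third level, added only for illustration.
join_field = {
    "type": "join",
    "relations": {
        "article": "version",   # article -> version
        "version": "revision",  # version -> revision (grandchild of article)
    },
}


def grandchild_request(doc_id, parent_id, root_routing):
    """Build the request path and join-field value for a grandchild document.

    parent_id is the direct parent (a version); root_routing is the routing
    value of the top-level article, so all three generations share a shard.
    """
    return {
        "path": f"my_test/_doc/{doc_id}?routing={root_routing}",
        "body": {"connection": {"name": "revision", "parent": parent_id}},
    }


request = grandchild_request("111", parent_id="11", root_routing="1")
print(request["path"])
print(json.dumps(request["body"]))
```
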

Example

Field mapping

PUT my_test
{
"mappings": {
  "_doc": {
    "dynamic": "strict",
    "properties": {
      "id": {
        "type": "keyword"
      },
      "ver": {
        "type": "keyword"
      },
      "title": {
        "type": "keyword"
      },
      "text": {
        "type": "keyword"
      },
      "reads": {
        "type": "integer"
      },
      "comments": {
        "type": "integer"
      },
      "article_created_at": {
        "type": "date",
        "format": "epoch_second"
      },
      "connection": {
        "type": "join", // join type
        "relations": {
          "article": "version" // relation: parent "article" -> child "version"
        }
      }
    }
  }
}
}

Test data

Parent documents

PUT my_test/_doc/1?routing=1&refresh
{
"text": "article a1",
"connection": {
  "name": "article"
},
"id": 1,
"article_created_at": 1583731094,
"reads": 132
}
PUT my_test/_doc/2?routing=1&refresh
{
"text": "article a2",
"connection": {
  "name": "article"
},
"id": 2,
"article_created_at": 1583731095,
"reads": 89
}
PUT my_test/_doc/3?routing=1&refresh
{
"text": "article a3",
"connection": {
  "name": "article"
},
"id": 3,
"article_created_at": 1583731096,
"reads": 32
}

Child documents

// Add a version to the article with parent document ID 1
PUT my_test/_doc/11?routing=1&refresh
{
"text": "ver v1 belong to a1",
"connection": {
  "name": "version",
  "parent": "1"
},
"ver": 111,
"comments": 100
}
// Add a version to the article with parent document ID 1
PUT my_test/_doc/12?routing=1&refresh
{
"text": "ver v2 belong to a1",
"connection": {
  "name": "version",
  "parent": "1"
},
"ver": 222,
"comments": 200
}
// Add a version to the article with parent document ID 2
PUT my_test/_doc/13?routing=1&refresh
{
"text": "ver vv1 belong to a2",
"connection": {
  "name": "version",
  "parent": "2"
},
"ver": 1111,
"comments": 200
}

Queries

Find the child documents of a given parent

GET my_test/_search
{
"query": {
  "parent_id": {
    "type": "version",
    "id": "1"
  }
}
}

This returns the matching child documents only, not the parent.

Score and sort by the parent's read count (returns child documents)

GET my_test/_search
{
"query": {
  "has_parent": {
    "parent_type": "article",
    "score": true, // propagate each parent's score to its children
    "query": {
      "function_score": {
        "script_score": {
          "script": "_score * doc['reads'].value"
        }
      }
    }
  }
}
}
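
The effect of this query on the test data can be approximated in plain Python. The sketch assumes that the inner query gives every parent a base _score of 1.0, so that each parent's final score equals its reads value; BASE_SCORE and the function names are hypothetical. With "score": true each child inherits its parent's score, and parents without children (article 3) contribute nothing to the result.

```python
# Test data from above: parent ID -> reads, child ID -> parent ID.
parents = {"1": {"reads": 132}, "2": {"reads": 89}, "3": {"reads": 32}}
children = {"11": "1", "12": "1", "13": "2"}

BASE_SCORE = 1.0  # assumed score of each parent before script_score runs


def parent_score(parent_id):
    """Mimic the script "_score * doc['reads'].value" for one parent."""
    return BASE_SCORE * parents[parent_id]["reads"]


def ranked_children():
    """Give each child its parent's score and sort by score, highest first."""
    scored = [(child_id, parent_score(pid)) for child_id, pid in children.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for child_id, score in ranked_children():
    print(child_id, score)
```
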

Find parents that have a child whose comments equals 200 (returns parent documents)

GET my_test/_search
{
"query": {
  "has_child": {
    "type": "version",
    "query": {
      "term": {
        "comments": 200
      }
    }
  }
}
}
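
Following the earlier note about avoiding score computation, the same has_child clause can be placed in filter context. Below is a minimal sketch that builds such a request body as a Python dictionary; the helper name has_child_filter is hypothetical, and the body would be sent to my_test/_search with whatever client is in use.

```python
import json


def has_child_filter(child_type, field, value):
    """Build a search body that runs has_child in filter context, unscored."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {
                        "has_child": {
                            "type": child_type,
                            "score_mode": "none",  # do not score matching children
                            "query": {"term": {field: value}},
                        }
                    }
                ]
            }
        }
    }


body = has_child_filter("version", "comments", 200)
print(json.dumps(body, indent=2))
```
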

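Using the nested type

Elasticsearch flattens the objects in an array when it stores and indexes them, so without the nested type a match can cross object boundaries. For example: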

{
  "questions": [
      {
          "title": "aaa",
          "comments": 10
      },
      {
          "title": "bbb",
          "comments": 13
      }
  ]
}

Without the nested type, a query for title = 'aaa' AND comments = 13 matches this document, even though no single object has both values.
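
The difference can be reproduced with a small Python model. This is a simplified sketch of the matching behavior, not of how Elasticsearch stores data internally; flatten, match_flattened and match_nested are hypothetical helper names.

```python
doc = {
    "questions": [
        {"title": "aaa", "comments": 10},
        {"title": "bbb", "comments": 13},
    ]
}


def flatten(objects):
    """Collapse an array of objects into one multi-valued list per key.

    The pairing between title and comments within one object is lost.
    """
    flat = {}
    for obj in objects:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def match_flattened(objects, **conditions):
    """Each condition is checked against the flattened values independently."""
    flat = flatten(objects)
    return all(value in flat.get(key, []) for key, value in conditions.items())


def match_nested(objects, **conditions):
    """All conditions must hold within a single object."""
    return any(
        all(obj.get(key) == value for key, value in conditions.items())
        for obj in objects
    )


print(flatten(doc["questions"]))
print(match_flattened(doc["questions"], title="aaa", comments=13))
print(match_nested(doc["questions"], title="aaa", comments=13))
```

The flattened check reports a match because 'aaa' and 13 each occur somewhere in the array, while the nested check rejects it because they never occur in the same object.
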

Notes

Nested objects are stored in the same document as their parent.

Example

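// questions must be mapped as nested, otherwise the nested query below fails
PUT my_test2
{
"mappings": {
  "_doc": {
    "properties": {
      "questions": {
        "type": "nested"
      }
    }
  }
}
}

PUT my_test2/_doc/1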
{
"title": "what kind of animals",
"text": "animals",
"questions": [
 {
   "content": "dog",
   "created_at": 1583742663
 },
 {
   "content": "cat",
   "created_at": 1583742664
 },
 {
   "content": "cock",
   "created_at": 1583742665
 },
 {
   "content": "bird",
   "created_at": 1583742666
 }
]
}

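Find documents that contain a nested object whose content is XX and whose creation time is AA (a single nested object must satisfy all conditions)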

GET my_test2/_search
{
"query": {
  "nested": {
    "path": "questions",
    "query": {
      "bool": {
        "must": [
          {
            "term": {
              "questions.content": {
                "value": "dog"
              }
            }
          },
          {
            "term": {
              "questions.created_at": {
                "value": "1583742663"
              }
            }
          }
        ]
      }
    }
  }
}
}

Reference:
Elasticsearch 6.3 official documentation
