Summary:
1. There are two ways to index a document: with a custom ID, using PUT /{index}/{type}/{id}, or without an ID, using POST /{index}/{type}/.
When no ID is supplied, Elasticsearch generates one automatically (a 20-character, URL-safe, Base64-encoded GUID string).
2. A fourth piece of metadata is introduced: _version.
Every document in Elasticsearch has a version number. Every time a change is made to a document (including deleting it), the _version number is incremented.
Original text:
Documents are indexed (stored and made searchable) by using the index API. But first, we need to decide where the document lives. As we just discussed, a document's _index, _type, and _id uniquely identify the document. We can either provide our own _id value or let the index API generate one for us.
Using Our Own ID
If your document has a natural identifier (for example, a user_account field or some other value that identifies the document), you should provide your own _id, using this form of the index API:
PUT /{index}/{type}/{id}
{
  "field": "value"
}
For example, if our index is called website, our type is called blog, and we choose the ID 123, then the index request looks like this:
PUT /website/blog/123
{
  "title": "My first blog entry",
  "text": "Just trying this out...",
  "date": "2014/01/01"
}
Elasticsearch responds as follows:
{
  "_index": "website",
  "_type": "blog",
  "_id": "123",
  "_version": 1,
  "created": true
}
The response indicates that the document has been successfully created. It includes the _index, _type, and _id metadata and a new element: _version.
Every document in Elasticsearch has a version number. Every time a change is made to a document (including deleting it), the _version number is incremented. In Dealing with Conflicts, we discuss how to use the _version number to ensure that one part of your application doesn't overwrite changes made by another part.
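To make the versioning rule concrete, here is a toy in-memory model in Python. It is only a sketch of the behaviour described above (every index or delete bumps _version), not how Elasticsearch is implemented; the class VersionedStore and its methods are hypothetical names invented for this illustration.

```python
class VersionedStore:
    """Toy model: every change to a document, including a delete, bumps its version."""

    def __init__(self):
        self._docs = {}      # doc id -> current body, or None once deleted
        self._versions = {}  # doc id -> latest version number

    def _bump(self, doc_id):
        # Versions start at 1 and grow by one on each change.
        self._versions[doc_id] = self._versions.get(doc_id, 0) + 1
        return self._versions[doc_id]

    def index(self, doc_id, body):
        # "created" is True when no live document existed under this id.
        created = self._docs.get(doc_id) is None
        self._docs[doc_id] = body
        return {"_id": doc_id, "_version": self._bump(doc_id), "created": created}

    def delete(self, doc_id):
        # A delete is also a change, so it increments the version too.
        self._docs[doc_id] = None
        return {"_id": doc_id, "_version": self._bump(doc_id)}


store = VersionedStore()
first = store.index("123", {"title": "My first blog entry"})
second = store.index("123", {"title": "My first blog entry, edited"})
removed = store.delete("123")
print(first, second, removed)
```

In this sketch the three calls report versions 1, 2, and 3 in turn, which is the pattern an application can rely on to detect that a document changed between a read and a write.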
Autogenerating IDs
If our data doesn't have a natural ID, we can let Elasticsearch autogenerate one for us. The structure of the request changes: instead of using the PUT verb ("store this document at this URL"), we use the POST verb ("store this document under this URL").
The URL now contains just the _index and the _type:
POST /website/blog/
{
  "title": "My second blog entry",
  "text": "Still trying this out...",
  "date": "2014/01/01"
}
The response is similar to what we saw before, except that the _id field has been generated for us:
{
  "_index": "website",
  "_type": "blog",
  "_id": "AVFgSgVHUP18jI2wRx0w",
  "_version": 1,
  "created": true
}
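The choice between the two request forms can be sketched as a small Python helper. This is only an illustration: build_index_request is a hypothetical function written for these notes, it does not belong to any Elasticsearch client, and it only builds the request pieces without sending anything over the network.

```python
import json


def build_index_request(index, doc_type, document, doc_id=None):
    """Return (method, path, body) for an index request; nothing is sent."""
    if doc_id is not None:
        # Our own ID: PUT stores the document at this exact URL.
        method, path = "PUT", f"/{index}/{doc_type}/{doc_id}"
    else:
        # No ID: POST stores the document under this URL,
        # leaving Elasticsearch to generate the _id.
        method, path = "POST", f"/{index}/{doc_type}/"
    return method, path, json.dumps(document)


entry = {
    "title": "My second blog entry",
    "text": "Still trying this out...",
    "date": "2014/01/01",
}
print(build_index_request("website", "blog", entry, doc_id=123))
print(build_index_request("website", "blog", entry))
```

The returned tuple could be handed to whatever HTTP client the application already uses; keeping the verb decision in one place avoids accidentally sending a POST when a specific ID was intended.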
Autogenerated IDs are 20-character, URL-safe, Base64-encoded GUID strings. These GUIDs are generated from a modified FlakeID scheme, which allows multiple nodes to generate unique IDs in parallel with essentially zero chance of collision.
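The shape of such an ID can be illustrated with a short Python sketch. This is not Elasticsearch's actual algorithm: the byte layout below (6 bytes of timestamp, 6 bytes of node identifier, 3 bytes of sequence counter) is an assumption chosen only because 15 bytes encode to exactly 20 URL-safe Base64 characters with no padding. The names make_id, _NODE_ID, and _SEQUENCE are hypothetical.

```python
import base64
import itertools
import os
import time

# Assumed layout for illustration only: timestamp + node id + sequence = 15 bytes.
_NODE_ID = os.urandom(6)       # stands in for a per-node identifier
_SEQUENCE = itertools.count()  # per-process counter to separate IDs made in the same millisecond


def make_id():
    """Build a 20-character, URL-safe, Base64-encoded identifier."""
    millis = int(time.time() * 1000) & ((1 << 48) - 1)  # keep the low 48 bits
    seq = next(_SEQUENCE) & 0xFFFFFF                     # keep the low 24 bits
    raw = millis.to_bytes(6, "big") + _NODE_ID + seq.to_bytes(3, "big")
    # 15 bytes -> 20 Base64 characters, so no "=" padding appears.
    return base64.urlsafe_b64encode(raw).decode("ascii")


print(make_id())
```

Because each node contributes its own identifier and its own counter, two nodes can produce IDs at the same moment without coordinating, which is the property the FlakeID-style scheme is meant to provide.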