normalizer(归一化)
原文链接 : https://www.elastic.co/guide/en/elasticsearch/reference/current/normalizer.html
译文链接 : normalizer(归一化)
keyword** fields(关键字字段) 的 normalizer(归一化)属性与 analyzer(分析器)类似,只不过它保证 analysis chain(分析链)生成单一的 token **(词元).
normalizer(归一化)应用于索引 keyword(关键字)之前,以及诸如在 [match](https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-match-query.html) query(匹配查询),查询解析器搜索 keyword** fields**(关键字字段)的时候.
curl -XPUT 'localhost:9200/index?pretty' -H 'Content-Type: application/json' -d'{"settings": {"analysis": {"normalizer": {"my_normalizer": {"type": "custom","char_filter": [],"filter": ["lowercase", "asciifolding"]}}}},"mappings": {"type": {"properties": {"foo": {"type": "keyword","normalizer": "my_normalizer"}}}}}'curl -XPUT 'localhost:9200/index/type/1?pretty' -H 'Content-Type: application/json' -d'{"foo": "BÀR"}'curl -XPUT 'localhost:9200/index/type/2?pretty' -H 'Content-Type: application/json' -d'{"foo": "bar"}'curl -XPUT 'localhost:9200/index/type/3?pretty' -H 'Content-Type: application/json' -d'{"foo": "baz"}'curl -XPOST 'localhost:9200/index/_refresh?pretty'curl -XGET 'localhost:9200/index/_search?pretty' -H 'Content-Type: application/json' -d'{"query": {"match": {"foo": "BAR"}}}'
上述查询与 documents(文档) 1 和 2 相匹配,这是因为在索引和查询的时候都将 BAR转换为了 bar.
{"took": $body.took,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 2,"max_score": 0.2876821,"hits": [{"_index": "index","_type": "type","_id": "2","_score": 0.2876821,"_source": {"foo": "bar"}},{"_index": "index","_type": "type","_id": "1","_score": 0.2876821,"_source": {"foo": "BÀR"}}]}}
此外,keywords 在索引之前转换意味着聚合返回 normalised values(归一化的值):
curl -XGET 'localhost:9200/index/_search?pretty' -H 'Content-Type: application/json' -d'{"size": 0,"aggs": {"foo_terms": {"terms": {"field": "foo"}}}}'
返回
{"took": 43,"timed_out": false,"_shards": {"total": 5,"successful": 5,"failed": 0},"hits": {"total": 3,"max_score": 0.0,"hits": []},"aggregations": {"foo_terms": {"doc_count_error_upper_bound": 0,"sum_other_doc_count": 0,"buckets": [{"key": "bar","doc_count": 2},{"key": "baz","doc_count": 1}]}}}
