ASCII Folding Token Filter

Original: https://www.elastic.co/guide/en/elasticsearch/reference/current/analysis-asciifolding-tokenfilter.html

Translation: http://www.apache.wiki/pages/viewpage.action?pageId=10027030

Contributors: fucker, ApacheCN, Apache中文网

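A token filter of type asciifolding converts alphabetic, numeric, and symbolic Unicode characters that are not in the first 127 ASCII characters (the "Basic Latin" Unicode block) into their ASCII equivalents, if one exists.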

For example:

    "index" : {
        "analysis" : {
            "analyzer" : {
                "default" : {
                    "tokenizer" : "standard",
                    "filter" : ["standard", "asciifolding"]
                }
            }
        }
    }
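
To make the folding step concrete, here is a minimal Python sketch of the idea. It is only an approximation: the function name fold_to_ascii is hypothetical, and the sketch assumes that Unicode NFKD decomposition followed by removal of combining marks is an acceptable stand-in for the filter's own character mapping table, which it does not reproduce.

```python
import unicodedata


def fold_to_ascii(token: str) -> str:
    """Approximate ASCII folding (assumption: NFKD decomposition stands in
    for the real filter's mapping table)."""
    # Compose first so that "e" + combining accent is treated as one character.
    token = unicodedata.normalize("NFC", token)
    folded = []
    for ch in token:
        if ord(ch) < 128:
            # Already inside the Basic Latin block: keep as-is.
            folded.append(ch)
            continue
        # Decompose the character and drop combining marks (accents etc.).
        decomposed = unicodedata.normalize("NFKD", ch)
        candidate = "".join(c for c in decomposed if not unicodedata.combining(c))
        if candidate and candidate.isascii():
            # An ASCII equivalent exists: use it.
            folded.append(candidate)
        else:
            # No ASCII equivalent found by this sketch: keep the original.
            folded.append(ch)
    return "".join(folded)


if __name__ == "__main__":
    print(fold_to_ascii("caf\u00e9"))          # cafe
    print(fold_to_ascii("\u00c5ngstr\u00f6m"))  # Angstrom
```

Characters that have no decomposition, such as ß, pass through unchanged in this sketch; the actual filter relies on its own mapping and may fold them differently.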

It accepts a preserve_original setting, which defaults to false. If set to true, the filter keeps the original token in addition to emitting the folded token.

For example:

    "index" : {
        "analysis" : {
            "analyzer" : {
                "default" : {
                    "tokenizer" : "standard",
                    "filter" : ["standard", "my_ascii_folding"]
                }
            },
            "filter" : {
                "my_ascii_folding" : {
                    "type" : "asciifolding",
                    "preserve_original" : true
                }
            }
        }
    }
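
The effect of preserve_original on the token stream can be illustrated with a small Python sketch. The names fold and ascii_folding_stream are hypothetical, the folding itself is a rough stand-in based on Unicode decomposition, and the choices to emit the folded term before the original and to skip the duplicate when folding changes nothing are assumptions of this sketch rather than documented behavior.

```python
import unicodedata


def fold(token):
    # Rough stand-in for ASCII folding (assumption): decompose with NFKD
    # and drop combining marks such as accents.
    decomposed = unicodedata.normalize("NFKD", token)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def ascii_folding_stream(tokens, preserve_original=False):
    """Yield (position, term) pairs for a list of input tokens."""
    for position, token in enumerate(tokens):
        folded = fold(token)
        # The folded term is always emitted.
        yield position, folded
        # With preserve_original, the unfolded term is emitted as well,
        # at the same position, when folding actually changed it.
        if preserve_original and folded != token:
            yield position, token


if __name__ == "__main__":
    tokens = ["caf\u00e9", "bar"]
    print(list(ascii_folding_stream(tokens)))
    print(list(ascii_folding_stream(tokens, preserve_original=True)))
```

Keeping both terms at the same position means a query can match either the accented or the unaccented spelling, at the cost of a larger index.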