Keep Words Token Filter(保留字过滤器)
原文链接 : https://www.elastic.co/guide/en/elasticsearch/reference/5.3/getting-started.html
译文链接 : http://www.apache.wiki/pages/viewpage.action?pageId=10028810
简述
当词元过滤器中的type为keep时,表示只保留具有预定义单词集中的文本的token。可以在设置中定义一组单词,或者从包含每行一个单词的文本文件加载。
参数
| keep_words | 要保留的单词列表 |
| keep_words_path | 一个文字文件的路径 |
| keep_words_case | 一个布尔值,表示是否小写单词(默认为false ) |
示例
PUT /keep_words_example{"settings" : {"analysis" : {"analyzer" : {"example_1" : {"tokenizer" : "standard","filter" : ["standard", "lowercase", "words_till_three"]},"example_2" : {"tokenizer" : "standard","filter" : ["standard", "lowercase", "words_in_file"]}},"filter" : {"words_till_three" : {"type" : "keep","keep_words" : [ "one", "two", "three"]},"words_in_file" : {"type" : "keep","keep_words_path" : "analysis/example_word_list.txt"}}}}}
