Built-in component

This approach is fairly simple, but any change to the synonym dictionary requires restarting Elasticsearch.

1. Synonym dictionary

Prepare a synonym dictionary file with one group of synonyms per line, terms separated by commas, for example elasticsearch/config/analysis/syno.dic:

    西红柿,番茄,tomato
    马铃薯,土豆
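To make the file format concrete, here is a minimal Python sketch that reads lines in this format into a lookup table. It only models the comma-separated group form shown above; the function name `parse_synonym_lines` is made up for illustration and is not part of Elasticsearch.

```python
from typing import Dict, List, Set


def parse_synonym_lines(lines: List[str]) -> Dict[str, Set[str]]:
    """Map every term to the other terms in its comma-separated group."""
    table: Dict[str, Set[str]] = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        terms = [t.strip() for t in line.split(",") if t.strip()]
        for term in terms:
            # Each term is equivalent to every other term on the same line.
            table.setdefault(term, set()).update(t for t in terms if t != term)
    return table


groups = parse_synonym_lines(["西红柿,番茄,tomato", "马铃薯,土豆"])
print(sorted(groups["土豆"]))  # ['马铃薯']
```

Every term on a line is treated as interchangeable with the others, which is why a lookup for 土豆 returns 马铃薯 and vice versa.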

2. Configure the analyzer

    PUT /people_v3
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 0,
        "analysis": {
          "filter": {
            "my_synonym_filter": {
              "type": "synonym",
              "synonyms_path": "analysis/syno.dic"
            }
          },
          "analyzer": {
            "my_synonyms": {
              "tokenizer": "ik_max_word",
              "filter": [
                "lowercase",
                "my_synonym_filter"
              ]
            }
          }
        }
      }
    }
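If you would rather create the index from a script, the same request body can be built in Python. This is a rough sketch using only the standard library. The helper names `build_index_body` and `create_index` and the address `http://localhost:9200` are assumptions for illustration. Note that the `ik_max_word` tokenizer comes from the IK analysis plugin, so the plugin must be installed for the request to succeed.

```python
import json
import urllib.request


def build_index_body(synonyms_path: str) -> dict:
    """Return index settings equivalent to the PUT /people_v3 request above."""
    return {
        "settings": {
            "number_of_shards": 1,
            "number_of_replicas": 0,
            "analysis": {
                "filter": {
                    "my_synonym_filter": {
                        "type": "synonym",
                        # Resolved relative to the Elasticsearch config directory.
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    "my_synonyms": {
                        "tokenizer": "ik_max_word",
                        "filter": ["lowercase", "my_synonym_filter"],
                    }
                },
            },
        }
    }


def create_index(base_url: str, index: str, body: dict) -> str:
    """Send PUT <base_url>/<index>; base_url is an assumed cluster address."""
    req = urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


body = build_index_body("analysis/syno.dic")
print(json.dumps(body["settings"]["analysis"]["analyzer"], indent=2))
```

`create_index` is not called here because it needs a running cluster. With one available, `create_index("http://localhost:9200", "people_v3", body)` would send the request.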

Result (the request, followed by the response):

    GET people_v3/_analyze
    {
      "text": "我爱土豆",
      "analyzer": "my_synonyms"
    }

    {
      "tokens": [
        {
          "token": "我",
          "start_offset": 0,
          "end_offset": 1,
          "type": "CN_CHAR",
          "position": 0
        },
        {
          "token": "爱",
          "start_offset": 1,
          "end_offset": 2,
          "type": "CN_CHAR",
          "position": 1
        },
        {
          "token": "土豆",
          "start_offset": 2,
          "end_offset": 4,
          "type": "CN_WORD",
          "position": 2
        },
        {
          "token": "马铃薯",
          "start_offset": 2,
          "end_offset": 4,
          "type": "SYNONYM",
          "position": 2
        }
      ]
    }
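The point to notice in the response is that 马铃薯 is emitted at the same position and with the same offsets as 土豆, so a query for either term matches the same spot in the text. The toy Python model below reproduces that behaviour for single-token synonyms only. It is a simplification for illustration, not how the real filter is implemented, and `expand_synonyms` is a hypothetical name.

```python
from typing import Dict, List, Set


def expand_synonyms(tokens: List[dict], synonyms: Dict[str, Set[str]]) -> List[dict]:
    """Emit each token, then its synonyms at the same position and offsets."""
    out: List[dict] = []
    for tok in tokens:
        out.append(tok)
        for syn in sorted(synonyms.get(tok["token"], ())):
            # Copy position and offsets; only the text and type change.
            out.append({**tok, "token": syn, "type": "SYNONYM"})
    return out


tokens = [
    {"token": "我", "start_offset": 0, "end_offset": 1, "type": "CN_CHAR", "position": 0},
    {"token": "爱", "start_offset": 1, "end_offset": 2, "type": "CN_CHAR", "position": 1},
    {"token": "土豆", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 2},
]
synonyms = {"土豆": {"马铃薯"}, "马铃薯": {"土豆"}}

expanded = expand_synonyms(tokens, synonyms)
for t in expanded:
    print(t["position"], t["token"], t["type"])
# 0 我 CN_CHAR
# 1 爱 CN_CHAR
# 2 土豆 CN_WORD
# 2 马铃薯 SYNONYM
```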

Third-party component