When searching, the search text may contain spelling mistakes.
doc1: hello world
doc2: hello java
Search: hallo world
fuzzy search -> automatically corrects misspelled search text, then tries to match the corrected text against the data in the index
POST /my_index/_bulk
{ "index": { "_id": 1 }}
{ "title": "Surprise me!"}
{ "index": { "_id": 2 }}
{ "title": "That was surprising."}
{ "index": { "_id": 3 }}
{ "title": "I wasn't surprised."}
GET /my_index/_search
{
"query": {
"fuzzy": {
"title": {
"value": "surprize",
"fuzziness": 2
}
}
}
}
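surprize -> a misspelling of surprise (z typed in place of s)
surprize -> surprise: change z to s. 1 edit, within the specified fuzziness of 2, so it matches.
surprize -> surprised: change z to s, then append d. 2 edits, within the specified fuzziness of 2, so it also matches.
surprize -> surprising: change z to s, change e to i, then append n and g. 4 edits, beyond the fuzziness of 2, so it can never be matched.
fuzzy search: automatically tries to correct the search text, then matches it against the indexed terms.
fuzziness: the maximum number of single-character edits allowed when matching the search text against your data. If it is not set, the fuzzy query falls back to AUTO, which for a term longer than 5 characters such as surprize also means at most 2 edits.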
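The edit counts above can be checked by hand with a small script. This is only a sketch: it uses plain Levenshtein distance (insert, delete, substitute), whereas Elasticsearch by default also counts swapping two adjacent characters as a single edit. No swaps are involved in these three words, so the numbers should agree. The function name edit_distance is made up for this note.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute, or keep if equal
        prev = curr
    return prev[-1]


query = "surprize"
indexed_terms = ["surprise", "surprised", "surprising"]

# Distance from the misspelled query term to each indexed term.
distances = {term: edit_distance(query, term) for term in indexed_terms}
# Terms that a fuzziness of 2 would accept.
matches = [term for term, d in distances.items() if d <= 2]

for term, d in distances.items():
    print(term, d, "match" if d <= 2 else "no match")
```

surprise and surprised fall within 2 edits, surprising needs 4.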
GET /my_index/_search
{
"query": {
"match": {
"title": {
"query": "SURPIZE ME",
"fuzziness": "AUTO",
"operator": "and"
}
}
}
}
Result:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.8729758,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.8729758,
"_source" : {
"title" : "Surprise me!"
}
}
]
}
}
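Why only doc 1 comes back can be traced by hand. With "fuzziness": "AUTO" the allowed edit distance depends on the term length: terms of 1 to 2 characters must match exactly, 3 to 5 characters allow 1 edit, longer terms allow 2. The sketch below applies that rule to each query term and requires every term to match, as "operator": "and" does. The tokens function is a rough stand-in for the standard analyzer, and all function names are made up for this note.

```python
import re


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (no transpositions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def auto_fuzziness(term: str) -> int:
    """Allowed edits under AUTO, based on term length."""
    if len(term) <= 2:
        return 0
    if len(term) <= 5:
        return 1
    return 2


def tokens(text: str) -> list:
    """Rough stand-in for the standard analyzer: lowercase, split on punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def matches_all(query: str, title: str) -> bool:
    """operator=and: every query term needs a close enough term in the title."""
    doc_terms = tokens(title)
    return all(
        any(edit_distance(q, t) <= auto_fuzziness(q) for t in doc_terms)
        for q in tokens(query)
    )


docs = {1: "Surprise me!", 2: "That was surprising.", 3: "I wasn't surprised."}
hits = [doc_id for doc_id, title in docs.items() if matches_all("SURPIZE ME", title)]
print(hits)  # prints [1]
```

surpize is 2 edits away from surprise (insert r, change z to s), and me has to match exactly because it is only 2 characters long, so docs 2 and 3 drop out.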
Note: the surprising case in the example above (not matched by the fuzzy query) only holds when my_index has the following mapping:
{
"my_index" : {
"mappings" : {
"properties" : {
"title" : {
"type" : "text",
"fields" : {
"keyword" : {
"type" : "keyword",
"ignore_above" : 256
}
}
}
}
}
}
}
If the mapping of my_index is instead as follows:
{
"my_index" : {
"mappings" : {
"properties" : {
"title" : {
"type" : "text",
"analyzer" : "autocomplete",
"search_analyzer" : "standard"
}
}
}
}
}
then the following search with fuzziness = 2 returns all three docs:
GET /my_index/_search
{
"query": {
"fuzzy": {
"title": {
"value": "surprize",
"fuzziness": 2
}
}
}
}
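A likely explanation, assuming autocomplete is an edge n-gram analyzer (its definition is not shown in this note, so min_gram 1 and max_gram 20 below are guesses): every prefix of each word is indexed as its own term, and the fuzzy query only needs one of those prefixes to be within 2 edits of surprize. The sketch below lists the prefixes of surprising that qualify. Function names are made up for this note.

```python
def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (no transpositions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]


def edge_ngrams(term: str, min_gram: int = 1, max_gram: int = 20) -> list:
    """Prefixes of term, as an edge n-gram filter would emit them (assumed settings)."""
    return [term[:n] for n in range(min_gram, min(len(term), max_gram) + 1)]


query = "surprize"
# Indexed prefixes of "surprising" that a fuzziness of 2 would accept.
close = [g for g in edge_ngrams("surprising") if edit_distance(query, g) <= 2]
print(close)
```

The full word surprising is still 4 edits away, but shorter prefixes such as surprisi are within 2 edits, which would be enough for doc 2 to match.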