1. keyword
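A field type for indexing structured content such as email addresses, hostnames, status codes, postal codes, or tags.
It is typically used for filtering (for example, finding all blog posts whose status is published), sorting, and aggregations. A keyword field can only be searched by its exact value, because the entire field value is indexed as a single term and is never analyzed (tokenized).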
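As a rough illustration of why exact values suit filtering and aggregation, the following minimal sketch models a keyword field as a dictionary from whole values to document ids. The `posts` data and the helper `filter_by_status` are hypothetical and are not part of Elasticsearch; real keyword fields rely on Lucene data structures rather than a Python dict.

```python
from collections import Counter, defaultdict

# Hypothetical blog posts; "status" plays the role of a keyword field.
posts = [
    {"id": 1, "status": "published"},
    {"id": 2, "status": "draft"},
    {"id": 3, "status": "published"},
    {"id": 4, "status": "Published"},  # different case, so a different term
]

# A keyword field contributes one term per value: the whole string, unchanged.
keyword_index = defaultdict(set)
for post in posts:
    keyword_index[post["status"]].add(post["id"])


def filter_by_status(value):
    """Exact-match lookup, loosely analogous to a term filter on a keyword field."""
    return sorted(keyword_index.get(value, set()))


published_ids = filter_by_status("published")
partial_ids = filter_by_status("publish")  # a prefix is not an exact value
status_counts = Counter(post["status"] for post in posts)  # a simple "aggregation"

print(published_ids)               # [1, 3]
print(partial_ids)                 # []
print(status_counts["published"])  # 2
```

Because each value is kept intact, grouping and counting by value stays well defined, which is the property that makes the type a good fit for aggregations.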
2. text
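When a field is defined with the text type, ES analyzes the field value at index time, splitting it into terms with either the default analyzer or an analyzer specified by the user.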
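The sketch below gives a feel for what analysis does to a text value. The function `toy_analyzer` is a hypothetical stand-in that only lowercases and splits on non-alphanumeric characters; it is not the real default analyzer, which follows Unicode text segmentation rules, and the sample sentences are made up for illustration.

```python
import re
from collections import defaultdict


def toy_analyzer(text):
    """Lowercase the text and split it on runs of non-alphanumeric characters."""
    return [token for token in re.split(r"[^0-9a-z]+", text.lower()) if token]


# Hypothetical documents with a single text field.
docs = {
    1: "Quick brown fox",
    2: "The quick red fox jumps",
}

# Every token produced by the analyzer becomes a term in the inverted index.
inverted_index = defaultdict(set)
for doc_id, body in docs.items():
    for token in toy_analyzer(body):
        inverted_index[token].add(doc_id)

tokens = toy_analyzer("Quick brown fox")
quick_docs = sorted(inverted_index.get("quick", set()))
brown_docs = sorted(inverted_index.get("brown", set()))
# The original, unanalyzed value is not itself a term in the index.
whole_value_docs = sorted(inverted_index.get("Quick brown fox", set()))

print(tokens)            # ['quick', 'brown', 'fox']
print(quick_docs)        # [1, 2]
print(brown_docs)        # [1]
print(whole_value_docs)  # []
```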
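3. Understanding the difference between keyword and text by example
3.1 Create an index and build the test data
The index is named my-test and its mapping type is shops. Mapping types were deprecated in Elasticsearch 7.0 and removed in 8.0, so the requests in this section target older versions.
The shop_name field is of type keyword (not analyzed), while business_desc is of type text and is analyzed with the ik_max_word analyzer from the IK analysis plugin.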
PUT /my-test
{
  "mappings": {
    "shops": {
      "properties": {
        "shop_id": { "type": "long" },
        "shop_name": { "type": "keyword" },
        "business_desc": { "type": "text", "analyzer": "ik_max_word" },
        "members": {
          "type": "nested",
          "dynamic": false,
          "properties": {
            "member_id": { "type": "long" },
            "member_name": { "type": "keyword" }
          }
        }
      }
    }
  }
}
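Index two test documents. Note that both documents use the key ship_id rather than the mapped shop_id, so the value ends up in a dynamically added ship_id field, which is also what the responses return: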
PUT /my-test/shops/1
{
  "ship_id": 10000001,
  "shop_name": "王尼玛的电器小店",
  "business_desc": "主营:家用电器、电脑设备、办公设备、安防设备、手机、耳机等",
  "members": [
    { "member_id": 1, "member_name": "王大大" },
    { "member_id": 2, "member_name": "王二二" }
  ]
}

PUT /my-test/shops/2
{
  "ship_id": 10000002,
  "shop_name": "王大锤的创意小店",
  "business_desc": "主营:创意品,饰品,房间挂件,汽车挂件",
  "members": [
    { "member_id": 1, "member_name": "张三" },
    { "member_id": 2, "member_name": "李四" }
  ]
}
3.2 Querying the keyword field (not analyzed)
3.2.1 Test case 1
Search for documents whose shop_name is “王尼玛”:
GET /my-test/shops/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "shop_name": "王尼玛" }
      }
    }
  }
}
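Result: 0 documents match. The inverted index for shop_name holds only complete values such as “王尼玛的电器小店” as single terms, so there is no term equal to “王尼玛”.
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": null, "hits": [] }
}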
3.2.2 Test case 2
Search for documents whose shop_name is “王尼玛的电器小店”:
GET /my-test/shops/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "shop_name": "王尼玛的电器小店" }
      }
    }
  }
}
Result: 1 document matches, because the inverted index for shop_name contains the term “王尼玛的电器小店”, which equals the search value exactly.
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 1,
    "max_score": 0.0,
    "hits": [
      {
        "_index": "my-test",
        "_type": "shops",
        "_id": "1",
        "_score": 0.0,
        "_source": {
          "ship_id": 10000001,
          "shop_name": "王尼玛的电器小店",
          "business_desc": "主营:家用电器、电脑设备、办公设备、安防设备、手机、耳机等",
          "members": [
            { "member_id": 1, "member_name": "王大大" },
            { "member_id": 2, "member_name": "王二二" }
          ]
        }
      }
    ]
  }
}
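The two test cases above can be mimicked with a small sketch in which each shop_name is stored as one unbroken term. The dictionary `shop_name_index` and the helper `term_query` are hypothetical names used only for illustration; they model the behavior described here and do not call Elasticsearch.

```python
# The two shop names from the test documents, keyed by document id.
shops = {
    1: "王尼玛的电器小店",
    2: "王大锤的创意小店",
}

# keyword: each document contributes exactly one term, the entire shop_name.
shop_name_index = {}
for doc_id, name in shops.items():
    shop_name_index.setdefault(name, []).append(doc_id)


def term_query(index, value):
    """Return the ids stored under a term that equals the value exactly."""
    return index.get(value, [])


partial_hits = term_query(shop_name_index, "王尼玛")
exact_hits = term_query(shop_name_index, "王尼玛的电器小店")

print(partial_hits)  # []
print(exact_hits)    # [1]
```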
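3.3 Querying the text field (analyzed with ik_max_word)
3.3.1 Test case 1
Search for documents whose business_desc (the description of the shop's main business) matches “手机和汽车挂件” (mobile phones and car pendants), using match, which analyzes the search string:
GET /my-test/shops/_search
{
  "query": {
    "bool": {
      "filter": {
        "match": { "business_desc": "手机和汽车挂件" }
      }
    }
  }
}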
Result: 2 documents match. business_desc is a text field analyzed with ik_max_word, so when ES indexes a document it splits the description into terms such as 家用, 电器, 电脑, 设备, 办公, 安防, 手机, and 耳机, and builds the inverted index from them. The match query analyzes the search string in the same way and, by default, combines the resulting terms with OR, so document 1 matches through a term such as 手机 and document 2 through terms such as 汽车 and 挂件.
{
  "took": 4,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": {
    "total": 2,
    "max_score": 0.0,
    "hits": [
      {
        "_index": "my-test",
        "_type": "shops",
        "_id": "2",
        "_score": 0.0,
        "_source": {
          "ship_id": 10000002,
          "shop_name": "王大锤的创意小店",
          "business_desc": "主营:创意品,饰品,房间挂件,汽车挂件",
          "members": [
            { "member_id": 1, "member_name": "张三" },
            { "member_id": 2, "member_name": "李四" }
          ]
        }
      },
      {
        "_index": "my-test",
        "_type": "shops",
        "_id": "1",
        "_score": 0.0,
        "_source": {
          "ship_id": 10000001,
          "shop_name": "王尼玛的电器小店",
          "business_desc": "主营:家用电器、电脑设备、办公设备、安防设备、手机、耳机等",
          "members": [
            { "member_id": 1, "member_name": "王大大" },
            { "member_id": 2, "member_name": "王二二" }
          ]
        }
      }
    ]
  }
}
3.3.2 Test case 2
Run the same search against business_desc for “手机和汽车挂件”, this time using term, which does not analyze the search string:
GET /my-test/shops/_search
{
  "query": {
    "bool": {
      "filter": {
        "term": { "business_desc": "手机和汽车挂件" }
      }
    }
  }
}
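Result: 0 documents match. A term query does not analyze the search string, and the inverted index for business_desc contains no single term equal to “手机和汽车挂件”.
{
  "took": 0,
  "timed_out": false,
  "_shards": { "total": 5, "successful": 5, "skipped": 0, "failed": 0 },
  "hits": { "total": 0, "max_score": null, "hits": [] }
}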
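The contrast between the two test cases in this section can be reproduced with a toy model. Everything here is an assumption made for illustration: `DICTIONARY` is a tiny hand-picked word list, and `toy_analyzer` simply emits every dictionary word found in the text, which only loosely resembles what ik_max_word does with its much larger dictionary. The helpers `match_query` and `term_query` are hypothetical and do not call Elasticsearch.

```python
from collections import defaultdict

# Hypothetical mini dictionary; the real ik_max_word dictionary is far larger.
DICTIONARY = ["家用", "电器", "电脑", "设备", "办公", "安防", "手机", "耳机",
              "创意品", "饰品", "房间", "挂件", "汽车"]


def toy_analyzer(text):
    """Return every dictionary word that occurs in the text, ordered by position."""
    tokens = []
    for start in range(len(text)):
        for word in DICTIONARY:
            if text.startswith(word, start):
                tokens.append(word)
    return tokens


# business_desc values of the two test documents.
docs = {
    1: "主营:家用电器、电脑设备、办公设备、安防设备、手机、耳机等",
    2: "主营:创意品,饰品,房间挂件,汽车挂件",
}

# text: every token produced by the analyzer becomes a term.
inverted_index = defaultdict(set)
for doc_id, desc in docs.items():
    for token in toy_analyzer(desc):
        inverted_index[token].add(doc_id)


def match_query(text):
    """Analyze the search string, then OR together the hits of each token."""
    hits = set()
    for token in toy_analyzer(text):
        hits |= inverted_index.get(token, set())
    return sorted(hits)


def term_query(value):
    """Look the search string up as one term, without analyzing it."""
    return sorted(inverted_index.get(value, set()))


query_tokens = toy_analyzer("手机和汽车挂件")
match_hits = match_query("手机和汽车挂件")
term_hits = term_query("手机和汽车挂件")
single_term_hits = term_query("手机")

print(query_tokens)      # ['手机', '汽车', '挂件']
print(match_hits)        # [1, 2]
print(term_hits)         # []
print(single_term_hits)  # [1]
```

In this toy model a term lookup still succeeds when the value happens to equal one indexed token, as the last line shows; it fails for the full phrase only because no such token exists.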
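3.4 Summary
shop_name is a keyword field, so ES indexes the entire field value as a single term without analyzing it. For example, for a shop named “王大锤的创意小店”, the inverted index contains the one term “王大锤的创意小店”.
business_desc is a text field, so ES first analyzes the field value (with the default analyzer or a user-specified one). For example, for the description “创意品,饰品,房间挂件,汽车挂件”, ES adds terms such as 创意品, 饰品, 房间, 挂件, and 汽车 to the inverted index.
A term query looks up the search string in the specified field as a single term, without analyzing it, whereas a match query analyzes the search string first and then searches the field for the resulting terms.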
