Elasticsearch

Elasticsearch is an open-source, high-performance distributed search engine that helps us quickly find information in massive amounts of data.

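A powerful open-source distributed search engine:
- Supports search, log statistics, analysis, system monitoring, and more
- Helps us quickly find what we need in massive amounts of data
Built on top of Lucene
Document-oriented storage
Data is serialized as JSON
Index: a collection of documents of the same type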

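What is Elasticsearch?

An open-source distributed search engine that can be used for search, log statistics, analysis, system monitoring, and more.

What is the Elastic Stack (ELK)?

A technology stack built around Elasticsearch, comprising Beats, Logstash, Kibana, and Elasticsearch.

What is Lucene?

An open-source search engine library from Apache that provides the core search engine APIs.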

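Inverted index
Forward index: look up each document by id, then scan the document's content for the term.
Inverted index: look up the term first to get the ids of the documents that contain it, then fetch those documents by id.
Document: each record is a document, comparable to one row in a database table.
Term: the searchable text of a document is split by a tokenization algorithm, and each meaningful word that results is a term.
Example: 我是中国人 ("I am Chinese") can be split into the terms 我, 是, 中国人, 中国, and 国人.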
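The two-step lookup (term to ids, then ids to documents) can be sketched in plain Java. This is only an illustration of the idea, not how Lucene stores its index: the class name `InvertedIndexSketch`, the sample documents, and the whitespace tokenizer are all assumptions made for the example (a real analyzer such as IK would produce the terms).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class InvertedIndexSketch {
    // term -> ids of the documents that contain the term
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // id -> original document text
    private final Map<Integer, String> documents = new HashMap<>();

    /** Stores the document and records each of its terms in the postings map. */
    void add(int id, String text) {
        documents.put(id, text);
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(id);
        }
    }

    /** Simplified tokenizer (assumption): lower-case, then split on whitespace. */
    static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String t : text.toLowerCase().split("\\s+")) {
            if (!t.isEmpty()) {
                terms.add(t);
            }
        }
        return terms;
    }

    /** Step 1: term -> ids. Step 2: ids -> documents. */
    List<String> search(String term) {
        Set<Integer> ids = postings.getOrDefault(
                term.toLowerCase(), Collections.<Integer>emptySet());
        List<String> hits = new ArrayList<>();
        for (int id : ids) {
            hits.add(documents.get(id));
        }
        return hits;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "Xiaomi phone");
        index.add(2, "Huawei phone");
        index.add(3, "Xiaomi band");
        System.out.println(index.search("phone"));
    }
}
```

A forward index would instead have to scan every entry of `documents` and test whether its text contains the term, which is what makes it slow on large data sets.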

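Index:
An index is similar to a database table, and its mapping is similar to the table's schema.
To store data in ES, the index and its mapping (the "database" and "table") must be created first.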

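Mapping properties
A mapping constrains the documents in an index. Common mapping properties include:
type: the field's data type. Common simple types are:
- String: text (tokenized text), keyword (exact value, e.g. brand, country, IP address)
- Numeric: long, integer, short, byte, double, float
- Boolean: boolean
- Date: date
- Object: object
index: whether to index the field; defaults to true
analyzer: which analyzer (tokenizer) to use
properties: the sub-fields of this field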

Basic operations:

Create an index: PUT /{index}

PUT /haha
{
"mappings": {
  "properties": {
    "info":{
      "type": "text",
      "analyzer": "ik_smart"
    },
    "email":{
      "type": "keyword",
      "index": false
    },
    "name":{
      "properties": {
        "firstName": {
          "type": "keyword"
        }
      }
    },
    // ... omitted
  }
}
}

Get an index: GET /{index}
Delete an index: DELETE /{index}
Add a field: PUT /{index}/_mapping
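The add-field request takes a body containing only the new property under `properties`. A minimal sketch of building that body as a string follows; the helper name `addFieldBody` and the `age` field are hypothetical, and the helper assumes the field name and type need no JSON escaping.

```java
class AddFieldBodySketch {
    /**
     * Builds the body for PUT /{index}/_mapping that adds one new field.
     * Assumption: field and type contain no characters that need JSON escaping.
     */
    static String addFieldBody(String field, String type) {
        return "{\"properties\":{\"" + field + "\":{\"type\":\"" + type + "\"}}}";
    }

    public static void main(String[] args) {
        // Hypothetical example: add an integer field "age" to the "haha" index.
        System.out.println("PUT /haha/_mapping");
        System.out.println(addFieldBody("age", "integer"));
    }
}
```

Note that this endpoint is for adding fields; the type of a field that already exists in the mapping cannot be changed this way.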

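Document operations:
Create a document: POST /{index}/_doc/{id} { JSON document }
Get a document: GET /{index}/_doc/{id}
- Batch query: GET /hotel/_search
Delete a document: DELETE /{index}/_doc/{id}
Update a document:
- Full update: PUT /{index}/_doc/{id} { JSON document }
- Partial update: POST /{index}/_update/{id} { "doc": { fields } }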
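To make the request shapes above concrete, here is a rough sketch that issues them with the JDK's `HttpURLConnection` instead of a dedicated client. The class name `DocRestSketch`, the base URL `http://localhost:9200`, the absence of authentication, and the sample document are all assumptions for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class DocRestSketch {
    // Assumption: a local ES node without authentication.
    static final String BASE = "http://localhost:9200";

    /** Path used by create, get, full update, and delete. */
    static String docPath(String index, String id) {
        return "/" + index + "/_doc/" + id;
    }

    /** Path used by a partial update. */
    static String updatePath(String index, String id) {
        return "/" + index + "/_update/" + id;
    }

    /** Wraps the changed fields in the "doc" envelope a partial update expects. */
    static String partialUpdateBody(String fieldsJson) {
        return "{\"doc\":" + fieldsJson + "}";
    }

    /** Sends one request (body may be null) and returns the HTTP status code. */
    static int send(String method, String path, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(BASE + path).openConnection();
        conn.setRequestMethod(method);
        if (body != null) {
            conn.setDoOutput(true);
            conn.setRequestProperty("Content-Type", "application/json");
            try (OutputStream out = conn.getOutputStream()) {
                out.write(body.getBytes(StandardCharsets.UTF_8));
            }
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    /** Walks through create, get, partial update, delete (needs a running node). */
    static void demo() throws IOException {
        send("POST", docPath("hotel", "1"), "{\"price\":900}");
        send("GET", docPath("hotel", "1"), null);
        send("POST", updatePath("hotel", "1"), partialUpdateBody("{\"price\":952}"));
        send("DELETE", docPath("hotel", "1"), null);
    }
}
```

The difference between the two update styles shows up in the path and body: a full update replaces the whole document at `/_doc/{id}`, while a partial update posts only the changed fields, wrapped in `doc`, to `/_update/{id}`.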
RestClient
Initializing the RestClient
Initialize the RestClient to establish a connection to Elasticsearch.
This takes three steps:

Add the ES RestHighLevelClient dependency:

<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
</dependency>

The ES version managed by Spring Boot in this project defaults to 7.6.2, so we override it to match the version in use (7.12.1 here):

<properties>
<java.version>1.8</java.version>
<elasticsearch.version>7.12.1</elasticsearch.version>
</properties>

Initialize the RestHighLevelClient:

RestHighLevelClient client = new RestHighLevelClient(RestClient.builder(
 HttpHost.create("http://<vm-ip>:9200")
));

Example code

@Test
void createHotelIndex() throws IOException {
 // 1. Create the Request object
 CreateIndexRequest request = new CreateIndexRequest("hotel");
 // 2. Prepare the request body: the DSL statement
 request.source(MAPPING_TEMPLATE, XContentType.JSON);
 // 3. Send the request
 client.indices().create(request, RequestOptions.DEFAULT);
}
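`MAPPING_TEMPLATE` is not defined in these notes. One plausible shape, assuming it is a `String` constant holding the same JSON body that would follow `PUT /hotel`, is sketched below. The class name `HotelConstantsSketch` and the `name` field are hypothetical; `price` and `starName` are taken from the update example later in the notes, and their types here are guesses.

```java
class HotelConstantsSketch {
    // Hypothetical mapping for the "hotel" index; adjust fields and types to the real data.
    static final String MAPPING_TEMPLATE =
            "{\n" +
            "  \"mappings\": {\n" +
            "    \"properties\": {\n" +
            "      \"name\":     { \"type\": \"text\", \"analyzer\": \"ik_smart\" },\n" +
            "      \"price\":    { \"type\": \"integer\" },\n" +
            "      \"starName\": { \"type\": \"keyword\" }\n" +
            "    }\n" +
            "  }\n" +
            "}";

    public static void main(String[] args) {
        // Print the body to check it by eye before sending it.
        System.out.println(MAPPING_TEMPLATE);
    }
}
```

Keeping the DSL in a constant lets the same text be pasted into a console for testing and then reused unchanged in the Java test. The `ik_smart` analyzer only works if the IK plugin is installed on the node.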

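Basic steps for index operations:

Initialize the RestHighLevelClient
Create an XxxIndexRequest, where Xxx is Create, Get, or Delete
Prepare the DSL (needed only for Create; the others take no body)
Send the request by calling RestHighLevelClient#indices().xxx(), where xxx is create, exists, or delete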

Basic operations:
Create an index: PUT /{index}
Get an index: GET /{index}
Delete an index: DELETE /{index}
Add a field: PUT /{index}/_mapping

```java
/**
 * Creates the index.
 *
 * @throws IOException
 */
@Test
void createHotelIndex() throws IOException {
    // 1. Create the request object
    CreateIndexRequest request = new CreateIndexRequest("hotel");
    // 2. Prepare the request body: the DSL statement
    request.source(MAPPING_TEMPLATE, XContentType.JSON);
    // 3. Send the request
    client.indices().create(request, RequestOptions.DEFAULT);
}
```

```java
@Test
void deleteHotelIndex() throws IOException {
    // 1. Create the request object
    DeleteIndexRequest request = new DeleteIndexRequest("hotel");
    // 2. Send the request
    client.indices().delete(request, RequestOptions.DEFAULT);
}

@Test
void existsHotelIndex() throws IOException {
    // 1. Create the request object
    GetIndexRequest request = new GetIndexRequest("hotel");
    // 2. Send the request
    boolean flag = client.indices().exists(request, RequestOptions.DEFAULT);
    // 3. Print whether the index exists
    System.out.println(flag);
}
```

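Basic steps for document operations:

Initialize the RestHighLevelClient
Create an XxxRequest, where Xxx is Index, Get, Update, Delete, or Bulk
Prepare the parameters (needed for Index, Update, and Bulk)
Send the request by calling RestHighLevelClient#xxx(), where xxx is index, get, update, delete, or bulk
Parse the result (needed for Get)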

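Example code
Write the call that sends the request first, then fill in the arguments it asks for.

Send the add (index) request -> it needs a request argument -> prepare the data -> look up the document type -> prepare the object -> send the request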

@Test
void testAddDocument() throws IOException {
  // 1. Look up the hotel record by id
  Hotel hotel = hotelService.getById(61083L);
  // 2. Convert it to the document type
  HotelDoc hotelDoc = new HotelDoc(hotel);
  // 3. Serialize the HotelDoc to JSON
  String json = JSON.toJSONString(hotelDoc);

  // 1. Prepare the Request object
  IndexRequest request = new IndexRequest("hotel").id(hotelDoc.getId().toString());
  // 2. Set the JSON document
  request.source(json, XContentType.JSON);
  // 3. Send the request
  client.index(request, RequestOptions.DEFAULT);
}

@Test
void testGetDocumentById() throws IOException {
  // 1. Prepare the Request
  GetRequest request = new GetRequest("hotel", "61082");
  // 2. Send the request and get the response
  GetResponse response = client.get(request, RequestOptions.DEFAULT);
  // 3. Parse the response
  String json = response.getSourceAsString();

  HotelDoc hotelDoc = JSON.parseObject(json, HotelDoc.class);
  System.out.println(hotelDoc);
}

@Test
void testUpdateDocument() throws IOException {
  // 1. Prepare the Request
  UpdateRequest request = new UpdateRequest("hotel", "61083");
  // 2. Prepare the parameters: alternating field names and values
  request.doc(
      "price", "952",
      "starName", "四钻"
  );
  // 3. Send the request
  client.update(request, RequestOptions.DEFAULT);
}

@Test
void testDeleteDocument() throws IOException {
  // 1. Prepare the Request
  DeleteRequest request = new DeleteRequest("hotel", "61083");
  // 2. Send the request
  client.delete(request, RequestOptions.DEFAULT);
}

@Test
void testBulkRequest() throws IOException {
  // Load all hotel records
  List<Hotel> hotels = hotelService.list();

  // 1. Create the Request
  BulkRequest request = new BulkRequest();
  // 2. Prepare the parameters: add one IndexRequest per hotel
  for (Hotel hotel : hotels) {
      // 2.1. Convert to the document type HotelDoc
      HotelDoc hotelDoc = new HotelDoc(hotel);
      // 2.2. Create the Request object that adds the document
      request.add(new IndexRequest("hotel")
                  .id(hotelDoc.getId().toString())
                  .source(JSON.toJSONString(hotelDoc), XContentType.JSON));
  }
  // 3. Send the request
  client.bulk(request, RequestOptions.DEFAULT);
}
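The test above puts every hotel into a single BulkRequest. If the table were much larger, one option (not something these notes do) would be to split the list into fixed-size batches and send one bulk request per batch. The helper below is a hypothetical, JDK-only sketch of that split; the name `partition` and any particular batch size are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class BulkBatchSketch {
    /** Splits items into consecutive batches of at most batchSize elements. */
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int from = 0; from < items.size(); from += batchSize) {
            int to = Math.min(from + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(from, to)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Five items in batches of two: the last batch holds the remainder.
        System.out.println(partition(Arrays.asList(1, 2, 3, 4, 5), 2));
    }
}
```

Each inner list would then be turned into its own BulkRequest in the same way the loop above builds one.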

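Use cases for Redis, MySQL, and ES

MySQL: transactional operations, where data safety and consistency must be guaranteed.
Redis: hot data and high-concurrency scenarios.
ES: search, analysis, and statistics over massive amounts of data.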

Optimizing index creation:

An improvement of roughly 20-30%

Send the add (index) request
