Official notes:
JD search case study
1) Set up the environment
Reply "ElasticSearch" to the WeChat official account 狂神说 to get the web page templates.

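Configure IndexController, start the application, and check the result in the browser.
2) Crawl the data
Use the jsoup dependency to fetch and parse the web page.
Additional note:
jsoup only handles HTML pages. For other kinds of content, such as music or video, use the tika dependency.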
<!-- jsoup: HTML parser -->
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.14.3</version>
</dependency>
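General-purpose method for crawling the data:
import java.net.URL;
import java.net.URLEncoder;
import java.util.LinkedList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public static List<Goods> parseJDGoods(String keyword) {
    String url = "https://search.jd.com/Search?keyword=";
    List<Goods> goodsList = new LinkedList<>();
    try {
        // URL-encode the keyword so that Chinese search terms work
        url += URLEncoder.encode(keyword, "utf-8");
        // The parsed page, similar to `document` in JavaScript; 10 s timeout
        Document document = Jsoup.parse(new URL(url), 1000 * 10);
        Element element = document.getElementById("J_goodsList");
        // `assert` is disabled by default at runtime, so check explicitly
        if (element == null) {
            throw new IllegalStateException("element J_goodsList not found");
        }
        Elements liSet = element.getElementsByTag("li");
        for (Element ele : liSet) {
            // Images are lazy-loaded, so the URL is in data-lazy-img rather than src
            String img = ele.getElementsByTag("img").eq(0).attr("data-lazy-img");
            String price = ele.getElementsByClass("p-price").eq(0).text();
            String title = ele.getElementsByClass("p-name").eq(0).text();
            // A CSS selector matches elements that carry both class names
            String shop = ele.select(".curr-shop.hd-shopname").eq(0).text();
            goodsList.add(new Goods(title, price, img, shop));
        }
    } catch (Exception ex) {
        // Keep the original exception as the cause
        throw new RuntimeException("Failed to parse the JD search page: " + ex.getMessage(), ex);
    }
    return goodsList;
}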
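The keyword-encoding step in the crawler can be tried on its own. The following is a minimal sketch that uses only the JDK; the class name `SearchUrlDemo` and the helper `buildSearchUrl` are hypothetical names introduced for illustration and are not part of the original project.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class SearchUrlDemo {
    // Base URL copied from the crawler method in these notes.
    static final String BASE = "https://search.jd.com/Search?keyword=";

    // Hypothetical helper: encodes the keyword as UTF-8 and appends it to the base URL.
    static String buildSearchUrl(String keyword) {
        try {
            return BASE + URLEncoder.encode(keyword, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // Every JVM is required to support UTF-8, so this should not happen.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // An ASCII keyword passes through unchanged.
        System.out.println(buildSearchUrl("java"));
        // \u624b\u673a is the Chinese word for "mobile phone"; each character
        // becomes three percent-encoded UTF-8 bytes.
        System.out.println(buildSearchUrl("\u624b\u673a"));
    }
}
```

Without the encoding step, non-ASCII characters would be placed in the URL as-is, which is why the crawler encodes the keyword before appending it.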
3) Search API
Two interfaces:
- Parse the JD web page and collect the data
- Provide an interface that lets users search for products


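import java.util.LinkedList;
import java.util.List;
import java.util.Map;

import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RequestOptions;
// In newer 7.x clients TimeValue is in org.elasticsearch.core instead
import org.elasticsearch.common.unit.TimeValue;
import org.elasticsearch.index.query.MatchQueryBuilder;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.SearchHit;
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.springframework.util.StringUtils;

@Override
public List<Map<String, Object>> searchGoods(String keyword, int pageNo, int pageSize) {
    // Holds the search results
    List<Map<String, Object>> goodsList = new LinkedList<>();
    if (!StringUtils.hasLength(keyword)) return goodsList;
    if (pageNo <= 0) pageNo = 1;
    if (pageSize < 0) pageSize = 0;
    try {
        SearchRequest searchRequest = new SearchRequest(ESConstant.JD_ES_GOODS_LIST);
        // Build the search conditions
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Search timeout
        searchSourceBuilder.timeout(TimeValue.timeValueSeconds(10));
        // Exact match (term query): hits come and go, because the title field is split into tokens
        // TermQueryBuilder termQueryBuilder = QueryBuilders.termQuery("title", keyword);
        // searchSourceBuilder.query(termQueryBuilder);
        // Match query
        MatchQueryBuilder matchQueryBuilder = QueryBuilders.matchQuery("title", keyword);
        searchSourceBuilder.query(matchQueryBuilder);
        // Paging: `from` is the zero-based offset of the first hit, not the page number
        searchSourceBuilder.from((pageNo - 1) * pageSize);
        searchSourceBuilder.size(pageSize);
        // Apply the search conditions
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (SearchHit hit : hits) {
            Map<String, Object> map = hit.getSourceAsMap();
            goodsList.add(map);
        }
    } catch (Exception ex) {
        // TODO: handle or log the exception instead of swallowing it
    }
    return goodsList;
}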
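One detail of the paging parameters is easy to get wrong: in Elasticsearch, `from` is the zero-based offset of the first hit to return, not a page number. The sketch below shows one way to map a 1-based `pageNo` to that offset, using the same fallbacks for invalid input as `searchGoods`. The class name `PageOffsetDemo` and the helper `fromOffset` are hypothetical and only serve the illustration.

```java
class PageOffsetDemo {
    // Converts a 1-based page number into the zero-based offset expected by `from`.
    static int fromOffset(int pageNo, int pageSize) {
        int page = pageNo <= 0 ? 1 : pageNo;     // invalid page numbers fall back to page 1
        int size = pageSize < 0 ? 0 : pageSize;  // negative sizes fall back to 0
        return (page - 1) * size;
    }

    public static void main(String[] args) {
        System.out.println(fromOffset(1, 15)); // first page starts at offset 0
        System.out.println(fromOffset(2, 15)); // second page starts at offset 15
    }
}
```

Passing the page number straight to `from` would skip only `pageNo` documents, so page 1 would silently drop the top hit and consecutive pages would overlap almost entirely.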
4) Front-end and back-end separation


The front end uses Vue and Axios to call the back-end API and render the page.



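new Vue({
    el: '#app',
    data: {
        keyword: '',   // search keyword
        goodsList: []  // search results
    },
    methods: {
        searchGoods() {
            console.log(this.keyword)
            // Request the data; encode the keyword so special characters are safe in the path
            axios.get('search/' + encodeURIComponent(this.keyword) + '/1/15').then(response => {
                console.log(response)
                this.goodsList = response.data
            })
        }
    }
})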
5) Highlighting






