The scraped results contain:
- Question
- Answer
- User name
- Time
Sample result (screenshot):
Template:
{"_id":"amazon-com-question","startUrl":["https://www.amazon.com/ask/questions/asin/B075TCKD2M/ref=ask_dp_dpmw_ql_hza?isAnswered=true"],"selectors":[
{"id":"info","type":"SelectorElement","parentSelectors":["_root","panination"],"selector":".a-section > div.a-spacing-base > div > div.a-col-right","multiple":true,"delay":""},
{"id":"question","type":"SelectorText","parentSelectors":["info"],"selector":".a-spacing-small div.a-col-right","multiple":false,"regex":"","delay":0},
{"id":"answer","type":"SelectorText","parentSelectors":["info"],"selector":".a-col-right > span:nth-of-type(1)","multiple":false,"regex":"","delay":0},
{"id":"user-name","type":"SelectorText","parentSelectors":["info"],"selector":"span.a-profile-name","multiple":false,"regex":"","delay":0},
{"id":"time","type":"SelectorText","parentSelectors":["info"],"selector":"span.a-color-tertiary","multiple":false,"regex":"","delay":0},
{"id":"panination","type":"SelectorLink","parentSelectors":["_root","panination"],"selector":".a-last a","multiple":true,"delay":0}
]}
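The template refers to selectors by id inside each parentSelectors list, and the paging selector is spelled "panination" throughout, so one mistyped id would leave a selector without a parent. A minimal Python sketch of a consistency check is shown here; the function name unresolved_parents is hypothetical and is not part of the scraping tool.

```python
import json


def unresolved_parents(sitemap_json):
    """Return parentSelectors entries that do not match any selector id."""
    sitemap = json.loads(sitemap_json)
    # "_root" is the top-level parent used by the template itself.
    known = {"_root"} | {s["id"] for s in sitemap["selectors"]}
    missing = set()
    for selector in sitemap["selectors"]:
        for parent in selector["parentSelectors"]:
            if parent not in known:
                missing.add(parent)
    return sorted(missing)
```

An empty list means every parent reference resolves; any id that is returned should be compared against the selector ids in the template.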
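Steps for applying the template:
(1) Open the Customer Questions & Answers page of the product you want to scrape, for example: https://www.amazon.com/ask/questions/asin/B075TCKD2M/3/ref=ask_ql_psf_ql_hza?isAnswered=true
(2) Import the template.
(3) Replace the Start URL with the link of the page you want to scrape.
(4) Start scraping.
To stop a scrape that is in progress, you can disconnect from the network.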
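Step (3) can also be done on the template text before importing it. The following is a minimal Python sketch, assuming the template is available as a JSON string; the helper name set_start_url is hypothetical, and the template in the example is abbreviated with an empty selectors list.

```python
import json


def set_start_url(sitemap_json, new_url):
    """Return the sitemap JSON with startUrl replaced by new_url."""
    sitemap = json.loads(sitemap_json)
    # The template stores startUrl as a list, so keep that shape.
    sitemap["startUrl"] = [new_url]
    return json.dumps(sitemap, ensure_ascii=False)


if __name__ == "__main__":
    # Abbreviated stand-in; paste the full template text in practice.
    template = '{"_id":"amazon-com-question","startUrl":["https://www.amazon.com/ask/questions/asin/B075TCKD2M/ref=ask_dp_dpmw_ql_hza?isAnswered=true"],"selectors":[]}'
    url = "https://www.amazon.com/ask/questions/asin/B075TCKD2M/3/ref=ask_ql_psf_ql_hza?isAnswered=true"
    print(set_start_url(template, url))
```

The printed JSON can then be pasted into the import dialog in place of the original template.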
Tip:
The panination and panination-href columns in the results exist only to drive paging and can be deleted in Excel.
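If the results are exported as CSV, the paging columns can also be removed with a short script instead of by hand. This is a minimal Python sketch under that assumption; the function name drop_columns is hypothetical, and the column names follow the selector id used in the template.

```python
import csv
import io


def drop_columns(csv_text, unwanted=("panination", "panination-href")):
    """Return csv_text with the unwanted columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    kept = [name for name in (reader.fieldnames or []) if name not in unwanted]
    out = io.StringIO()
    # extrasaction="ignore" skips the dropped columns when writing each row.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

To use it on a file, read the exported file's text, pass it to drop_columns, and write the returned text to a new file.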