26-5 amazon.com Customer Questions & Answers for a product - Figure 1

The scraped results include:

  • Question
  • Answer
  • User name
  • Time

    Example results:

    26-5 amazon.com Customer Questions & Answers for a product - Figure 2

    Template:

    {"_id":"amazon-com-question","startUrl":["https://www.amazon.com/ask/questions/asin/B075TCKD2M/ref=ask_dp_dpmw_ql_hza?isAnswered=true"],"selectors":[
    {"id":"info","type":"SelectorElement","parentSelectors":["_root","panination"],"selector":".a-section > div.a-spacing-base > div > div.a-col-right","multiple":true,"delay":""},
    {"id":"question","type":"SelectorText","parentSelectors":["info"],"selector":".a-spacing-small div.a-col-right","multiple":false,"regex":"","delay":0},
    {"id":"answer","type":"SelectorText","parentSelectors":["info"],"selector":".a-col-right > span:nth-of-type(1)","multiple":false,"regex":"","delay":0},
    {"id":"user-name","type":"SelectorText","parentSelectors":["info"],"selector":"span.a-profile-name","multiple":false,"regex":"","delay":0},
    {"id":"time","type":"SelectorText","parentSelectors":["info"],"selector":"span.a-color-tertiary","multiple":false,"regex":"","delay":0},
    {"id":"panination","type":"SelectorLink","parentSelectors":["_root","panination"],"selector":".a-last a","multiple":true,"delay":0}]}
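The nesting of the selectors is easier to see as a tree. The following is a minimal Python sketch, not part of the original template: it uses a trimmed copy of the template (only `id`, `type` and `parentSelectors`), and the helper names `children_by_parent` and `unknown_parents` are made up for illustration.

```python
import json

# Trimmed copy of the template: only the fields that define the selector tree.
SITEMAP_JSON = """
{"_id": "amazon-com-question",
 "selectors": [
  {"id": "info", "type": "SelectorElement", "parentSelectors": ["_root", "panination"]},
  {"id": "question", "type": "SelectorText", "parentSelectors": ["info"]},
  {"id": "answer", "type": "SelectorText", "parentSelectors": ["info"]},
  {"id": "user-name", "type": "SelectorText", "parentSelectors": ["info"]},
  {"id": "time", "type": "SelectorText", "parentSelectors": ["info"]},
  {"id": "panination", "type": "SelectorLink", "parentSelectors": ["_root", "panination"]}
 ]}
"""


def children_by_parent(sitemap):
    """Map each parent id to the ids of the selectors nested under it."""
    tree = {}
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            tree.setdefault(parent, []).append(sel["id"])
    return tree


def unknown_parents(sitemap):
    """Return parent ids that are neither "_root" nor a defined selector id."""
    known = {"_root"} | {sel["id"] for sel in sitemap["selectors"]}
    used = {p for sel in sitemap["selectors"] for p in sel["parentSelectors"]}
    return sorted(used - known)


sitemap = json.loads(SITEMAP_JSON)
tree = children_by_parent(sitemap)
for parent, children in tree.items():
    print(parent, "->", children)
print("unknown parents:", unknown_parents(sitemap))
```

Because `panination` lists itself as a parent, the link selector is followed again on every page it opens, and `info` is extracted on each of those pages as well as on the start page.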

    Steps for applying the template:

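    (1) Open the Customer Questions & Answers page of the product you want to scrape, for example: https://www.amazon.com/ask/questions/asin/B075TCKD2M/3/ref=ask_ql_psf_ql_hza?isAnswered=true
    (2) Import the template.
    (3) Replace the Start URL with the link of the page you want to scrape.
    (4) Start scraping.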
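If you prefer to change the Start URL before importing, step (3) can also be done on the template text itself. The sketch below is one possible way to do that in Python. It assumes the URL pattern from the template's `startUrl` also works for other products; the ASIN `B000000000` is a placeholder, the function names are made up for illustration, and the selectors are left out of the sample string for brevity.

```python
import json

# Template with the selectors omitted for brevity; paste the full template in practice.
TEMPLATE = (
    '{"_id":"amazon-com-question",'
    '"startUrl":["https://www.amazon.com/ask/questions/asin/B075TCKD2M/'
    'ref=ask_dp_dpmw_ql_hza?isAnswered=true"],'
    '"selectors":[]}'
)


def start_url_for(asin):
    """Build a Q&A URL for an ASIN, following the pattern used in the template (assumed)."""
    return (
        "https://www.amazon.com/ask/questions/asin/"
        f"{asin}/ref=ask_dp_dpmw_ql_hza?isAnswered=true"
    )


def with_start_url(sitemap_json, url):
    """Return the sitemap JSON with startUrl replaced by the single given URL."""
    sitemap = json.loads(sitemap_json)
    sitemap["startUrl"] = [url]
    # Compact separators keep the output in the same one-line style as the template.
    return json.dumps(sitemap, separators=(",", ":"))


new_template = with_start_url(TEMPLATE, start_url_for("B000000000"))
print(new_template)
```

The printed JSON can then be imported in step (2) in place of the original template, so step (3) is already done.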

To stop the scrape, you can disconnect from the network.

Tip:
The panination and panination-href columns in the results exist only for paging and can be deleted in Excel.
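As an alternative to deleting the columns by hand, they can be dropped from a CSV export with a short script. This is a minimal Python sketch under stated assumptions: the results were exported as CSV, the paging columns are named `panination` and `panination-href`, and the sample row is made-up data used only to show the behavior.

```python
import csv
import io

# Column names assumed from the id of the paging selector in the template.
PAGING_COLUMNS = {"panination", "panination-href"}


def drop_paging_columns(src, dst):
    """Copy CSV rows from src to dst, leaving out the paging columns."""
    reader = csv.DictReader(src)
    kept = [name for name in (reader.fieldnames or []) if name not in PAGING_COLUMNS]
    # extrasaction="ignore" makes the writer skip the keys that were dropped.
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Made-up sample export, for illustration only.
sample = io.StringIO(
    "question,answer,user-name,time,panination,panination-href\n"
    "Sample question?,Sample answer.,Sample User,Sample date,Next,https://example.com/page2\n"
)
cleaned = io.StringIO()
drop_paging_columns(sample, cleaned)
print(cleaned.getvalue())
```

For a real export, open the input and output files with `newline=""` and pass the file objects instead of the `io.StringIO` buffers.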