37-2 Sogou search result information - Figure 2

Each scraped result contains the following fields:

  • Title
  • Link
  • Summary
  • Source

    Example of the scraped results:

    37-2 Sogou search result information - Figure 3

    Template (a Web Scraper sitemap in JSON):

    {"_id":"sogou-search","startUrl":["http://www.sogou.com/web?query=%E4%BA%A7%E5%93%81%E7%BB%8F%E7%90%86&cid=&s_from=result_up&sut=5575&sst0=1581582252234&lkt=0%2C0%2C0&sugsuv=00414ADC7C73FD2D5B6AC2BFC7A9F330&sugtime=1581582252234&page=[1-5]&ie=utf8&w=01029901&dr=1"],"selectors":[{"id":"info","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.vrwrap","multiple":true,"delay":0},{"id":"title","type":"SelectorLink","parentSelectors":["info"],"selector":"h3 a","multiple":false,"delay":0},{"id":"intro","type":"SelectorText","parentSelectors":["info"],"selector":"p","multiple":false,"regex":"","delay":0},{"id":"source","type":"SelectorText","parentSelectors":["info"],"selector":"cite","multiple":false,"regex":"","delay":0}]}
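To see how the template maps onto the four result fields, the selector tree can be listed with a few lines of Python. This is only a reading aid, not part of the scraping workflow: the selectors are copied from the template, while the Start URL is shortened for readability (an assumption of this sketch, not a tested URL), and `describe_selectors` is a hypothetical helper name.

```python
import json

# Selectors copied from the template. The Start URL is shortened here for
# readability (assumption of this sketch; the full URL is in the template).
SITEMAP_JSON = """
{"_id":"sogou-search",
 "startUrl":["http://www.sogou.com/web?query=%E4%BA%A7%E5%93%81%E7%BB%8F%E7%90%86&page=[1-5]"],
 "selectors":[
  {"id":"info","type":"SelectorElement","parentSelectors":["_root"],"selector":"div.vrwrap","multiple":true,"delay":0},
  {"id":"title","type":"SelectorLink","parentSelectors":["info"],"selector":"h3 a","multiple":false,"delay":0},
  {"id":"intro","type":"SelectorText","parentSelectors":["info"],"selector":"p","multiple":false,"regex":"","delay":0},
  {"id":"source","type":"SelectorText","parentSelectors":["info"],"selector":"cite","multiple":false,"regex":"","delay":0}
 ]}
"""


def describe_selectors(sitemap):
    """Return (id, type, parent, css_selector) for every selector in the sitemap."""
    return [
        (s["id"], s["type"], s["parentSelectors"][0], s["selector"])
        for s in sitemap["selectors"]
    ]


sitemap = json.loads(SITEMAP_JSON)
for sel_id, sel_type, parent, css in describe_selectors(sitemap):
    print(f"{parent} -> {sel_id} ({sel_type}): {css}")
```

The `info` element selector matches each result block (`div.vrwrap`), and the three child selectors read the title and link (`h3 a`), the summary (`p`), and the source (`cite`) inside that block.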

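    Steps for applying the template:

    (1) Open the search results page you want to scrape, for example: http://www.sogou.com/web?query=%E4%BA%A7%E5%93%81%E7%BB%8F%E7%90%86&cid=&s_from=result_up&sut=5575&sst0=1581582252234&lkt=0%2C0%2C0&sugsuv=00414ADC7C73FD2D5B6AC2BFC7A9F330&sugtime=1581582252234&page=1&ie=utf8&w=01029901&dr=1
    (2) Import the template.
    (3) Replace the Start URL with the link of the page you want to scrape. To scrape multiple pages, change the page range in the Start URL (the [1-5] in page=[1-5]).
    (4) Start scraping.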
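Step (3) can be prepared outside the browser. The sketch below rewrites the template's Start URL for a different keyword and page range, then expands the range to preview which pages would be visited. It assumes that only the `query` and `page` parameters need to change and that the remaining parameters can be left as they are; `retarget` and `expand_range` are hypothetical helper names, and `expand_range` only imitates the `[first-last]` range notation for preview purposes.

```python
import re
from urllib.parse import quote

# Start URL exactly as it appears in the template.
TEMPLATE_URL = (
    "http://www.sogou.com/web?query=%E4%BA%A7%E5%93%81%E7%BB%8F%E7%90%86"
    "&cid=&s_from=result_up&sut=5575&sst0=1581582252234&lkt=0%2C0%2C0"
    "&sugsuv=00414ADC7C73FD2D5B6AC2BFC7A9F330&sugtime=1581582252234"
    "&page=[1-5]&ie=utf8&w=01029901&dr=1"
)


def retarget(url, keyword, first_page, last_page):
    """Swap the search keyword and the page range, keeping other parameters."""
    encoded = quote(keyword)  # percent-encode the keyword as UTF-8
    url = re.sub(r"(?<=[?&]query=)[^&]*", lambda _: encoded, url)
    page_range = f"[{first_page}-{last_page}]"
    return re.sub(r"(?<=[?&]page=)[^&]*", lambda _: page_range, url)


def expand_range(url):
    """List the concrete URLs that a [first-last] range stands for."""
    match = re.search(r"\[(\d+)-(\d+)\]", url)
    if match is None:
        return [url]
    first, last = int(match.group(1)), int(match.group(2))
    return [
        url[:match.start()] + str(n) + url[match.end():]
        for n in range(first, last + 1)
    ]


new_start_url = retarget(TEMPLATE_URL, "数据分析", 1, 3)
print(new_start_url)
for page_url in expand_range(new_start_url):
    print(page_url)
```

The first printed line is the value to paste into the Start URL field; the remaining lines are only a preview of the pages covered by the range.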