Overview
Sample result screenshot:
Template:
{"_id":"dedao-dianzishu1","startUrl":["https://www.igetget.com/list/%E5%BF%83%E7%90%86%E5%AD%A6/YEZtmS0fghz"],"selectors":[
{"id":"name","type":"SelectorLink","parentSelectors":["info"],"selector":"a","multiple":true,"delay":0},
{"id":"author","type":"SelectorText","parentSelectors":["name"],"selector":".head-text h3","multiple":false,"regex":"","delay":0},
{"id":"time","type":"SelectorText","parentSelectors":["name"],"selector":".head-text p.pro-common","multiple":false,"regex":"","delay":0},
{"id":"price","type":"SelectorText","parentSelectors":["name"],"selector":"span.coin-num","multiple":false,"regex":"","delay":0},
{"id":"intro1","type":"SelectorText","parentSelectors":["name"],"selector":"p.intro-content:nth-of-type(1)","multiple":false,"regex":"","delay":0},
{"id":"info","type":"SelectorElementClick","parentSelectors":["_root"],"selector":"li.pro-detail","multiple":true,"delay":"2000","clickElementSelector":"div.page-num-div:nth-of-type(n+2)","clickType":"clickOnce","discardInitialElements":"do-not-discard","clickElementUniquenessType":"uniqueText"},
{"id":"intro2","type":"SelectorText","parentSelectors":["name"],"selector":"p.intro-content:nth-of-type(2)","multiple":false,"regex":"","delay":0}
]}
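The selectors in the template are linked through their "parentSelectors" fields, which is easy to miss in a single-line JSON export. The following is a minimal Python sketch, not part of the original template, that prints this nesting. SITEMAP_JSON is an abbreviated copy of the template that keeps only the "id", "type" and "parentSelectors" fields, and the helper names selector_tree and format_tree are made up for illustration.

```python
import json
from collections import defaultdict

# Abbreviated copy of the template above: only the fields needed to show
# how selectors nest. The CSS selectors and click settings are omitted.
SITEMAP_JSON = """
{"_id": "dedao-dianzishu1",
 "selectors": [
  {"id": "name",   "type": "SelectorLink",         "parentSelectors": ["info"]},
  {"id": "author", "type": "SelectorText",         "parentSelectors": ["name"]},
  {"id": "time",   "type": "SelectorText",         "parentSelectors": ["name"]},
  {"id": "price",  "type": "SelectorText",         "parentSelectors": ["name"]},
  {"id": "intro1", "type": "SelectorText",         "parentSelectors": ["name"]},
  {"id": "info",   "type": "SelectorElementClick", "parentSelectors": ["_root"]},
  {"id": "intro2", "type": "SelectorText",         "parentSelectors": ["name"]}
 ]}
"""


def selector_tree(sitemap):
    """Map each parent selector id to the ids of its child selectors."""
    children = defaultdict(list)
    for sel in sitemap["selectors"]:
        for parent in sel["parentSelectors"]:
            children[parent].append(sel["id"])
    return dict(children)


def format_tree(children, node="_root", depth=0):
    """Return the tree as indented lines, starting from the given node."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        lines.extend(format_tree(children, child, depth + 1))
    return lines


tree = selector_tree(json.loads(SITEMAP_JSON))
print("\n".join(format_tree(tree)))
```

Read this way, the template first collects every list item with "info" (clicking through the pagination), follows the link in each item with "name", and then reads the author, time, price and the two intro paragraphs from the linked detail page.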
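Steps to apply the template:
(1) Open the e-book category page you want to scrape, for example: https://www.igetget.com/list/%E5%BF%83%E7%90%86%E5%AD%A6/YEZtmS0fghz
(2) Import the template
(3) Replace the Start URL with the link of the page you want to scrape
(4) Start scraping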
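Step (3) can also be done before importing, by editing the template JSON itself. The sketch below is one possible way to do that in Python and is not part of the original workflow: the function name with_start_url, the optional new_id parameter, and the example.com URL are all placeholders chosen for illustration, and TEMPLATE is shortened (its "selectors" list is left empty here, whereas in practice you would paste the full template).

```python
import json
from typing import Optional

# Shortened stand-in for the template: the real "selectors" list is omitted.
TEMPLATE = (
    '{"_id":"dedao-dianzishu1",'
    '"startUrl":["https://www.igetget.com/list/%E5%BF%83%E7%90%86%E5%AD%A6/YEZtmS0fghz"],'
    '"selectors":[]}'
)


def with_start_url(sitemap_json: str, new_url: str,
                   new_id: Optional[str] = None) -> str:
    """Return a copy of the sitemap JSON with its start URL replaced.

    If new_id is given, the sitemap "_id" is replaced as well, so the
    edited copy does not share a name with the original template.
    """
    sitemap = json.loads(sitemap_json)
    sitemap["startUrl"] = [new_url]
    if new_id is not None:
        sitemap["_id"] = new_id
    # Compact separators keep the output as a single line for pasting.
    return json.dumps(sitemap, ensure_ascii=False, separators=(",", ":"))


# Placeholder URL and id: substitute the category page you actually want.
edited = with_start_url(TEMPLATE, "https://example.com/another-category",
                        new_id="dedao-dianzishu2")
print(edited)
```

The printed JSON can then be pasted in step (2) in place of the original template, which makes step (3) unnecessary for that copy.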