Project directory:
Step 1: Create the project
scrapy startproject pearVideo
Step 2: Generate the spider
cd pearVideo
scrapy genspider pearVideos "www.pearvideo.com/category_6"
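Note that the argument passed to genspider here contains a path (/category_6), whereas Scrapy's allowed_domains setting is meant to hold bare domain names. How the generated file treats that argument may depend on the Scrapy version, so it is worth checking the result by hand. As a minimal sketch using only the standard library (the variable names are illustrative, not part of the project), the host and the path of the listing URL can be separated like this:

```python
from urllib.parse import urlsplit

# The listing page used in this project.
start_url = "https://www.pearvideo.com/category_6"

parts = urlsplit(start_url)

# The host alone is what belongs in allowed_domains.
allowed_domain = parts.netloc
# The path stays part of the start URL only.
path = parts.path

print(allowed_domain)
print(path)
```

The host part goes into allowed_domains and the full URL into start_urls.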
pearVideos.py file
import scrapy
from pearVideo.items import PearvideoItem


class PearvideosSpider(scrapy.Spider):
    name = 'pearVideos'
    allowed_domains = ['www.pearvideo.com']
    start_urls = ['https://www.pearvideo.com/category_6']

    def parse(self, response):
        # Each <li> under the list container is one video entry.
        li_list = response.xpath('//ul[@id="listvideoListUl"]/li')
        for li in li_list:
            item = PearvideoItem()
            # XPaths starting with "." are evaluated relative to the current <li>.
            item['title'] = li.xpath('./div/a//div[@class="vervideo-title"]/text()').get()
            item['author'] = li.xpath('.//a[@class="column"]/text()').get()
            item['nums'] = li.xpath('.//span[@class="fav"]/text()').get()
            print(item)
            yield item
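The key idea in parse is that each field is looked up relative to one li element, so the values of different videos cannot get mixed up. The following is a rough sketch of that pattern, not the spider itself: it swaps Scrapy's selectors for the standard library's xml.etree.ElementTree (which supports only a subset of XPath and has no text() step), and the HTML fragment, the first_text helper, and the sample values are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment of the list page (not the real markup).
SAMPLE = """
<html><body>
<ul id="listvideoListUl">
  <li>
    <div><a href="video_1"><div class="vervideo-title">First clip</div></a></div>
    <a class="column">Author A</a>
    <span class="fav">12</span>
  </li>
  <li>
    <div><a href="video_2"><div class="vervideo-title">Second clip</div></a></div>
    <a class="column">Author B</a>
    <span class="fav">345</span>
  </li>
</ul>
</body></html>
""".strip()


def first_text(node, path):
    """Return the text of the first match, or None if nothing matches."""
    found = node.find(path)
    return found.text if found is not None else None


root = ET.fromstring(SAMPLE)
rows = []
for li in root.findall(".//ul[@id='listvideoListUl']/li"):
    # Every path starts with "." so it is scoped to the current <li>.
    rows.append({
        "title": first_text(li, "./div/a//div[@class='vervideo-title']"),
        "author": first_text(li, ".//a[@class='column']"),
        "nums": first_text(li, ".//span[@class='fav']"),
    })

for row in rows:
    print(row)
```

Like .get() in the spider, first_text returns None when an element is missing, so one incomplete entry does not stop the loop.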
items.py file
import scrapy


class PearvideoItem(scrapy.Item):
    title = scrapy.Field()   # video title
    author = scrapy.Field()  # uploader / column name
    nums = scrapy.Field()    # text of the "fav" counter
run.py: script that runs the project
from scrapy import cmdline
# Run the project
# Uncomment the next line instead to also export the items to pearvideo.csv:
#cmdline.execute('scrapy crawl pearVideos -o pearvideo.csv'.split())
cmdline.execute('scrapy crawl pearVideos'.split())
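cmdline.execute takes an argv-style list rather than a single string, which is why the command is passed through .split(). A small sketch of what that list looks like, using only the standard library; the quoted file name in the last line is a hypothetical example to show where a plain .split() would no longer be enough:

```python
import shlex

# Plain whitespace splitting is enough for the two commands used in run.py.
plain = 'scrapy crawl pearVideos'.split()
with_export = 'scrapy crawl pearVideos -o pearvideo.csv'.split()

# Hypothetical case: an output name containing a space needs shell-style
# splitting, otherwise the file name would be broken into two arguments.
quoted = shlex.split('scrapy crawl pearVideos -o "pear video.csv"')

print(plain)
print(with_export)
print(quoted)
```

Passing the list directly, for example cmdline.execute(['scrapy', 'crawl', 'pearVideos']), avoids the splitting step altogether.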