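Syntax rendering problems on this platform hurt the reading experience, so please read this post on the blog instead~
GitPage address of this article
Crawler (Web Scraping)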
1. Quick Start
Scrape the image URLs from the National Geographic website and download the images.
```python
from bs4 import BeautifulSoup
from urllib.request import urlopen
import re
import requests

## Start the request
html = urlopen("http://www.nationalgeographic.com.cn/animals/").read().decode('utf-8')
soup = BeautifulSoup(html, features='lxml')
# Raw string so that \. is passed to the regex engine unchanged
img_links = soup.find_all("img", {"src": re.compile(r'http://image..*?\.jpg')})
for link in img_links:
    print(link['src'])  # picture location
```
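To see which `src` values the filter above keeps, the pattern can be tried on a few strings without touching the network. This is only a sketch: the URLs in `samples` are made up for illustration, and `filter_image_urls` is a hypothetical helper, not part of the original script. It uses `search()`, which as far as I know is also how BeautifulSoup applies a compiled pattern to an attribute value.

```python
import re

# Same pattern as in the snippet above, written as a raw string.
IMG_PATTERN = re.compile(r'http://image..*?\.jpg')


def filter_image_urls(candidates):
    """Keep only the strings in which the pattern is found."""
    return [src for src in candidates if IMG_PATTERN.search(src)]


# Hypothetical src values, invented for this example.
samples = [
    "http://image.example.com/animals/cat.jpg",    # kept
    "http://image.example.com/animals/dog.png",    # dropped: not .jpg
    "/static/logo.jpg",                            # dropped: relative path
    "https://image.example.com/animals/bird.jpg",  # dropped: https, not http
]

for src in filter_image_urls(samples):
    print(src)
```

Note that the pattern is tied to `http://`, so if the site serves its images over `https` the filter would return nothing and the pattern would need adjusting.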

```python
## Add this part to download the images
import os

os.makedirs('img', exist_ok=True)  # create an img folder (same as: mkdir img)

for link in img_links:
    print(link['src'])
    if link['src'][0:4] == 'http':
        url = link['src']
        r = requests.get(url, stream=True)
        image_name = url.split('/')[-1]
        # Write the response to disk in small chunks
        with open('./img/%s' % image_name, 'wb') as f:
            for chunk in r.iter_content(chunk_size=128):
                f.write(chunk)
        print('Saved %s' % image_name)
```
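The file name above comes from `url.split('/')[-1]`, which works for plain URLs but would keep a query string (for example `cat.jpg?w=200`) in the name. A possible alternative, sketched here with the standard library only, parses the URL first. `image_name_from_url` and its `default` argument are hypothetical names introduced for this example, and the URLs are made up.

```python
import posixpath
from urllib.parse import urlsplit


def image_name_from_url(url, default="image.jpg"):
    """Return the last path segment of url, ignoring query and fragment.

    Falls back to `default` when the path ends with a slash or is empty.
    """
    path = urlsplit(url).path        # e.g. "/animals/cat.jpg"
    name = posixpath.basename(path)  # e.g. "cat.jpg"
    return name or default


# Hypothetical URLs, for illustration only.
print(image_name_from_url("http://image.example.com/animals/cat.jpg"))
print(image_name_from_url("http://image.example.com/animals/cat.jpg?w=200"))
print(image_name_from_url("http://image.example.com/animals/"))
```

In the download loop, `image_name = image_name_from_url(url)` could then replace the `split` call.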
Running result:

Practical example:
Tech News Flash
Enjoy~
GitHub: Karobben
Blog: Karobben
BiliBili: 史上最不正經的生物狗

