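I wrote a simple crawler that scrapes the SRC (Security Response Center) list from Anquanke (安全客).
    [Figure 1: Scraping the Anquanke SRC list]
    Code: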

    import re
    import requests

    url = "https://www.anquanke.com/src"

    def src(url):
        # Fetch a page and return its body decoded as UTF-8.
        res = requests.get(url).content.decode('utf-8')
        return res

    # Collect the numeric ID of every SRC entry linked from the list page.
    code = re.findall(r'<a target="_blank" rel="noopener noreferrer" href="/src/(\d+)">', src(url))
    for codes in code:
        # Fetch the detail page of each SRC entry.
        res2 = src("https://www.anquanke.com/src/" + codes)
        # The page title starts with the SRC name, followed by " - 安全客".
        srcname = re.findall(r'<title>(.*) - 安全客', res2)
        # re.DOTALL lets "." cross the line breaks between the two <h2> headings.
        urladdress = re.findall(r'<h2>网址.*href="(.*?)">.*<h2>漏洞提交入口</h2>', res2, flags=re.DOTALL)
        str_srcname = "".join(srcname)
        str_urladdress = "".join(urladdress)
        # Output labels: 名称 = name, 地址 = URL.
        print(r"名称:{0} , 地址:{1} ".format(str_srcname, str_urladdress))

    Result:
    [Figure 2: Scraping the Anquanke SRC list]
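The script above fetches and parses in the same loop, so the regular expressions can only be checked against the live site. One possible variation, sketched below, moves the parsing into two small functions that can be tried on saved HTML. The function names `parse_src_ids` and `parse_src_detail` and both sample strings are made up for illustration; the real Anquanke markup may differ from these samples.

```python
import re

def parse_src_ids(list_html):
    # Return the numeric IDs of all SRC entries linked from the list page.
    return re.findall(
        r'<a target="_blank" rel="noopener noreferrer" href="/src/(\d+)">',
        list_html,
    )

def parse_src_detail(detail_html):
    # Return (name, url) for one detail page; "" stands in for a missing field.
    name = re.search(r'<title>(.*) - 安全客', detail_html)
    link = re.search(
        r'<h2>网址.*href="(.*?)">.*<h2>漏洞提交入口</h2>',
        detail_html,
        flags=re.DOTALL,
    )
    return (name.group(1) if name else "", link.group(1) if link else "")

# Hypothetical sample pages, not copied from the real site.
list_html = (
    '<a target="_blank" rel="noopener noreferrer" href="/src/1">A</a>\n'
    '<a target="_blank" rel="noopener noreferrer" href="/src/23">B</a>'
)
detail_html = (
    '<title>Example SRC - 安全客,安全资讯平台</title>\n'
    '<h2>网址</h2>\n'
    '<p><a href="https://src.example.com">link</a></p>\n'
    '<h2>漏洞提交入口</h2>'
)

print(parse_src_ids(list_html))
print(parse_src_detail(detail_html))
```

With the parsing separated like this, the fetching code only needs to pass each downloaded page into the matching function.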
    What I learned:
    By default, "." in a regular expression does not match newline characters. Passing "flags=re.DOTALL" makes it match them as well.
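The difference is easy to see on a small multi-line string. The sketch below runs the same pattern with and without the flag; the HTML fragment is a made-up stand-in, not the actual page source. The inline form `(?s)` at the start of a pattern is equivalent to passing `re.DOTALL`.

```python
import re

# Hypothetical fragment with line breaks between the two headings.
html = (
    '<h2>网址</h2>\n'
    '<p><a href="https://example.com">site</a></p>\n'
    '<h2>漏洞提交入口</h2>'
)

pattern = r'<h2>网址.*href="(.*?)">.*<h2>漏洞提交入口</h2>'

# Without the flag, ".*" stops at the first newline, so nothing matches.
without_flag = re.findall(pattern, html)

# With re.DOTALL, "." also matches "\n" and the link is captured.
with_flag = re.findall(pattern, html, flags=re.DOTALL)

# The inline flag (?s) switches on the same behaviour inside the pattern.
inline_flag = re.findall('(?s)' + pattern, html)

print(without_flag)
print(with_flag)
print(inline_flag)
```

One thing to keep in mind: the first `.*` is greedy, so if several links sit between the two headings, this pattern captures the last one rather than the first.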