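    '''
    Assignment 1 (2022-3-26)
    Target site: https://www.kugou.com/yy/html/rank.html
    Scraping requirements:
    1) Fetch the source of the chart page
    2) Parse the data with regular expressions to get the name of every song on the page (including the artist) and its page link
    3) Save the data to a CSV file
    '''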

    import csv
    import requests
    import re
    url = 'https://www.kugou.com/yy/html/rank.html'
    # headers = {
    #     'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.51 Safari/537.36',
    #     'cookie': 'kg_mid=c0814cdf9ee4c9df77b0e0d2486cb007; kg_dfid=3Mo1rO4YVm0Q0MK8bX1DCht4; KuGooRandom=66451647516298467; kg_mid_temp=c0814cdf9ee4c9df77b0e0d2486cb007; kg_dfid_collect=d41d8cd98f00b204e9800998ecf8427e; ACK_SERVER_10015=%7B%22list%22%3A%5B%5B%22bjlogin-user.kugou.com%22%5D%5D%7D; ACK_SERVER_10016=%7B%22list%22%3A%5B%5B%22bjreg-user.kugou.com%22%5D%5D%7D; ACK_SERVER_10017=%7B%22list%22%3A%5B%5B%22bjverifycode.service.kugou.com%22%5D%5D%7D; Hm_lvt_aedee6983d4cfc62f509129360d6bb3d=1647516298,1648307473; Hm_lpvt_aedee6983d4cfc62f509129360d6bb3d=1648307548'
    # }
    # Ha, the request works even without headers.
    res = requests.get(url)
    html = res.text
    # Capture the song-list <div>; re.S lets "." match newlines as well.
    result = re.match(r'.*(<div class="pc_temp_songlist pc_rank_songlist_short">.*?</div>).*', html, re.S)
    ul = result.group(1)
    lis = re.findall(r'<li.*?>.*?</li>', ul, re.S)
    # Group 1 is the song page link, group 2 is the "artist - song" title.
    pattern = re.compile(r'<li.*?href="(.*?)".*?title="(.*?)".*?</li>', re.S)
    data = []
    for i in lis:
        r = pattern.match(i)
        l1 = r.group(2).split('-')
        # An artist name may itself contain hyphens (e.g. A-Lin), so rejoin everything before the last part.
        if len(l1) > 2:
            author = '-'.join(l1[0:-1])
        else:
            author = l1[0]
        song = l1[-1]
        link = r.group(1)
        data.append((song, author, link))
    with open("song_list.csv", "w", encoding='utf-8', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(["歌名", "歌手", "链接"])  # write the header row: song title, artist, link
        writer.writerows(data)
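
The title handling in the script can be tried offline on a small fragment. The following is only a sketch, not the page's real markup: `SAMPLE` is a hypothetical fragment shaped to fit the script's `<li>` pattern, and `split_title` and `parse_rows` are illustrative helper names. It assumes titles look like `artist - song` with spaces around the separating hyphen, which is what the trailing space in the artist column of the output suggests. Compared with splitting on a bare `-`, it trims whitespace and decodes HTML entities such as `&amp;`.

```python
import html
import re

# Same <li> pattern as in the script above.
LI_PATTERN = re.compile(r'<li.*?href="(.*?)".*?title="(.*?)".*?</li>', re.S)


def split_title(title):
    """Split an 'artist - song' title into (song, artist)."""
    text = html.unescape(title)  # decode entities, e.g. '&amp;' becomes '&'
    # Split at the LAST ' - ' so hyphens inside the artist name survive.
    artist, sep, song = text.rpartition(' - ')
    if not sep:  # no separator found: treat the whole text as the song
        return text.strip(), ''
    return song.strip(), artist.strip()


def parse_rows(fragment):
    """Return (song, artist, link) tuples for every <li> in an HTML fragment."""
    rows = []
    for link, title in LI_PATTERN.findall(fragment):
        song, artist = split_title(title)
        rows.append((song, artist, link))
    return rows


# Hypothetical fragment (an assumption, not copied from the live page).
SAMPLE = '''
<li><a href="https://www.kugou.com/mixsong/nw721d3.html" title="A-Lin - 天若有情">x</a></li>
<li><a href="https://www.kugou.com/mixsong/hecel90.html" title="Hillsong Young &amp; Free - Wake (Studio Version)">x</a></li>
'''

for row in parse_rows(SAMPLE):
    print(row)
```

Splitting at the last ` - ` keeps `A-Lin` intact without the extra rejoin step, but a song title that itself contains ` - ` would still be split in the wrong place.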

    Contents of the output file:

    歌名,歌手,链接
    追光旅行,井迪儿 ,https://www.kugou.com/mixsong/6h8lsc44.html
    晚风心里吹,阿梨粤 ,https://www.kugou.com/mixsong/6g3al1ce.html
    Bet On Me,Walk Off the EarthD Smoke ,https://www.kugou.com/mixsong/6d786s37.html
    如果我是他,王不醒 ,https://www.kugou.com/mixsong/6g7rqw07.html
    像极了,永彬Ryan.B ,https://www.kugou.com/mixsong/3vn8ode7.html
    调查中,糯米Nomi ,https://www.kugou.com/mixsong/6bwjtz7e.html
    Time to Pretend (伪装时刻),Lazer Boomerang ,https://www.kugou.com/mixsong/3j9z7le8.html
    Lose Control (Explicit),Hedley ,https://www.kugou.com/mixsong/47zgnod6.html
    最美的瞬间,真瑞 ,https://www.kugou.com/mixsong/4hj34t74.html
    就忘了吧,1K ,https://www.kugou.com/mixsong/6dd12c39.html
    带我去找夜生活,告五人 ,https://www.kugou.com/mixsong/41nw8x64.html
    天若有情,A-Lin ,https://www.kugou.com/mixsong/nw721d3.html
    起风了,买辣椒也用券 ,https://www.kugou.com/mixsong/4igk4d9f.html
    Wake (Studio Version),Hillsong Young &amp; Free ,https://www.kugou.com/mixsong/hecel90.html
    一吻天荒 (热血版),阿禹ayy ,https://www.kugou.com/mixsong/6gch262d.html
    Normal No More (Explicit),Tysm ,https://www.kugou.com/mixsong/46zyql8f.html
    美人鱼 (女声版),夏奈 ,https://www.kugou.com/mixsong/4r50lu4b.html
    玫瑰窃贼,柳爽 ,https://www.kugou.com/mixsong/5q84sc85.html
    月光不答,Y-D、闭文思 ,https://www.kugou.com/mixsong/6c73hf60.html
    剑魂 (鱼多余版),鱼多余呀 ,https://www.kugou.com/mixsong/6crp8pa8.html
    阿拉斯加海湾,蓝心羽 ,https://www.kugou.com/mixsong/4p4uihc1.html
    春泥 (女版),旺仔小乔 ,https://www.kugou.com/mixsong/6fchtaa8.html
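
To check the saved file, it can be read back with the standard `csv` module. This is a minimal sketch: `load_songs` is a hypothetical helper name, and it assumes the file has one header row followed by three-column rows, as in the listing above. It trims the whitespace that the listing shows around the artist field.

```python
import csv


def load_songs(path):
    """Read the CSV back as a list of (song, artist, link) tuples, whitespace trimmed."""
    with open(path, encoding='utf-8', newline='') as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if the file has one
        return [tuple(field.strip() for field in row) for row in reader]
```

For example, `load_songs("song_list.csv")` would return one tuple per chart entry. Because `csv.reader` handles quoting, a song title that contains a comma is still read back as a single field.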