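This time we will use Python to download a video from Youku, taking "Detective Conan" as the example. First, let's analyze how Youku requests its video data.
As you can see, Youku serves its videos in "m3u8 + ts" form. Let's try you-get and see whether it can download the video normally: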
you-get https://v.youku.com/v_show/id_XMzk1NjM1MjAw.html\?spm\=a2hcb.12675304.m_7182_c_14738.d_4\&s\=cc003400962411de83b1\&scm\=20140719.rcmd.7182.show_cc003400962411de83b1
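Perfect: the video downloads and plays normally, which suggests that Youku does not encrypt its public videos, so they are easy to download. (I don't know whether member-only videos are encrypted; I'll try that later.)
Here is the m3u8 URL extracted from that page: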
https://valipl.cp31.ott.cibntv.net/657344945574371BC95CD47EE/03000500005C87988D10F87011BA6A594ECA8A-95C1-4693-A925-8DA0F67B40B0.m3u8?ccode=0502&duration=1493&expire=18000&psid=f20c7e3b28c90f13fdc97bd835a4c2bc&ups_client_netip=7435f5f6&ups_ts=1589254536&ups_userid=&utid=G97wFuWSrCYCAXBwJ%2Bq2XrB6&vid=XMzk1NjM1MjAw&vkey=B10fdf26c2aed124113199eff1351e1d4&sm=1&operate_type=1&dre=u37&si=73&eo=0&dst=1&iv=0&s=cc003400962411de83b1&type=flvhdv3&bc=2
The content of the m3u8 file looks like this:
#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.000000,
#EXT-X-PRIVINF:FILESIZE=233684
https://valipl.cp31.ott.cibntv.net/67756D6080932713CFC02204E/03000500005C87988D10F87011BA6A594ECA8A-95C1-4693-A925-8DA0F67B40B0-00001.ts?ccode=0502&duration=1493&expire=18000&psid=f20c7e3b28c90f13fdc97bd835a4c2bc&ups_client_netip=7435f5f6&ups_ts=1589254536&ups_userid=&utid=G97wFuWSrCYCAXBwJ%2Bq2XrB6&vid=XMzk1NjM1MjAw&sm=1&operate_type=1&dre=u37&si=73&eo=0&dst=1&iv=0&s=cc003400962411de83b1&type=flvhdv3&bc=2&vkey=B31b72434dad3d41e0ee875622526236a
#EXTINF:10.000000,
#EXT-X-PRIVINF:FILESIZE=332948
https://valipl.cp31.ott.cibntv.net/67756D6080932713CFC02204E/03000500005C87988D10F87011BA6A594ECA8A-95C1-4693-A925-8DA0F67B40B0-00002.ts?ccode=0502&duration=1493&expire=18000&psid=f20c7e3b28c90f13fdc97bd835a4c2bc&ups_client_netip=7435f5f6&ups_ts=1589254536&ups_userid=&utid=G97wFuWSrCYCAXBwJ%2Bq2XrB6&vid=XMzk1NjM1MjAw&sm=1&operate_type=1&dre=u37&si=73&eo=0&dst=1&iv=0&s=cc003400962411de83b1&type=flvhdv3&bc=2&vkey=Bf6e771a551f95093ab43a74104c9f429
...
#EXTINF:3.000000,
#EXT-X-PRIVINF:FILESIZE=37600
https://valipl.cp31.ott.cibntv.net/67756D6080932713CFC02204E/03000500005C87988D10F87011BA6A594ECA8A-95C1-4693-A925-8DA0F67B40B0-00150.ts?ccode=0502&duration=1493&expire=18000&psid=f20c7e3b28c90f13fdc97bd835a4c2bc&ups_client_netip=7435f5f6&ups_ts=1589254536&ups_userid=&utid=G97wFuWSrCYCAXBwJ%2Bq2XrB6&vid=XMzk1NjM1MjAw&sm=1&operate_type=1&dre=u37&si=73&eo=0&dst=1&iv=0&s=cc003400962411de83b1&type=flvhdv3&bc=2&vkey=B7326e6d22abf323fdabac4758774398c
#EXT-X-ENDLIST
Sample m3u8 file:
03000500005C87988D10F87011BA6A594ECA8A-95C1-4693-A925-8DA0F67B40B0.m3u8
As you can see, it is simply a playlist in which every video segment is a ts file.
Note: to install you-get (I installed it in WSL), use the following command:
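To make the playlist structure concrete, here is a minimal sketch that pairs each #EXTINF duration with the segment URI that follows it. The function name parse_segments and the shortened example.com URLs are hypothetical, chosen only for illustration, and the code handles just the simple playlist shape shown above.

```python
def parse_segments(playlist_text):
    """Return a list of (duration_seconds, uri) pairs from a simple media playlist."""
    lines = [line.strip() for line in playlist_text.splitlines() if line.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("not an M3U8 playlist")

    segments = []
    pending_duration = None
    for line in lines[1:]:
        if line.startswith("#EXTINF:"):
            # "#EXTINF:10.000000," -> 10.0 (text after the comma is an optional title)
            pending_duration = float(line[len("#EXTINF:"):].split(",", 1)[0])
        elif not line.startswith("#"):
            # Any non-tag line is a segment URI; it belongs to the last #EXTINF seen.
            segments.append((pending_duration, line))
            pending_duration = None
    return segments


# Hypothetical, shortened playlist in the same shape as the Youku one.
sample = """#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:10
#EXT-X-MEDIA-SEQUENCE:0
#EXTINF:10.000000,
#EXT-X-PRIVINF:FILESIZE=233684
https://example.com/video-00001.ts
#EXTINF:3.000000,
#EXT-X-PRIVINF:FILESIZE=37600
https://example.com/video-00002.ts
#EXT-X-ENDLIST
"""

segments = parse_segments(sample)
print(len(segments), "segments,", sum(d for d, _ in segments), "seconds in total")
```

Lines starting with "#" are tags (including the #EXT-X-PRIVINF lines), so the only lines left over are the ts segment addresses, in playback order.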
pip3 install you-get
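For detailed you-get usage, see: you-get Chinese documentation
Calling you-get directly from Python
you-get is most commonly used from the command line, but if you want to use it from Python, you can do it like this: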
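# -*- coding: utf-8 -*-
from you_get import common

# Plain URL: the backslash escapes used on the command line are only needed by the shell.
url = 'https://v.youku.com/v_show/id_XMzk1NjM1MjAw.html?spm=a2hcb.12675304.m_7182_c_14738.d_4&s=cc003400962411de83b1&scm=20140719.rcmd.7182.show_cc003400962411de83b1'

common.any_download(
    url=url,
    output_dir=r'C:\Users\quanzaiyu\Desktop\temp',  # directory to save the video in
    merge=True  # merge the downloaded segments into one file
)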
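Another option, which does not depend on you-get's internal module layout, is to run the command-line tool from Python with subprocess. This is only a sketch: build_you_get_command, download_with_cli and the "downloads" directory are hypothetical names, and it assumes the you-get executable is on PATH (-o is you-get's output-directory option).

```python
import shutil
import subprocess


def build_you_get_command(url, output_dir):
    """Build the argument list for: you-get -o <output_dir> <url>."""
    # Passing a list (no shell involved) means the URL needs no backslash escaping.
    return ["you-get", "-o", output_dir, url]


def download_with_cli(url, output_dir):
    """Run you-get as a child process and wait for it to finish."""
    if shutil.which("you-get") is None:
        raise RuntimeError("you-get is not installed or not on PATH")
    # check=True raises CalledProcessError if you-get exits with an error code.
    subprocess.run(build_you_get_command(url, output_dir), check=True)


command = build_you_get_command(
    "https://v.youku.com/v_show/id_XMzk1NjM1MjAw.html", "downloads"
)
print(command)
```

Calling download_with_cli(url, output_dir) would then start the actual download; it is left out of the example so the snippet runs without network access.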
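Implementing it yourself in Python
If you want to understand how m3u8 works, it is best to write the code by hand once. The details vary between sites but the principle stays the same, and what matters is getting the technique down: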
# -*- coding: utf-8 -*-
import os

import requests


def download(url):
    download_path = os.path.join(os.getcwd(), "download")
    os.makedirs(download_path, exist_ok=True)

    response = requests.get(url)  # fetch the content of the M3U8 file
    response.raise_for_status()
    file_lines = response.text.splitlines()  # read the file line by line

    # Check the file header to confirm that this really is an M3U8 file
    if not file_lines or file_lines[0].strip() != "#EXTM3U":
        raise ValueError("Not an M3U8 link")

    # Write every segment, in playlist order, into one output file.
    # The result is concatenated MPEG-TS data saved under an .mp4 name.
    output_file = os.path.join(download_path, "云霄飞车杀人事件.mp4")
    with open(output_file, "wb") as f:
        for line in file_lines:
            line = line.strip()
            if line.startswith("http"):  # segment lines are absolute URLs
                res = requests.get(line)
                res.raise_for_status()
                f.write(res.content)
    print("Download complete")


if __name__ == '__main__':
    url = "https://valipl.cp31.ott.cibntv.net/657344945574371BC95CD47EE/03000500005C87988D10F87011BA6A594ECA8A-95C1-4693-A925-8DA0F67B40B0.m3u8?ccode=0502&duration=1493&expire=18000&psid=f20c7e3b28c90f13fdc97bd835a4c2bc&ups_client_netip=7435f5f6&ups_ts=1589254536&ups_userid=&utid=G97wFuWSrCYCAXBwJ%2Bq2XrB6&vid=XMzk1NjM1MjAw&vkey=B10fdf26c2aed124113199eff1351e1d4&sm=1&operate_type=1&dre=u37&si=73&eo=0&dst=1&iv=0&s=cc003400962411de83b1&type=flvhdv3&bc=2"
    download(url)
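The script above only fetches lines that are absolute URLs, which is enough for this Youku playlist. Other playlists may list their segments as relative paths, which have to be resolved against the playlist's own URL first. A minimal sketch using the standard library's urljoin; the function name resolve_segment_urls and the example.com addresses are hypothetical:

```python
from urllib.parse import urljoin


def resolve_segment_urls(playlist_url, playlist_text):
    """Return absolute URLs for every segment line in the playlist."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and tag/comment lines
        # urljoin leaves absolute URLs untouched and resolves relative ones.
        urls.append(urljoin(playlist_url, line))
    return urls


# Hypothetical playlist mixing a relative and an absolute segment address.
playlist_url = "https://example.com/videos/episode1/index.m3u8?token=abc"
playlist_text = """#EXTM3U
#EXTINF:10.000000,
seg-00001.ts
#EXTINF:10.000000,
https://cdn.example.com/seg-00002.ts?key=1
#EXT-X-ENDLIST
"""

resolved = resolve_segment_urls(playlist_url, playlist_text)
for segment_url in resolved:
    print(segment_url)
```

The resulting list could be fed to the same download loop in place of the raw playlist lines.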
Encrypted m3u8 videos
TODO…