title: life | WeRead (微信读书): adding the overall top 200 to your bookshelf
date: 2020-07-30 23:57:48
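Curing a reading drought by automatically adding the overall top 200 chart (总榜 top200) to my bookshelf. I hope the official app eventually offers this itself.
What is the problem
First, here is what doing it by hand looks like:
- Open the overall top 200 page
- Click a book, enter its page, click "add to bookshelf"
- Repeat the above 200 times (insert meme here)
If a nitpicker says "I've already read plenty of the top 200, I don't need 200 rounds":
- Congratulations, the top 200 gets updated
- You have to page through it to find the ones you haven't read
- As the years go by, you most likely won't remember whether you've read a given book, so you have to click in and check
- Repeat this N times (insert a whole series of memes here)
Clearly, having a human do this is extremely dull. What I want is to read books, and you make me do this?
From 未来世界的幸存者 (Survivors of the Future World): in the fields where computers excel, they leave humans far behind, and those fields will keep growing. The good news is that they can work for me.
Clearly, repetitive operations are easy to handle with a computer (or program, programming, tool, or whatever word feels handy/comfortable to you)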
How I solved it
The conclusion first
- Get the bookid of every book in the overall top 200
- Add them to the bookshelf with an HTTP request
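Getting the bookids of the overall top 200
All steps below assume the Chrome browser
- Open the overall top 200 page: [https://r.qq.com/web/category/all](https://r.qq.com/web/category/all)
- The page is lazy-loaded, so you have to keep scrolling down until all 200 books are loaded
  - You can keep scrolling the mouse wheel
  - Recommended: install the `vimium` extension in Chrome; pressing `G` jumps to the bottom of the page; repeat until everything is loaded
  - Or press F12 to open DevTools, go to the console, and repeatedly run `window.scrollTo(0, 100000)` (use a large value for the second argument so it always reaches the end of the page)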
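Repeating one of the options above by hand is enough, but for the curious, the console variant can be wrapped in a small loop. This is only a sketch and has not been run against the real page: the `li[maxidx]` selector is inferred from the extraction snippet in this post, and the function name, the idle-round limit, and the delay are my own assumptions.

```javascript
// Sketch: scroll to the bottom until `target` items are present,
// or until the item count has stopped growing for `maxIdleRounds` rounds.
// `win` and `countItems` are injected so the loop can be tried outside a browser.
async function scrollUntilLoaded(win, countItems, target = 200, maxIdleRounds = 5, delayMs = 500) {
  let idle = 0;
  let last = countItems();
  while (last < target && idle < maxIdleRounds) {
    win.scrollTo(0, 100000); // same call as the manual console approach
    await new Promise((resolve) => setTimeout(resolve, delayMs)); // give lazy loading time
    const now = countItems();
    idle = now > last ? 0 : idle + 1; // reset the idle counter whenever new items appear
    last = now;
  }
  return last; // number of items present when the loop stopped
}
```

In the DevTools console this would be called as `scrollUntilLoaded(window, () => document.querySelectorAll('li[maxidx]').length)`. The idle limit exists so the loop gives up instead of spinning forever if the page stops loading.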
Run this JS in the console to get the bookid of every book in the overall top 200:
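// Collect the bookid of each chart entry (run in the DevTools console)
const ids = [];
for (let i = 1; i <= 200; i++) {
  const li = document.querySelector("li[maxidx='" + i + "']");
  // Entries that have not been lazy-loaded yet are skipped instead of throwing
  if (li) ids.push('"' + li.getAttribute('bookid') + '"');
}
ids.join(',');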
The result looks like this; copy it for later use:
"695233","822995","22946457","316606","812443","834525","855812","853116","139417","674048","933334","277781","847310","23444142","534288","23549144","913003","139418","837932","843353","827628","932430","932428","812778","675678","23774475","918478","853107","847050","805127","834464","24965201","230107","858626","798755","546339","216212","22355133","182489","935536","923092","832038","814447","834465","23601930","917268","522205","26713355","174682","840995","814146","922455","22910727","23233558","25615385","856239","921568","823426","721367","25131764","314203","853892","852290","164524","316612","695126","855893","757874","25504039","703157","912238","908604","928520","31165402","921090","818969","651366","164525","837618","22791651","854572","804558","854001","23484983","674044","932429","840919","851459","567661","23303684","921759","22806930","23523115","827629","758957","932434","825681","25622039","927231","23350287","925034","3000000059","698397","23640906","30552174","839223","815124","26573446","834083","635948","23485007","854575","26605831","217327","573975","130879","26629836","931848","25071495","809763","650985","854550","22247775","840760","25445341","26087714","649970","914582","25306682","854928","381958","815156","917553","825576","921079","30068012","24129813","861002","728514","23021280","660048","23722718","22791707","910419","857527","26078972","130326","237732","23723812","838779","821609","23642580","22661836","821600","26454163","508199","823947","821606","620610","858485","431292","24240708","821604","30577328","26435429","23056039","3000757926","23921732","920939","32052185","589453","814415","922228","27058200","25848976","814398","933636","29196155","22910596","31617666","28221039","926152","600142","27754180","32052884","26435427","29750244","32052187","821598","674073","908793","23433628","25445395","31485684","813357","859919","26796443","750578","920661","26435421"
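Adding to the bookshelf with an HTTP request
- Click a book to open its home page, where you can see the "add to bookshelf" (加入书架) button
  - The button can also be triggered from the console with `$('.bookInfo_more_addShelf').click()`, which has the same effect as clicking it on the page
- Switch to the Network tab of DevTools; you will see an XHR request named `addToShelf`. Click it to view the request details
  - Right-clicking the request also offers a handy feature, `Replay XHR`, which is useful when debugging an API
- Next, enter the star tool for sending HTTP requests: VS Code + the REST Client extension
Straight to the code: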
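# test.http; the REST Client extension recognizes files with the .http suffix
# Each block starting with ### is one complete HTTP request definition; a `Send Request` link shows up on it, click that to send the request
###
# HTTP method and URL; the method defaults to GET
POST https://weread.qq.com/mp/shelf/addToShelf
# The request headers follow immediately, as key: value pairs, with autocompletion; this endpoint only needs the headers below
Content-Type: application/json
cookie: <cookies expire; when the response says it has expired, open a fresh XHR request and copy its cookie here>
user-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36

# The request body goes here, in JSON, separated from the headers by a blank line; paste the bookids from the previous step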
{"bookIds":["695233","822995","22946457","316606","812443","834525","855812","853116","139417","674048","933334","277781","847310","23444142","534288","23549144","913003","139418","837932","843353","827628","932430","932428","812778","675678","23774475","918478","853107","847050","805127","834464","24965201","230107","858626","798755","546339","216212","22355133","182489","935536","923092","832038","814447","834465","23601930","917268","522205","26713355","174682","840995","814146","922455","22910727","23233558","25615385","856239","921568","823426","721367","25131764","314203","853892","852290","164524","316612","695126","855893","757874","25504039","703157","912238","908604","928520","31165402","921090","818969","651366","164525","837618","22791651","854572","804558","854001","23484983","674044","932429","840919","851459","567661","23303684","921759","22806930","23523115","827629","758957","932434","825681","25622039","927231","23350287","925034","3000000059","698397","23640906","30552174","839223","815124","26573446","834083","635948","23485007","854575","26605831","217327","573975","130879","26629836","931848","25071495","809763","650985","854550","22247775","840760","25445341","26087714","649970","914582","25306682","854928","381958","815156","917553","825576","921079","30068012","24129813","861002","728514","23021280","660048","23722718","22791707","910419","857527","26078972","130326","237732","23723812","838779","821609","23642580","22661836","821600","26454163","508199","823947","821606","620610","858485","431292","24240708","821604","30577328","26435429","23056039","3000757926","23921732","920939","32052185","589453","814415","922228","27058200","25848976","814398","933636","29196155","22910596","31617666","28221039","926152","600142","27754180","32052884","26435427","29750244","32052187","821598","674073","908793","23433628","25445395","31485684","813357","859919","26796443","750578","920661","26435421"]}
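The same request can also be sent from a script. The sketch below turns the console output (`"id","id",...`) into the JSON body and POSTs it. The endpoint and the `bookIds` field come from the request above; the function names, the `fetchImpl` parameter, and the choice to return only the status code are my own assumptions, since the response format is not shown in this post. It assumes an environment with a global `fetch`, such as Node 18+. In a browser, scripts cannot set the `cookie` header themselves; the browser attaches it automatically.

```javascript
// Sketch: build the request body from the quoted, comma-separated id list.
function buildShelfBody(quotedIds) {
  const ids = quotedIds
    .split(',')
    .map((s) => s.trim().replace(/^"|"$/g, '')) // strip the surrounding quotes
    .filter((s) => s.length > 0);
  return JSON.stringify({ bookIds: [...new Set(ids)] }); // drop duplicate ids
}

// Sketch: POST the body to the endpoint seen in the Network tab.
// `fetchImpl` is injectable so the function can be tried without network access.
async function addToShelf(quotedIds, cookie, fetchImpl = fetch) {
  const res = await fetchImpl('https://weread.qq.com/mp/shelf/addToShelf', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', cookie },
    body: buildShelfBody(quotedIds),
  });
  return res.status; // only the HTTP status is returned
}
```

For example, `buildShelfBody('"695233","822995"')` yields a body with the same shape as the one in the `.http` file.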
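Pitfalls I hit
"Pitfall" (坑) is programmer slang, or just a habitual phrase, used in self-mockery for the detours taken while solving a problem
The big directional mistake: naively replacing human actions with a machine
Most of the wasted time came from this directional mistake, and it is an extremely common one: naively making the machine imitate a human's repetitive work
With such a simple-minded solution, the pitfalls I hit are hard to sum up in a few words:
- Goal: click a book, enter its page, click "add to bookshelf", and repeat 200 times
  - This means visiting the overall top 200 page many times, and every visit hits the lazy-loading problem
  - To dodge that, I tried to get each book's URL, i.e. `$("li[maxidx='1'] a").href`
  - But that brings a new problem: new pages have to be loaded over and over
- WeRead also imposes some restrictions: when a page is opened with DevTools open, the page content is not allowed to load, which made debugging in the console rough
Anyway, every attempt made down the wrong road eventually turns into a big slap in the face once the realization hits
Humans have their limits; don't let those limits constrain what the machine can do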
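Over-pursuing automation
The three ways of dealing with lazy loading on the overall top 200 page listed above are not there to show off that I own more than a hammer. They are there because I fell into a common trap: over-pursuing automation. I spent a lot of time trying to load the whole page automatically, trying and failing again and again. Looking back:
The real goal is to get the top 200 onto the bookshelf. Once I have the information for the top 200 books I can move on to the next step; there was no need to get stuck at this stage
If you have read The Phoenix Project (凤凰项目), one scene will have stuck with you: the young CIO (VP of IT Operations) hits a bottleneck and goes to one of the company's investors for help. This veteran takes him to a factory, and looking down from a high platform, the stations where material piles up are plain to see.
We each have plenty of "material" of our own, and our work also passes through many "stations". When we hit a bottleneck, we too should learn to look from high up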
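Closing words
Solving a problem, possibly one that so far only I have solved, feels wonderful. That is also why programmers love to "show the code"
References
- https://github.com/DoooReyn/WxRead-WebAutoReader: this project can also rack up WeRead reading time. More importantly, the existence of such a project strengthens your confidence that the problem can be solved
- The js/jQuery tutorials on Runoob (菜鸟教程): I rarely write js/jQuery, so whenever I forget an API I look it up there. Really handy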
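Code is beautiful!
I mark books on Douban (豆瓣), and when a book there has no cover I like to use the one from WeRead, so the example here is how to get a WeRead cover
- This is the very first version, taken directly by locating the tag in the Elements tab of DevTools; it is painful to look at
document.querySelector("#routerView > div.app_content > div.readerBookInfo > div.readerBookInfo_head > div.wr_bookCover.bookInfo_cover > img").src
- In fact a jQuery-like `$` works on WeRead pages. In the Chrome console, `$` is a shortcut for `document.querySelector` when the page does not define its own, which is why `.src` can be read directly
$('.bookInfo_cover > img').src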
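If the cover lookup is needed more than once, the short selector can be wrapped in a tiny helper. This is a minimal sketch: the function name and the `null` fallback are my own choices, and the selector is the one from this post.

```javascript
// Sketch: return the cover image URL found under `root`, or null when no cover exists.
function getCoverUrl(root) {
  const img = root.querySelector('.bookInfo_cover > img');
  return img ? img.src : null;
}
```

On a book page, `getCoverUrl(document)` in the console would give the same value as the one-liner above, without throwing when the element is missing.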
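New WeRead features
- Uploading books from a computer/local files to the phone is finally supported, in txt/epub formats
- Two short URLs
  - r.qq.com: PC version
  - ink.qq.com: e-ink version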