Example: Fetching Dianping Shop Listings


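Method 1

In the Chrome browser, inspect the page request to obtain the verification information you need (the request headers and cookies).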
[Screenshots: the request as viewed in Chrome]

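Method 2

Use a simple packet capture to obtain the required verification information. The screenshot was taken with the Fiddler4 capture tool.
Note: packet capture clearly does not add much here, because the request goes straight to a page URL rather than to an API endpoint that expects parameters.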
[Screenshot: the request captured in Fiddler4]
Note: there are many ways to copy the captured request verification information into your code. A handy shortcut: https://zhuanlan.zhihu.com/p/56447124
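One manual way to do this is to copy the raw Cookie header value from the browser or from Fiddler and split it into a dict that requests can accept. The following is only a minimal sketch: the helper name parse_cookie_header is hypothetical, and the sample string is a shortened illustration, not a complete working session.

```python
def parse_cookie_header(raw):
    """Turn a raw Cookie header such as 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    cookies = {}
    for part in raw.split(";"):
        part = part.strip()
        # Skip empty fragments and anything that is not a name=value pair.
        if not part or "=" not in part:
            continue
        # Split on the first '=' only, since cookie values may contain '='.
        name, value = part.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


if __name__ == "__main__":
    # Shortened, illustrative header value (hypothetical sample).
    raw = "s_ViewType=10; cy=14; cye=fuzhou; fspop=test"
    print(parse_cookie_header(raw))
```

The resulting dict can be passed as the cookies argument of requests.get, in the same way as the hand-written cookies dict in the code example.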

Dianping code example

import requests
import urllib3
from lxml import etree

# Cookies copied from the browser request; replace them with values from your own session.
cookies = {
    's_ViewType': '10',
    '_lxsdk_cuid': '179ad1b34f7c8-0ef2077a853153-f7f1939-1fa400-179ad1b34f7c8',
    '_lxsdk': '179ad1b34f7c8-0ef2077a853153-f7f1939-1fa400-179ad1b34f7c8',
    '_hc.v': '0b329e3f-dc18-ed51-8bf1-a8cd52e89967.1622106912',
    'ua': 'Vibes',
    'ctu': '5056724c60220d56836e417da179bfba44c1ac0c207dd2a3f5af25fbc965880f',
    'td_cookie': '2312361676',
    'cy': '14',
    'cye': 'fuzhou',
    'dplet': '1e1a7493bd0ecbd1f7ec5b8c6cf2a4a8',
    'dper': '146c52535101e1064cf611627e77de87a136ee2583902a603ec8c7ef790052b0820346eff54949511eb833ac4ebaebdd71a076e91c93439aaad7e41b555be134b5195e113c3fd73a67f790a4694f57a95addd1512b46db91711fe64754deec1a',
    'fspop': 'test',
    'll': '7fd06e815b796be3df069dec7836c3df',
    'Hm_lvt_602b80cf8079ae6591966cc70a3940e7': '1622769563,1623035602,1623130539,1623651956',
    'Hm_lpvt_602b80cf8079ae6591966cc70a3940e7': '1623652129',
    # The '^' characters were cmd-style escapes left over from the copied cURL command; removed here.
    '_lxsdk_s': '17a0932b3d9-2fb-6c7-63f%7C%7C95',
}

# Request headers copied from the browser request.
headers = {
    'Proxy-Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.101 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9',
    'Accept-Language': 'en',
}

if __name__ == '__main__':
    # Even with HTTPS certificate verification skipped (verify=False),
    # urllib3 still emits a security warning, so disable it explicitly.
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
    url = 'http://www.dianping.com/fuzhou/ch10/p2'
    response = requests.get(url, headers=headers, cookies=cookies, verify=False)
    # Parse the HTML with lxml.
    element = etree.HTML(response.text)
    # Extract the shop names with XPath.
    res_list = element.xpath("//div[@class='shop-list J_shop-list shop-all-list']/ul/li/div[2]//div[@class='tit']/a/@title")
    print(len(res_list))  # 15
    print(res_list[0])  # 三生石·福建菜(五四路国际大厦店)
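To see what the XPath expression selects without sending a request, the same path can be tried on a small hand-written fragment. This is only a sketch under stated assumptions: the fragment below is invented for illustration (the "pic" and "txt" class names and the shop names are made up), and it uses the standard library's xml.etree.ElementTree, which supports only a subset of XPath and requires well-formed markup. Real pages still need lxml's more forgiving etree.HTML parser.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed fragment that mimics the nesting the XPath expects.
SAMPLE = """
<div class="shop-list J_shop-list shop-all-list">
  <ul>
    <li>
      <div class="pic"><a title="ignored" href="#"/></div>
      <div class="txt">
        <div class="tit"><a title="Shop A" href="#">Shop A</a></div>
      </div>
    </li>
    <li>
      <div class="pic"/>
      <div class="txt">
        <div class="tit"><a title="Shop B" href="#">Shop B</a></div>
      </div>
    </li>
  </ul>
</div>
"""


def extract_titles(markup):
    """Return the title attribute of each shop link in the fragment."""
    # The root element here is the outer shop-list div, so the path starts at ./ul.
    root = ET.fromstring(markup.strip())
    # Second div under each li, then any nested div with class 'tit', then its link.
    anchors = root.findall("./ul/li/div[2]//div[@class='tit']/a")
    # ElementTree cannot select attributes with /@title, so read them with get().
    return [a.get("title") for a in anchors]


if __name__ == "__main__":
    print(extract_titles(SAMPLE))
```

The link inside the first div of each li is skipped because div[2] restricts the search to the second div, which mirrors the div[2] step in the original expression.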