Parsing the User-Agent with python-user-agents

Requirements analysis

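I have recently been working on a login log feature: after a user logs in successfully, the backend records that login. There are two possible approaches:

  1. The front end obtains the user's phone model and sends it to the backend, which saves it in the database.
  2. Use the User-Agent. The user-agents library parses the (browser/HTTP) user agent string and provides a simple way to identify phones, tablets, and other devices along with their capabilities. Its goal is to reliably detect whether the device is a phone, a tablet, or a PC, and whether it has a touch screen.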

Usage

Several basic properties help identify the visitor, such as the device, the operating system, and the browser.

    from user_agents import parse

    # iPhone's user agent string
    ua_string = 'Mozilla/5.0 (iPhone; CPU iPhone OS 5_1 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9B179 Safari/7534.48.3'
    user_agent = parse(ua_string)  # parse the string into a user agent object

    # Accessing user agent's browser attributes
    user_agent.browser  # returns Browser(family='Mobile Safari', version=(5, 1), version_string='5.1')
    user_agent.browser.family  # returns 'Mobile Safari'
    user_agent.browser.version  # returns (5, 1)
    user_agent.browser.version_string  # returns '5.1'

    # Accessing user agent's operating system properties
    user_agent.os  # returns OperatingSystem(family='iOS', version=(5, 1), version_string='5.1')
    user_agent.os.family  # returns 'iOS'
    user_agent.os.version  # returns (5, 1)
    user_agent.os.version_string  # returns '5.1'

    # Accessing user agent's device properties
    user_agent.device  # returns Device(family='iPhone', brand='Apple', model='iPhone')
    user_agent.device.family  # returns 'iPhone'
    user_agent.device.brand  # returns 'Apple'
    user_agent.device.model  # returns 'iPhone'

    # Viewing a pretty string version
    str(user_agent)  # returns "iPhone / iOS 5.1 / Mobile Safari 5.1"
    # This last one is the most convenient to use
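
For a login log it is usually handier to store a few flat strings than the parsed object itself. The following is a minimal sketch of that idea, not part of the library: `summarize_user_agent` is a hypothetical helper, and the stand-in object only mimics the attribute names shown in the listing above so the example runs without the library installed.

```python
from types import SimpleNamespace


def summarize_user_agent(ua):
    """Flatten a parsed user agent into a dict that is easy to store in a login log."""
    return {
        "browser": f"{ua.browser.family} {ua.browser.version_string}".strip(),
        "os": f"{ua.os.family} {ua.os.version_string}".strip(),
        "device": ua.device.family,
    }


# Stand-in with the same attribute names as the object returned by parse();
# in real code you would pass parse(ua_string) instead.
fake_ua = SimpleNamespace(
    browser=SimpleNamespace(family="Mobile Safari", version_string="5.1"),
    os=SimpleNamespace(family="iOS", version_string="5.1"),
    device=SimpleNamespace(family="iPhone"),
)

print(summarize_user_agent(fake_ua))
# {'browser': 'Mobile Safari 5.1', 'os': 'iOS 5.1', 'device': 'iPhone'}
```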

The following attributes are also supported:

  • is_mobile: whether the device is a mobile phone
  • is_tablet: whether the device is a tablet
  • is_pc: whether the device is a PC
  • is_touch_capable: whether the device has a touch screen
  • is_bot: whether the visitor is a search engine crawler

    from user_agents import parse

    # Let's start from an old, non touch Blackberry device
    ua_string = 'BlackBerry9700/5.0.0.862 Profile/MIDP-2.1 Configuration/CLDC-1.1 VendorID/331 UNTRUSTED/1.0 3gpp-gba'
    user_agent = parse(ua_string)
    user_agent.is_mobile  # returns True
    user_agent.is_tablet  # returns False
    user_agent.is_touch_capable  # returns False
    user_agent.is_pc  # returns False
    user_agent.is_bot  # returns False
    str(user_agent)  # returns "BlackBerry 9700 / BlackBerry OS 5 / BlackBerry 9700"
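
The boolean flags can be collapsed into a single label for the login record. The sketch below is one possible way to do it; `classify_device` and its label names are assumptions for illustration, and the stand-in object simply copies the flag values from the BlackBerry example above.

```python
from types import SimpleNamespace


def classify_device(ua):
    """Return one label for the login log based on the boolean flags."""
    # Bots are checked first in case a crawler also matches another flag.
    if ua.is_bot:
        return "bot"
    if ua.is_tablet:
        return "tablet"
    if ua.is_mobile:
        return "mobile"
    if ua.is_pc:
        return "pc"
    return "unknown"


# Stand-in carrying the flag values shown for the BlackBerry 9700;
# in real code you would pass parse(ua_string) instead.
blackberry = SimpleNamespace(
    is_bot=False, is_tablet=False, is_mobile=True, is_pc=False
)
print(classify_device(blackberry))  # mobile
```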

Mapping dictionary for common device models

    # Maps the device brand/model reported in the User-Agent to a vendor name.
    map_phone = {
        'Apple': 'Apple', 'KIW-AL10': 'Huawei', 'PRA-TL10': 'Huawei', 'BND-AL00': 'Huawei',
        'XiaoMi': 'XiaoMi', 'MIX 2': 'XiaoMi', 'Oppo': 'Oppo', ' Oppo': 'Oppo',
        'Gionee': 'Gionee', 'Samsung': 'Samsung', 'PRA-AL00X': 'Huawei', 'PACM00': 'Oppo',
        'PBET00': 'Oppo', 'R7Plusm': 'Oppo', 'PAAT00': 'Oppo', 'PBAM00': 'Oppo',
        'PADM00': 'Oppo', 'PAFM00': 'Oppo', 'PBEM00': 'Oppo', 'PAAM00': 'Oppo',
        'PBBM00': 'Oppo', 'PACT00': 'Oppo', 'V1809A': 'vivo', 'PBAT00': 'Oppo',
        'PADT00': 'Oppo', 'BND-TL10': 'Huawei', 'PBBT00': 'Oppo', 'PBCM10': 'Oppo',
        'Mi Note 3': 'XiaoMi', 'V1816A': 'vivo', 'V1732T': 'vivo', 'V1813A': 'vivo',
        'V1732A': 'vivo', 'V1818A': 'vivo', 'CAM-TL00': 'Huawei', 'Le X620': 'leshi',
        'M6 Note': 'meizu', 'm3 note': 'meizu', 'M5': 'meizu', 'M1 E ': 'meizu',
        'BLN-AL10': 'Huawei', 'M5 Note': 'meizu', 'PRA-AL00': 'honour', 'LND-AL30': 'honour',
        'NEM-AL10': 'honour', 'BND-AL10': 'honour', 'CAM-AL00': 'honour', 'SCL-TL00': 'honour',
        'LLD-AL30': 'honour', 'BLN-AL20': 'honour', 'AUM-AL20': 'honour', 'JSN-AL00': 'honour',
        'LLD-AL10': 'honour', 'BLN-TL10': 'honour', 'LLD-AL20': 'honour', 'BLN-AL40': 'honour',
        'MYA-AL10': 'honour', 'LLD-AL00': 'honour', 'JSN-AL00a': 'honour', 'JMM-AL10': 'honour',
        'DLI-AL10': 'honour', 'JMM-AL00': 'honour', 'V1809T': 'vivo', 'LND-AL40': 'honour',
        'PLK-AL10': 'honour', 'MX6': 'meizu', 'PLK-TL01H': 'honour', 'S9': 'Samsung',
        'KIW-TL00': 'honour', 'V1813T': 'vivo',
    }
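
A lookup against such a mapping should tolerate models that are not listed yet. Here is a minimal sketch, assuming a hypothetical helper `lookup_brand` and a four-entry excerpt of the mapping so the example stays self-contained:

```python
# A small excerpt of the mapping, repeated here so the sketch runs on its own.
MODEL_TO_BRAND = {
    "KIW-AL10": "Huawei",
    "MIX 2": "XiaoMi",
    "PACM00": "Oppo",
    "V1809A": "vivo",
}


def lookup_brand(model, default="unknown"):
    """Map a device model string to a brand, ignoring surrounding whitespace."""
    if not model:
        return default
    return MODEL_TO_BRAND.get(model.strip(), default)


print(lookup_brand("PACM00"))    # Oppo
print(lookup_brand(" MIX 2 "))   # XiaoMi
print(lookup_brand("SM-G9600"))  # unknown
```

Stripping whitespace before the lookup avoids having to keep near-duplicate keys that differ only by a leading or trailing space.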
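Explanation of common User-Agent fields

  • Mozilla/5.0: the identifier of the Netscape browser. Netscape dominated the early browser market, so many servers were configured to respond only to requests whose browser identified itself as Mozilla. Newer browsers therefore had to add this field to break into the market.
  • Windows NT 6.3: the identifier for Windows 8.1
  • WOW64: a 32-bit application (here, the browser) running on 64-bit Windows
  • AppleWebKit/537.36: the WebKit rendering engine developed by Apple
  • KHTML: the rendering engine of the Konqueror browser on Linux
  • Gecko: the rendering engine developed by Mozilla and used by Firefox
  • like Gecko: indicates that the browser behaves similarly to the Gecko engine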
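
To see how these fields sit inside a real string, a user agent can be split into its product/version tokens and its parenthesised comments. The following is a rough sketch using only the standard library; the regular expressions and the name `split_user_agent` are illustrative assumptions and are far less thorough than what the user-agents library does internally.

```python
import re

# product/version tokens such as "Mozilla/5.0" or "Chrome/33.0.1750.29"
PRODUCT_RE = re.compile(r"([A-Za-z][\w.-]*)/([\w.]+)")
# parenthesised comments such as "(Windows NT 6.3; WOW64)"
COMMENT_RE = re.compile(r"\(([^)]*)\)")


def split_user_agent(ua_string):
    """Return (products, comments) found in a user agent string."""
    products = PRODUCT_RE.findall(ua_string)
    comments = [
        [part.strip() for part in comment.split(";")]
        for comment in COMMENT_RE.findall(ua_string)
    ]
    return products, comments


ua = ("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
      "(KHTML, like Gecko) Chrome/33.0.1750.29 Safari/537.36")
products, comments = split_user_agent(ua)
print(products)
# [('Mozilla', '5.0'), ('AppleWebKit', '537.36'), ('Chrome', '33.0.1750.29'), ('Safari', '537.36')]
print(comments)
# [['Windows NT 6.3', 'WOW64'], ['KHTML, like Gecko']]
```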
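  1. Why does a request contain both Chrome/33.0.1750.29 and Safari/537.36?
    The AppleWebKit rendering engine was developed by Apple, and Google adopted it for Chrome. To get the correct response from servers, Chrome's UA is essentially Safari's UA with a Chrome field added.
    For example:
    • Safari UA: Mozilla/5.0 (platform; encryption type; OS or CPU; language) AppleWebKit/<AppleWebKit version> (KHTML, like Gecko) Safari/<Safari version>
    • Chrome UA: Mozilla/5.0 (platform; encryption type; OS or CPU; language) AppleWebKit/<AppleWebKit version> (KHTML, like Gecko) Chrome/<Chrome version> Safari/<Safari version>
  2. Why does the UA contain several browser identifiers, such as Mozilla/5.0, Chrome/33.0.1750.29, and Safari/537.36, as well as rendering engine identifiers?
    The extra fields are there so that a server checking for a browser it supports will find a matching identifier and respond normally, which improves the user experience.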
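
One practical consequence is that hand-written detection has to test the more specific token first. The sketch below is deliberately naive and only illustrates the ordering problem; `naive_browser_name` is a hypothetical function, and other Chromium-based browsers also carry a Chrome/ token, so real code is better off relying on a parser library.

```python
def naive_browser_name(ua_string):
    """Very rough browser detection, shown only to illustrate check order."""
    # Order matters: Chrome's UA also contains "Safari/", so test Chrome first.
    if "Chrome/" in ua_string:
        return "Chrome"
    if "Safari/" in ua_string:
        return "Safari"
    return "Other"


chrome_ua = ("Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 "
             "(KHTML, like Gecko) Chrome/33.0.1750.29 Safari/537.36")
safari_ua = ("Mozilla/5.0 (iPhone; CPU iPhone OS 5_1 like Mac OS X) "
             "AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 "
             "Mobile/9B179 Safari/7534.48.3")

print(naive_browser_name(chrome_ua))  # Chrome
print(naive_browser_name(safari_ua))  # Safari
```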

Here is some demo code for reference:

    """
    Request utilities
    """
    import json
    import logging

    from django.contrib.auth.models import AbstractBaseUser
    from django.contrib.auth.models import AnonymousUser
    from django.core.cache import cache
    from django.urls.resolvers import ResolverMatch
    from user_agents import parse

    from apps.vadmin.utils.authentication import OpAuthJwtAuthentication

    logger = logging.getLogger(__name__)


    def get_request_user(request, authenticate=True):
        """
        Get the user of the request.
        (1) If the user on the request is not authenticated, authenticate manually once.
        :param request:
        :param authenticate:
        :return:
        """
        user: AbstractBaseUser = getattr(request, 'user', None)
        if user and user.is_authenticated:
            return user
        try:
            user, token = OpAuthJwtAuthentication().authenticate(request)
        except Exception:
            pass
        return user or AnonymousUser()


    def get_request_ip(request):
        """
        Get the IP address of the request.
        :param request:
        :return:
        """
        ip = getattr(request, 'request_ip', None)
        if ip:
            return ip
        ip = request.META.get('REMOTE_ADDR', '')
        if not ip:
            # Fall back to X-Forwarded-For only when REMOTE_ADDR is empty
            x_forwarded_for = request.META.get('HTTP_X_FORWARDED_FOR', '')
            if x_forwarded_for:
                ip = x_forwarded_for.split(',')[-1].strip()
            else:
                ip = 'unknown'
        return ip


    def get_request_data(request):
        """
        Get the request parameters.
        :param request:
        :return:
        """
        request_data = getattr(request, 'request_data', None)
        if request_data:
            return request_data
        data: dict = {**request.GET.dict(), **request.POST.dict()}
        if not data:
            # No query or form parameters: try to read a JSON body
            try:
                body = request.body
                if body:
                    data = json.loads(body)
            except Exception:
                pass
            if not isinstance(data, dict):
                data = {'data': data}
        return data


    def get_request_path(request, *args, **kwargs):
        """
        Get the request path, with argument values replaced by {id}.
        :param request:
        :param args:
        :param kwargs:
        :return:
        """
        request_path = getattr(request, 'request_path', None)
        if request_path:
            return request_path
        values = []
        for arg in args:
            if len(arg) == 0:
                continue
            if isinstance(arg, str):
                values.append(arg)
            elif isinstance(arg, (tuple, set, list)):
                values.extend(arg)
            elif isinstance(arg, dict):
                values.extend(arg.values())
        if len(values) == 0:
            return request.path
        path: str = request.path
        for value in values:
            # f-string so that non-string values (e.g. integer ids) also work
            path = path.replace(f"/{value}", "/{id}")
        return path


    def get_request_canonical_path(request, *args, **kwargs):
        """
        Get the canonical request path, with URL arguments replaced by placeholders.
        :param request:
        :param args:
        :param kwargs:
        :return:
        """
        request_path = getattr(request, 'request_canonical_path', None)
        if request_path:
            return request_path
        path: str = request.path
        resolver_match: ResolverMatch = request.resolver_match
        for value in resolver_match.args:
            path = path.replace(f"/{value}", "/{id}")
        for key, value in resolver_match.kwargs.items():
            if key == 'pk':
                path = path.replace(f"/{value}", "/{id}")
                continue
            path = path.replace(f"/{value}", f"/{{{key}}}")
        return path


    def get_browser(request, *args, **kwargs):
        """
        Get the browser name.
        :param request:
        :param args:
        :param kwargs:
        :return:
        """
        # .get() avoids a KeyError when the client sends no User-Agent header
        ua_string = request.META.get('HTTP_USER_AGENT', '')
        user_agent = parse(ua_string)
        return user_agent.get_browser()


    def get_os(request, *args, **kwargs):
        """
        Get the operating system.
        :param request:
        :param args:
        :param kwargs:
        :return:
        """
        ua_string = request.META.get('HTTP_USER_AGENT', '')
        user_agent = parse(ua_string)
        return user_agent.get_os()


    def get_login_location(request, *args, **kwargs):
        """
        Get the login location for the request IP.
        :param request:
        :param args:
        :param kwargs:
        :return:
        """
        import requests
        import eventlet  # import the eventlet module

        request_ip = get_request_ip(request)
        # Try the cache first
        location = cache.get(request_ip)
        if location:
            return location
        # Otherwise fetch it from the API and cache the result (Redis)
        try:
            eventlet.monkey_patch(thread=False)  # this line is required
            with eventlet.Timeout(2, False):  # set the timeout to 2 seconds
                apiurl = "http://whois.pconline.com.cn/ip.jsp?ip=%s" % request_ip
                r = requests.get(apiurl)
                content = r.content.decode('GBK')
                location = str(content).replace('\r', '').replace('\n', '')[:64]
                cache.set(request_ip, location, 86400)
                return location
        except Exception:
            pass
        return ""


    def get_verbose_name(queryset=None, view=None, model=None):
        """
        Get the verbose_name of a model.
        :param queryset:
        :param view:
        :param model:
        :return:
        """
        try:
            if queryset and hasattr(queryset, 'model'):
                model = queryset.model
            elif view and hasattr(view.get_queryset(), 'model'):
                model = view.get_queryset().model
            elif view and hasattr(view.get_serializer(), 'Meta') and hasattr(view.get_serializer().Meta, 'model'):
                model = view.get_serializer().Meta.model
            if model:
                return getattr(model, '_meta').verbose_name
        except Exception:
            pass
        return ""
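
The placeholder substitution in get_request_canonical_path is easier to follow without Django in the picture. The following standalone sketch mirrors that logic under the assumption that the positional and keyword URL arguments are passed in directly; `canonical_path` is a hypothetical name and not part of the demo module.

```python
def canonical_path(path, args=(), kwargs=None):
    """Replace concrete URL argument values in a path with placeholders."""
    kwargs = kwargs or {}
    # Positional URL arguments all become {id}
    for value in args:
        path = path.replace(f"/{value}", "/{id}")
    # Keyword URL arguments become {<name>}, except pk which becomes {id}
    for key, value in kwargs.items():
        placeholder = "id" if key == "pk" else key
        path = path.replace(f"/{value}", "/{" + placeholder + "}")
    return path


print(canonical_path("/api/users/42/", kwargs={"pk": 42}))
# /api/users/{id}/
print(canonical_path("/api/orgs/acme/users/7/", kwargs={"org": "acme", "pk": 7}))
# /api/orgs/{org}/users/{id}/
```

Because str.replace substitutes every occurrence, a value that also appears as the start of another path segment (for example 4 in /42) would be rewritten too; matching whole segments would be more robust if that matters for the log.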