[Python Crawler] lxml error: ValueError: Unicode strings with encoding declaration are not supported. Please use bytes (zhaojiafu666's blog, CSDN)

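A note up front: please don't ask me for the website. For work reasons I can't share it, and I hope you understand. If you get an error like the one in the title when extracting data with lxml, my fix may be a useful reference. This was the first time I ran into this problem, so I'm writing it down.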

Today I was testing a website and ran into a problem. Fetching the page with requests and reading resp.text returned the data without any issue. The test code:

import requests
from lxml import etree

resp = requests.get(url, headers=headers)
resp_text = resp.text
html = etree.HTML(resp_text)  # the ValueError is raised here

The etree.HTML() call then raised an error, on the line html = etree.HTML(resp_text).

Next I used chardet to detect the encoding of the raw bytes, which it reported as gb2312.
Test code:

import chardet

resp_text = resp.content
ren = chardet.detect(resp_text)  # guess the encoding of the raw bytes
print(ren)

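So I thought the bytes needed to be decoded explicitly, even though printing resp.text directly looked fine. With no better idea, I tried it anyway.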

resp_text = resp.content.decode('gb2312')
html = etree.HTML(resp_text)

The error was still raised on the line html = etree.HTML(resp_text).
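One way to see why the explicit decode changed nothing: decoding bytes produces a str, which is the same kind of value resp.text already was. The sketch below uses a made-up page (raw_bytes is a hypothetical stand-in for resp.content, and the XML declaration at the top is an assumption about what the site returned) and needs no network access.

```python
# Hypothetical stand-in for resp.content: a GB2312-encoded page that opens
# with an XML declaration (an assumption, the real site is not available).
raw_bytes = (
    '<?xml version="1.0" encoding="gb2312"?>'
    "<html><body>你好</body></html>"
).encode("gb2312")

# Decoding explicitly gives a str, just like resp.text does.
decoded = raw_bytes.decode("gb2312")

print(type(raw_bytes).__name__)     # type of the raw response body
print(type(decoded).__name__)       # type after decoding
# The declaration is still at the start of the decoded text.
print(decoded.startswith("<?xml"))
```

Under these assumptions, both resp.text and the manually decoded value hand the parser a str that still begins with the declaration.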

The traceback:

  File "src\lxml\etree.pyx", line 3170, in lxml.etree.HTML
  File "src\lxml\parser.pxi", line 1872, in lxml.etree._parseMemoryDocument
ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
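The message points at the input itself: lxml refuses a str that begins with an XML declaration naming an encoding, such as <?xml version="1.0" encoding="gb2312"?>. The helper below is a rough, standard-library-only approximation of that condition for checking a page by hand. The name has_encoding_declaration and the regular expression are illustrative assumptions, not lxml's own code.

```python
import re

# Hypothetical pattern: an XML declaration at the very start of the text
# that carries an encoding="..." attribute. lxml's internal check may differ.
_XML_DECL_WITH_ENCODING = re.compile(
    r'<\?xml[^>]*\sencoding\s*=\s*["\'][^"\']*["\']',
    re.IGNORECASE,
)

def has_encoding_declaration(text: str) -> bool:
    """Return True if text opens with an XML declaration that names an encoding."""
    return _XML_DECL_WITH_ENCODING.match(text) is not None

# Illustrative inputs, not taken from the real site.
print(has_encoding_declaration('<?xml version="1.0" encoding="gb2312"?><html></html>'))
print(has_encoding_declaration("<html><body>ok</body></html>"))
```

If a check like this returns True for resp.text, that would be consistent with the error appearing on this site but not on pages that lack such a declaration.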

This problem was really wearing me down.

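The final solution:

It took many attempts before this worked. After repeated testing, it turned out that the decoded string has to be encoded again with utf-8 in Python before it is passed in, so that etree.HTML() receives bytes rather than a str.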

resp = requests.get(url, headers=headers)
resp_text = resp.content.decode('gb2312')
html = etree.HTML(resp_text.encode('utf-8'))  # pass bytes, not str

Or:

resp = requests.get(url, headers=headers)
resp_text = resp.text
html = etree.HTML(resp_text.encode('utf-8'))  # pass bytes, not str

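With that, the problem was solved. Passing a string directly had never caused trouble before, so I suspect it has something to do with this site's encoding. The next time I run into this, I'll try this fix first.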
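For anyone who wants to try the fix without a live site, here is a minimal offline sketch. The page_text value is a made-up stand-in for resp.text, and parse_text is a hypothetical helper name. Passing etree.HTMLParser(encoding="utf-8") is an optional addition that is not part of the fix above: it tells the parser how to read the re-encoded bytes instead of leaving it to guess.

```python
from lxml import etree

# Hypothetical stand-in for resp.text: a page that opens with an XML
# declaration naming gb2312 (an assumption about the site in question).
page_text = (
    '<?xml version="1.0" encoding="gb2312"?>\n'
    "<html><head><title>demo</title></head>"
    '<body><p id="msg">你好</p></body></html>'
)

def parse_text(text: str):
    """Hypothetical helper: encode a str to UTF-8 bytes, then parse the bytes."""
    data = text.encode("utf-8")
    # Optional extra: name the encoding so the parser does not have to guess.
    parser = etree.HTMLParser(encoding="utf-8")
    return etree.HTML(data, parser=parser)

tree = parse_text(page_text)
print(tree.xpath("//title/text()"))
print(tree.xpath('//p[@id="msg"]/text()'))
```

Another option worth testing, if the raw bytes are available, is to pass resp.content to etree.HTML() directly, since that is already bytes input.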

One small problem cost me almost two hours. Sigh...

If this helped you, feel free to leave a like.
https://blog.csdn.net/weixin_42081389/article/details/103891908