Capturing full-page screenshots with headless Chrome using python3 and selenium3 on CentOS

https://blog.csdn.net/weixin_44905132/article/details/121990034

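Preface
selenium is an automation framework that drives a browser, but opening a browser window for every run is inefficient. More importantly, on a CentOS 8 server, opening a browser window to simulate user actions is even less practical, especially when the task is to capture screenshots of web pages.
This is where Chrome's headless mode comes in. In headless mode no browser window is opened, yet the browser behaves as if one had been; everything runs without a user interface.
The steps below go from installation and deployment through to execution.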
1. Install Chrome
1.1 Add the Google repo
    vim /etc/yum.repos.d/google.repo
Put the following content into the empty file that opens:
    [google]
    name=Google-x86_64
    baseurl=http://dl.google.com/linux/rpm/stable/x86_64
    enabled=1
    gpgcheck=0
    gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub

    [google-chrome]
    name=google-chrome
    baseurl=http://dl.google.com/linux/chrome/rpm/stable/$basearch
    enabled=1
    gpgcheck=0
    gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
1.2 Install the Chrome browser with yum (prefix the commands with sudo if you are not root)
    yum makecache
    sudo yum install google-chrome-stable -y
2. Install the chromedriver driver
2.1 Check the Chrome version
After a successful installation, check the installed Chrome version:
    [root@locust03 ~]# google-chrome --version
    Google Chrome 96.0.4664.45
    [root@locust03 ~]#
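2.2 Download chromedriver
To drive the Chrome browser, selenium needs the chromedriver driver to be installed. chromedriver can be downloaded from two places:
the official site
the Taobao mirror in China
In practice, the mirror in China is the one generally used.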

Find the latest version that matches your Chrome version and download it.
Because I am installing on a CentOS 8 server, I choose the Linux 64-bit build.
    wget http://npm.taobao.org/mirrors/chromedriver/96.0.4664.45/chromedriver_linux64.zip
    wget http://npm.taobao.org/mirrors/chromedriver/98.0.4758.102/chromedriver_linux64.zip
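Matching the driver to the browser by hand is easy to get wrong, so here is a minimal sketch of deriving the download URL from the output of google-chrome --version. It assumes the mirror carries a chromedriver release with exactly the same version string as the installed Chrome, which held for 96.0.4664.45 above but is not guaranteed for every release. The function names parse_chrome_version and chromedriver_url are hypothetical helpers, not part of any library.

```python
import re

# Mirror base URL taken from the wget commands in this article.
MIRROR_BASE = "http://npm.taobao.org/mirrors/chromedriver"


def parse_chrome_version(version_output):
    """Extract the dotted version from `google-chrome --version` output."""
    match = re.search(r"(\d+(?:\.\d+){3})", version_output)
    if match is None:
        raise ValueError("no version found in: %r" % version_output)
    return match.group(1)


def chromedriver_url(version, platform="linux64"):
    """Build the download URL for a chromedriver version (assumed to exist)."""
    return "%s/%s/chromedriver_%s.zip" % (MIRROR_BASE, version, platform)


if __name__ == "__main__":
    version = parse_chrome_version("Google Chrome 96.0.4664.45")
    print(chromedriver_url(version))
```

If the exact version is missing from the mirror, pick the closest release with the same major version by hand, as described above.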
2.3 Add chromedriver to the $PATH environment variable
    # 1. Go to the /opt directory
    [root@server opt]# cd /opt/
    # 2. Download chromedriver
    [root@server opt]# wget http://npm.taobao.org/mirrors/chromedriver/78.0.3904.105/chromedriver_linux64.zip
    # 3. Unzip the archive
    [root@server opt]# unzip chromedriver_linux64.zip
    # 4. This yields an executable binary
    [root@server opt]# ls -ll chromedriver
    -rwxrwxr-x 1 root root 11610824 Nov 19 02:20 chromedriver
    # 5. Create the driver directory that will hold the binary
    [root@server opt]# mkdir -p /opt/driver/bin
    # 6. Move chromedriver into bin under the driver directory
    [root@server opt]# mv chromedriver /opt/driver/bin/
Configure the environment variable as follows:
    [root@server driver]# vim /etc/profile
    …
    # Added lines
    export DRIVER=/opt/driver
    export PATH=$PATH:$DRIVER/bin
Make the environment variable take effect immediately, then run chromedriver globally to check its version:
    [root@server ~]# source /etc/profile
    [root@server ~]#
    [root@server ~]# chromedriver --version
    ChromeDriver 78.0.3904.105 (60e2d8774a8151efa6a00b1f358371b1e0e07ee2-refs/branch-heads/3904@{#877})
    [root@server ~]#
If chromedriver can be run from any directory, the environment configuration has taken effect.
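The same check can be made from Python before a script tries to start the browser. The following is a minimal sketch using only the standard library; find_driver and driver_version are hypothetical helper names, and the sketch assumes the binary is called chromedriver as in the steps above.

```python
import shutil
import subprocess


def find_driver(name="chromedriver"):
    """Return the absolute path of the driver if it is on PATH, else None."""
    return shutil.which(name)


def driver_version(path):
    """Run `<driver> --version` and return its output as one stripped string."""
    result = subprocess.run([path, "--version"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return result.stdout.strip()


if __name__ == "__main__":
    path = find_driver()
    if path is None:
        print("chromedriver not found on PATH; check /etc/profile")
    else:
        print(path, "->", driver_version(path))
```

Note that a process started before /etc/profile was edited will not see the new PATH until it is restarted or the profile is sourced again.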
3. Install selenium
selenium can be installed with pip in your project's virtual environment:
    pip3 install selenium
    [root@server seleniumex]# pip3 install selenium
    Looking in indexes: http://mirrors.tencentyun.com/pypi/simple
    Collecting selenium
      Downloading http://mirrors.tencentyun.com/pypi/packages/80/d6/4294f0b4bce4de0abf13e17190289f9d0613b0a44e5dd6a7f5ca98459853/selenium-3.141.0-py2.py3-none-any.whl (904kB)
         |████████████████████████████████| 911kB 990kB/s
    Requirement already satisfied: urllib3 in /usr/local/python3/lib/python3.7/site-packages (from selenium) (1.25.6)
    Installing collected packages: selenium
    Successfully installed selenium-3.141.0
    [root@locust03 seleniumex]#
4. Test script
Write a script named test.py as follows:
    from selenium.webdriver import Chrome
    from selenium.webdriver.chrome.options import Options
    import time
    import os

    # Path to the chromedriver binary
    DRIVER_PATH = '/opt/driver/bin/chromedriver'

    if __name__ == "__main__":
        # Configure the browser
        options = Options()
        options.add_argument('--no-sandbox')
        options.add_argument('--headless')  # headless mode
        options.add_argument('--disable-gpu')
        # Start the browser (executable_path is the selenium 3 API)
        driver = Chrome(executable_path=DRIVER_PATH, options=options)
        driver.maximize_window()

        try:
            # Open the page
            url = 'https://www.jianshu.com/u/a94f887f8776'
            driver.get(url)
            time.sleep(1)

            # Set the width and height of the window to capture
            scroll_width = 1600
            scroll_height = 1500
            driver.set_window_size(scroll_width, scroll_height)

            # Save the image under the current directory, named by timestamp
            img_path = os.getcwd()
            img_name = time.strftime('%Y-%m-%d-%H-%M-%S', time.localtime(time.time()))
            img = "%s.png" % os.path.join(img_path, img_name)
            driver.get_screenshot_as_file(img)
        except Exception as e:
            print(e)
        finally:
            # Close the browser even if an error occurred
            driver.quit()
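The script uses a fixed 1600 x 1500 window, so a page taller than that is cut off. A possible refinement, sketched here under assumptions, is to ask the page for its scrollable size and resize the headless window to fit before taking the screenshot. capture_full_page and its min_width/min_height parameters are hypothetical names; the sketch only relies on the WebDriver methods execute_script, set_window_size and get_screenshot_as_file. It assumes the page reports its size through document.body and document.documentElement, and content that loads lazily on scroll may still be missing.

```python
def capture_full_page(driver, img_path, min_width=1600, min_height=1500):
    """Resize the headless window to the page's full size, then screenshot.

    `driver` is a selenium WebDriver that has already loaded the page.
    """
    # Ask the page how large its scrollable content is.
    width = driver.execute_script(
        "return Math.max(document.body.scrollWidth,"
        " document.documentElement.scrollWidth);")
    height = driver.execute_script(
        "return Math.max(document.body.scrollHeight,"
        " document.documentElement.scrollHeight);")
    # Never shrink below the fixed size used in test.py.
    width = max(int(width), min_width)
    height = max(int(height), min_height)
    driver.set_window_size(width, height)
    # Returns True on success, False if the file could not be written.
    return driver.get_screenshot_as_file(img_path)
```

In test.py this would stand in for the set_window_size and get_screenshot_as_file calls, for example capture_full_page(driver, img). Resizing beyond the physical screen works here because headless Chrome has no real display to constrain the window.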
Run it on the server as follows:
    [root@server selenium_ex]# python3 test.py
    [root@server selenium_ex]#
    [root@server selenium_ex]# ls
    2019-11-28-15-06-48.png  test.py
    [root@server selenium_ex]#
Download the image and take a look:

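As you can see, the script can simulate a browser visit and capture a screenshot of the page. The image also shows that every piece of Chinese text is rendered as box symbols. This is because CentOS 8 does not install Chinese fonts by default, so Chrome cannot display Chinese characters correctly.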
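Using Linux: installing Chinese fonts on CentOS 8
Background
During project development, watermark text added to photos was rendered as "口口口口口口". Searching online showed that the Linux server the system was deployed on had no font supporting the Chinese text in the watermark.
Solution
Import Chinese fonts into the server's Linux system.
System environment
CentOS 8
Steps
Log in to the system as the root user
List the Chinese fonts already installed: fc-list :lang=zh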

Create a directory for Chinese fonts and set its permissions
Create the directory:
    mkdir /usr/share/fonts/chinese
Set the permissions:
    chmod -R 777 /usr/share/fonts/chinese
Note: CentOS keeps its fonts under /usr/share/fonts.
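Find the Chinese fonts you need on a Windows system

Copy them into the /usr/share/fonts/chinese directory
On Windows, fonts are stored in C:\Windows\Fonts; font files generally have a .TTF or .TTC extension.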
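Edit the Linux font configuration file
Open the font configuration file: vim /etc/fonts/fonts.conf
Add the Chinese font directory:

    <dir>/usr/share/fonts/chinese</dir>

Refresh the cache: fc-cache
If fc-cache has no effect, reboot the system.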
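To confirm that the fonts were picked up before re-running the screenshot script, the fc-list :lang=zh check can also be scripted. This is a minimal sketch, assuming each line of fc-list output has the form "path: family:style=...". parse_fc_list and chinese_fonts are hypothetical helper names.

```python
import subprocess


def parse_fc_list(output):
    """Return the font file paths from `fc-list` output.

    Assumes each line looks like `<path>: <family>:style=<style>`.
    """
    paths = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # The file path is everything before the first ": ".
        paths.append(line.split(": ", 1)[0])
    return paths


def chinese_fonts():
    """Run `fc-list :lang=zh` and return the matching font files."""
    result = subprocess.run(["fc-list", ":lang=zh"], stdout=subprocess.PIPE,
                            universal_newlines=True)
    return parse_fc_list(result.stdout)


if __name__ == "__main__":
    try:
        fonts = chinese_fonts()
    except FileNotFoundError:
        print("fc-list is not installed")
    else:
        if fonts:
            print("Chinese fonts found:", *fonts, sep="\n  ")
        else:
            print("no Chinese fonts found; screenshots will show boxes")
```

An empty result after copying the fonts usually means the cache has not been refreshed yet or the directory is not listed in fonts.conf.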
--------------------------------
Original article for the screenshot part: https://blog.csdn.net/u012887259/article/details/103306861
Reference for installing Chinese fonts: https://www.cnblogs.com/zuiyue_jing/p/15152491.html
Python installation guide: https://blog.csdn.net/weixin_44905132/article/details/121974629
