Douyin crawler

Code repository: https://gitee.com/duxinn/EartipDouyin.git
Server: 251
Project path: /root/projects/EartipDouyin
Deployment: crontab

    3 6-23/1 * * * flock -xn /tmp/douyin.lock -c 'cd /root/projects/EartipDouyin && /root/.virtualenvs/EartipDouyin-d8HskUpw/bin/python -u run.py >> "logs/running-$(date +"\%Y-\%m-\%d").log" 2>&1 '
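The schedule fields in the crontab entry (minute `3`, hour `6-23/1`) decide when the crawler runs. The sketch below expands such fields into concrete run times. It is a simplified illustration that handles only the `*`, `a-b`, `a-b/step` and comma forms, not a full cron parser; the function names are made up for this example.

```python
from typing import List


def expand_cron_field(field: str, lo: int, hi: int) -> List[int]:
    """Expand one cron field such as '6-23/2', '*' or '1,4' into sorted values."""
    values = set()
    for part in field.split(","):
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = lo, hi
        elif "-" in rng:
            first, last = rng.split("-")
            start, end = int(first), int(last)
        else:
            start = end = int(rng)
        values.update(range(start, end + 1, step))
    return sorted(values)


def daily_run_times(minute_field: str, hour_field: str) -> List[str]:
    """Return the HH:MM times at which a job with these two fields runs each day."""
    minutes = expand_cron_field(minute_field, 0, 59)
    hours = expand_cron_field(hour_field, 0, 23)
    return [f"{hour:02d}:{minute:02d}" for hour in hours for minute in minutes]


if __name__ == "__main__":
    # The Douyin entry: minute 3 of every hour from 06:00 to 23:00.
    print(daily_run_times("3", "6-23/1"))
```

The same helper can be used for the other entries in this document, for example `daily_run_times("4", "6-23/2")` for the Sogou crawler.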

Environment management: pipenv
Dependency file: requirements.txt

Deployment steps:

    cd /root/projects
    git clone https://gitee.com/duxinn/EartipDouyin.git
    cd EartipDouyin
    pipenv --python 3.7.4   # create the pipenv environment
    pipenv shell            # enter the pipenv environment
    # install the dependencies
    pip install -r requirements.txt
    python run.py           # trial run
    # show the interpreter path
    pipenv --py
    # prints /root/.virtualenvs/EartipDouyin-d8HskUpw/bin/python
    # use this path in the crontab entry
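Once the interpreter path is known, the crontab entry can be assembled from its parts. The sketch below builds a line in the same shape as the Douyin entry; `build_cron_line` and its parameters are hypothetical names for illustration. Note that `%` has a special meaning in crontab command fields, which is why the date format is written as `\%Y-\%m-\%d`.

```python
def build_cron_line(schedule, lock_file, project_dir, python_bin, script,
                    log_prefix="running"):
    """Assemble a crontab line that runs `script` under flock with a dated log file."""
    # Backslashes keep cron from treating '%' as a line break in the command.
    log = 'logs/' + log_prefix + '-$(date +"\\%Y-\\%m-\\%d").log'
    command = f'cd {project_dir} && {python_bin} -u {script} >> "{log}" 2>&1'
    return f"{schedule} flock -xn {lock_file} -c '{command}'"


if __name__ == "__main__":
    print(build_cron_line(
        schedule="3 6-23/1 * * *",
        lock_file="/tmp/douyin.lock",
        project_dir="/root/projects/EartipDouyin",
        python_bin="/root/.virtualenvs/EartipDouyin-d8HskUpw/bin/python",
        script="run.py",
    ))
```

The printed line can be pasted into `crontab -e`.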

Sogou crawler

Code repository: https://gitee.com/duxinn/EartipSougou.git
Server: 251
Project path: /root/projects/EartipSougou

Deployment: crontab

    4 6-23/2 * * * flock -xn /tmp/sougou.lock -c 'cd /root/projects/EartipSougou && /root/.virtualenvs/EartipSougou-MYjfpg56/bin/python -u run_api.py >> "logs/running-$(date +"\%Y-\%m-\%d").log" 2>&1 '

Environment management: pipenv
Dependency file: requirements.txt

Deployment steps: same as Douyin

Bilibili

Code repository: https://gitee.com/duxinn/EartipBilibili.git
Server: 192.168.1.252
Project path: /root/projects/EartipBilibili
Deployment: crontab

    3 6-23/3 * * * flock -x /tmp/bibi.lock -c ' cd /root/projects/EartipBilibili && /root/.local/share/virtualenvs/EartipBilibili-9R72gDmP/bin/python -u run_selenium.py >> "logs/running-$(date +"\%Y-\%m-\%d").log" 2>&1 '
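The crontab entries in this document wrap each job in `flock` so that two runs of the same crawler never overlap. With `-n` (as in the Douyin and Sogou entries) a second run gives up immediately if the lock is held; without `-n` (as in the entry above) it waits for the lock. The sketch below shows the non-blocking variant from Python using the standard `fcntl` module on Linux; `try_exclusive_lock` is a hypothetical helper, not part of any of the projects.

```python
import fcntl


def try_exclusive_lock(path):
    """Return an open file holding an exclusive lock on `path`, or None if it is taken.

    Mirrors `flock -xn`: LOCK_EX asks for an exclusive lock, LOCK_NB makes the
    call fail instead of waiting when another holder exists.
    """
    handle = open(path, "w")
    try:
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


if __name__ == "__main__":
    lock = try_exclusive_lock("/tmp/example.lock")  # hypothetical lock file
    if lock is None:
        print("another run is still in progress, exiting")
    else:
        print("lock acquired; it is released when the file is closed")
        lock.close()
```

Dropping `fcntl.LOCK_NB` gives the blocking behaviour of plain `flock -x`, where a late run queues behind the current one.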

Environment management: pipenv
Dependency file: requirements.txt

Deployment steps: same as Douyin

Tieba

Code repository: https://gitee.com/duxinn/EartipTieba.git
Server: 192.168.1.252
Project path: /root/projects/EartipTieba
Deployment: crontab

    # post search
    5 6-23/3 * * * flock -x /tmp/tie.lock -c ' cd /root/projects/EartipTieba && /root/.local/share/virtualenvs/EartipTieba-A5puD_2X/bin/python -u run_selenium_tie.py >> "logs/running_tie-$(date +"\%Y-\%m-\%d").log" 2>&1 '
    # forum (ba) search
    7 6-23/6 * * * flock -x /tmp/ba.lock -c ' cd /root/projects/EartipTieba && /root/.local/share/virtualenvs/EartipTieba-A5puD_2X/bin/python -u run_selenium_ba.py >> "logs/running_ba-$(date +"\%Y-\%m-\%d").log" 2>&1 '
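Each job appends to a log file named after the current date, so checking a run means finding the right file first. The sketch below reproduces the naming scheme used in the entries above (`logs/<prefix>-YYYY-MM-DD.log`) and reads the last lines of a file; `log_path` and `tail` are hypothetical helpers for illustration.

```python
from collections import deque
from datetime import date


def log_path(prefix, day):
    """Return the log file path that the cron job writes to on `day`."""
    return f"logs/{prefix}-{day:%Y-%m-%d}.log"


def tail(path, lines=20):
    """Return the last `lines` lines of a text file without loading it whole."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in deque(handle, maxlen=lines)]


if __name__ == "__main__":
    # Run from the project directory, e.g. /root/projects/EartipTieba.
    for prefix in ("running_tie", "running_ba"):
        print(log_path(prefix, date.today()))
```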

Environment management: pipenv
Dependency file: requirements.txt

Deployment steps: same as Douyin

Tieba rotation-recognition service: rotnet

Code repository: https://gitee.com/duxinn/rotnet.git
Server: 192.168.1.252
Project path: /root/rotnet (inside the tieba container)
Deployment: docker + conda
Environment management: docker + conda

Deployment steps:
1. Create the container

    # pull the manjaro image; note the 20210725 tag
    docker pull docker.io/manjarolinux/base:20210725
    # list the images to find the manjaro image
    docker image ls
    # REPOSITORY                    TAG       IMAGE ID      CREATED       SIZE
    # docker.io/manjarolinux/base   20210725  8fddaa2b2126  2 months ago  1.17 GB
    # create the container
    docker run -idt --privileged=true --restart=always --name tieba -p 9720:9720 <manjaro image ID>
    # enter the container
    docker exec -it -u root tieba /bin/bash
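The `docker run` step needs the image ID shown by `docker image ls`. As a small illustration, the sketch below extracts that ID from the listing text; `find_image_id` is a hypothetical helper and the parsing assumes the default column layout shown above.

```python
def find_image_id(listing, repository, tag):
    """Return the IMAGE ID for repository:tag from `docker image ls` output, or None."""
    for line in listing.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[0] == repository and fields[1] == tag:
            return fields[2]
    return None


if __name__ == "__main__":
    sample = (
        "REPOSITORY TAG IMAGE ID CREATED SIZE\n"
        "docker.io/manjarolinux/base 20210725 8fddaa2b2126 2 months ago 1.17 GB\n"
    )
    print(find_image_id(sample, "docker.io/manjarolinux/base", "20210725"))
```

Passing `docker.io/manjarolinux/base:20210725` to `docker run` directly, as the CRM section of this document does, avoids the lookup altogether.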

2. Set up the environment inside the container

    # choose mirrors
    sudo pacman-mirrors -i -c China -m rank
    # refresh the package databases
    sudo pacman -Syy
    # install the yay package manager
    pacman -S yay
    # add the user name1
    useradd -d /home/name1 -m name1
    chown name1 -R /home/name1
    # set the password (name1 is used here)
    passwd name1
    # make name1 a sudoer
    chmod +w /etc/sudoers
    vim /etc/sudoers
    # below the line:  root ALL=(ALL) ALL
    # add the line:    name1 ALL=(ALL) ALL
    chmod -w /etc/sudoers
    # switch to name1
    sudo su - name1
    # install conda (prompts for the password)
    yay -S anaconda
    # configure conda
    ln -s /opt/anaconda/bin/conda /usr/bin/conda
    conda config --set auto_activate_base false
    conda config --set show_channel_urls yes

On CentOS 8

    # download the Anaconda installer script, then run it
    wget -P /tmp https://repo.anaconda.com/archive/Anaconda3-2021.11-Linux-x86_64.sh
    bash /tmp/Anaconda3-2021.11-Linux-x86_64.sh

References:
https://zhuanlan.zhihu.com/p/64930395
https://www.myfreax.com/how-to-install-anaconda-on-centos-8/

3. Deploy the project environment

    # clone the code under /root
    cd /root
    git clone https://gitee.com/duxinn/rotnet.git
    cd rotnet
    # create a conda virtual environment named tieba
    conda create -n tieba python=3.7.4
    # confirm that the tieba environment exists
    conda info -e
    # activate the environment
    conda activate tieba
    # install tensorflow as described in readme.txt
    conda install tensorflow==2.0.0
    # install the remaining dependencies from requirements.txt with pip
    pip install -r requirements.txt -i https://mirrors.aliyun.com/pypi/simple
    # run the service
    python run.py

4. Test

    curl localhost:9722/index
    # {"message":"Hello World"}
    # this response means the service is up
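The same check can be scripted, for example for use in a monitoring job. The sketch below uses only the standard library and assumes the URL and response body shown in the test step; `is_hello_world` and `check_service` are hypothetical names.

```python
import json
from urllib.request import urlopen


def is_hello_world(body):
    """Return True if `body` is the JSON document {"message": "Hello World"}."""
    try:
        return json.loads(body) == {"message": "Hello World"}
    except ValueError:  # not valid JSON
        return False


def check_service(url="http://localhost:9722/index", timeout=5.0):
    """Return True if the service answers with the expected greeting."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return is_hello_world(response.read().decode("utf-8"))
    except OSError:  # connection refused, timeout, HTTP error, ...
        return False


if __name__ == "__main__":
    print("service is up" if check_service() else "service is NOT responding")
```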

Errors and fixes

    Traceback (most recent call last):
      File "run.py", line 8, in <module>
        from utils import RotNetDataGenerator, angle_error
      File "/root/rotnet/utils/__init__.py", line 1, in <module>
        from .rotnet_utils import *
      File "/root/rotnet/utils/rotnet_utils.py", line 4, in <module>
        import cv2
      File "/opt/anaconda/envs/tieba/lib/python3.7/site-packages/cv2/__init__.py", line 3, in <module>
        from .cv2 import *
    ImportError: libSM.so.6: cannot open shared object file: No such file or directory

    # fix on manjaro
    pacman -S libsm libxext libxrender
    # fix on CentOS
    yum install libSM libXext libXrender
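The ImportError above is raised because a shared library that `cv2` links against is not installed. A quick way to see which libraries are still missing, before or after installing the packages, is to try loading them directly. In the sketch below only `libSM.so.6` comes from the traceback; the other two file names are assumptions based on the packages being installed, and `missing_libraries` is a hypothetical helper.

```python
import ctypes


def missing_libraries(sonames=("libSM.so.6", "libXext.so.6", "libXrender.so.1"),
                      loader=ctypes.CDLL):
    """Return the shared libraries from `sonames` that cannot be loaded."""
    missing = []
    for soname in sonames:
        try:
            loader(soname)
        except OSError:  # same failure class that surfaces as the ImportError
            missing.append(soname)
    return missing


if __name__ == "__main__":
    absent = missing_libraries()
    print("all libraries found" if not absent else "missing: " + ", ".join(absent))
```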

CRM

Code repository: https://gitee.com/zpy-git/EartipCrmBackend.git
Server: 119.23.211.94
Project path: /root/EartipCrmBackend inside the docker container crm
Deployment: nginx + docker
Environment management: none
Dependency file: requirements.txt

Deployment steps:

    # pull the manjaro image and start a container from it
    docker run -idt --restart=always --name crm -p 9015:9015 docker.io/manjarolinux/base:20210725
    # enter the container
    docker exec -it -u root crm /bin/bash
    # clone the code
    git clone -b master https://gitee.com/zpy-git/EartipCrmBackend.git
    cd EartipCrmBackend
    # install the dependencies
    pip install -r requirements.txt
    # run the server
    python manage.py runserver 0:9015

nginx configuration file:
/usr/local/nginx/conf/nginx.conf

    server {
        listen 80;
        listen 443 ssl;
        server_name crm.eartip.cn;
        ssl_certificate /usr/local/nginx/cert/crm/6058513_crm.eartip.cn.pem;
        ssl_certificate_key /usr/local/nginx/cert/crm/6058513_crm.eartip.cn.key;
        ssl_session_cache shared:SSL:1m;
        ssl_session_timeout 5m;
        fastcgi_param HTTPS on;
        fastcgi_param HTTP_SCHEME https;
        access_log /usr/local/nginx/logs/httpsaccess.log;
        root /root/project/EartipCrmBackend;
        location / {
            proxy_pass http://127.0.0.1:9015;
            proxy_set_header Host $host;
            proxy_set_header X-real-ip $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
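Because nginx proxies every request, the backend sees 127.0.0.1 as the peer address, and the real client address is only available through the `X-real-ip` and `X-Forwarded-For` headers set above. The sketch below shows how a backend could read them, assuming a Django application (which `manage.py runserver` suggests), where request headers appear in `request.META` with an `HTTP_` prefix. `client_ip` is a hypothetical helper and takes a plain dictionary so that it runs without Django installed.

```python
def client_ip(meta):
    """Return the original client address from a Django-style META mapping."""
    real_ip = meta.get("HTTP_X_REAL_IP")  # set from $remote_addr by nginx
    if real_ip:
        return real_ip.strip()
    forwarded = meta.get("HTTP_X_FORWARDED_FOR", "")
    if forwarded:
        # The header is a comma-separated chain; the first entry is the client.
        return forwarded.split(",")[0].strip()
    return meta.get("REMOTE_ADDR")  # direct connection, no proxy headers


if __name__ == "__main__":
    example = {
        "REMOTE_ADDR": "127.0.0.1",
        "HTTP_X_REAL_IP": "203.0.113.7",  # documentation-range example address
        "HTTP_X_FORWARDED_FOR": "203.0.113.7",
    }
    print(client_ip(example))
```

In a Django view this would be called as `client_ip(request.META)`. These headers should only be trusted when the backend is reachable solely through nginx.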

FoodCrm

Code repository: https://gitee.com/duxinn/FoodCrm.git
Server: 119.23.211.94
Project path: /root/project/FoodCrm
Deployment: nginx + pipenv
Environment management: none
Dependency file: requirements.txt

Deployment steps:

    # clone the code
    git clone -b master https://gitee.com/duxinn/FoodCrm.git
    cd FoodCrm
    pipenv --python 3.7
    pipenv shell
    # install the dependencies
    pip install django
    pip install mysqlclient
    pip install cacheout
    pip install django-cors-headers
    # create the database (run this statement in the MySQL client)
    CREATE DATABASE `food_crmdb` DEFAULT CHARACTER SET utf8;
    # run the server
    python manage.py runserver 0:9020
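For the Django application to use the database created above, its settings need a matching `DATABASES` entry. The fragment below is a minimal sketch of what that could look like with the `mysqlclient` driver; only the database name and character set come from this document, while the user, password, host and port are placeholders that must be replaced with the real values.

```python
# Sketch of the DATABASES block for settings.py (values marked below are assumptions).
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",  # backend served by mysqlclient
        "NAME": "food_crmdb",                  # database created in the step above
        "USER": "food_crm",                    # hypothetical account name
        "PASSWORD": "change-me",               # placeholder, never commit real secrets
        "HOST": "127.0.0.1",                   # assumption: MySQL on the same server
        "PORT": "3306",                        # MySQL default port
        "OPTIONS": {"charset": "utf8"},        # matches DEFAULT CHARACTER SET utf8
    }
}
```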

nginx configuration:

    server {
        listen 9021;
        fastcgi_param HTTPS on;
        fastcgi_param HTTP_SCHEME https;
        access_log /usr/local/nginx/logs/httpsaccess.log;
        root /root/project/FoodCrm/dist;
        index index.html index.htm;
        location / {
            try_files $uri /index.html;
        }
        location /api/ {
            proxy_pass http://127.0.0.1:9020;
            proxy_set_header Host $host;
            proxy_set_header X-real-ip $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }
    }
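This server block does two things: requests under `/api/` go to the Django process on port 9020, and everything else is served from the `dist` directory, falling back to `index.html` so that front-end routes resolve. The sketch below models that decision in plain Python as an illustration; `route` is a hypothetical function and the file set stands in for the contents of `dist`.

```python
def route(path, static_files):
    """Describe how the server block above would handle a request path."""
    if path.startswith("/api/"):
        # Longest matching prefix location wins; the URI is passed on unchanged.
        return "proxy http://127.0.0.1:9020" + path
    if path in static_files:
        return "file " + path          # try_files found $uri
    return "file /index.html"          # try_files fallback


if __name__ == "__main__":
    dist = {"/index.html", "/js/app.js"}  # hypothetical build output
    for request_path in ("/api/orders/", "/js/app.js", "/orders/42"):
        print(request_path, "->", route(request_path, dist))
```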