Gunicorn prefork flow
How is this implemented in Python? What knowledge does it rely on, and what is the basic idea?
Below is a simple pre-fork program put together after reading the Gunicorn source code.
# -*- coding: utf-8 -*-
# master-slaves.py  python2.7.x
# orangleliu@gmail.com
'''
A simple simulation of the pre-fork model: a master process controls a
group of child (worker) processes. The following signals are handled:
    INT   ctrl+c, shut everything down
    TTIN  add a worker
    TTOU  remove a worker
'''
import errno
import os
import sys
import signal
import time
import random


class Worker(object):
    '''
    A real worker would install its own signal handlers to react to the
    outside world and to the master; this one just idles.
    '''
    def run(self):
        while True:
            time.sleep(3)


class Master(object):
    WORKERS = {}
    SIG_QUEUE = []
    SIGNALS = [getattr(signal, "SIG%s" % x)
               for x in "INT TTIN TTOU".split()]
    SIG_NAMES = dict(
        (getattr(signal, name), name[3:].lower()) for name in dir(signal)
        if name[:3] == "SIG" and name[3] != "_"
    )

    def __init__(self, worker_nums=2):
        self.worker_nums = worker_nums
        self.master_name = "Master"
        self.reexec_pid = 0

    def start(self):
        print "start master"
        self.pid = os.getpid()
        self.init_signals()

    def init_signals(self):
        [signal.signal(s, self.signal) for s in self.SIGNALS]
        signal.signal(signal.SIGCHLD, self.handle_chld)

    def signal(self, sig, frame):
        '''
        For ordinary signals, just push the signal onto the queue; the main
        loop picks it up later.
        '''
        if len(self.SIG_QUEUE) < 5:
            self.SIG_QUEUE.append(sig)

    def run(self):
        self.start()
        try:
            self.manage_workers()
            while True:
                # Without this sleep the master loop busy-waits and sits at
                # nearly 100% CPU. Sleeping keeps the master cheap while it
                # still reacts promptly to signals from the system.
                time.sleep(1)
                sig = self.SIG_QUEUE.pop(0) if len(self.SIG_QUEUE) else None
                if sig is None:
                    self.manage_workers()
                    continue
                if sig not in self.SIG_NAMES:
                    print "unknown signal: %s" % sig
                    continue
                signame = self.SIG_NAMES.get(sig)
                handler = getattr(self, "handle_%s" % signame, None)
                if not handler:
                    print "Unhandled signal: %s" % signame
                    continue
                handler()
        except StopIteration:
            self.halt()
        except KeyboardInterrupt:
            self.halt()
        except SystemExit:
            pass
        except Exception as e:
            print e
            self.stop()
            sys.exit(-1)

    def handle_chld(self, sig, frame):
        '''
        SIGCHLD handler for exiting children, so they do not pile up as
        zombie processes.
        '''
        self.reap_workers()

    def handle_int(self):
        '''
        ctrl+c: shut the master down -- stop the children first, then raise
        to break out of the main loop and exit.
        '''
        self.stop()
        raise StopIteration

    def handle_ttin(self):
        '''
        Add one worker.
        '''
        print "add a worker"
        self.worker_nums += 1
        self.manage_workers()

    def handle_ttou(self):
        '''
        Remove one worker.
        '''
        print "decrease a worker"
        if self.worker_nums <= 1:
            return
        self.worker_nums -= 1
        self.manage_workers()

    def stop(self):
        '''
        Stop the workers; everything is treated as SIGTERM here, followed
        by a SIGKILL for good measure.
        '''
        print 'stop workers'
        sig = signal.SIGTERM
        self.kill_workers(sig)
        self.kill_workers(signal.SIGKILL)

    def halt(self, exit_status=0):
        '''
        Shut the master itself down.
        '''
        print "master exit"
        self.stop()
        sys.exit(exit_status)

    def reap_workers(self):
        '''
        Reap exited children so they do not linger as zombies holding on to
        resources.
        Reference: http://www.cnblogs.com/mickole/p/3187770.html
        '''
        try:
            while True:
                # os.waitpid collects the status of a zombie child and
                # destroys it. -1 means "any child"; os.WNOHANG returns
                # immediately if there is nothing to reap.
                wpid, status = os.waitpid(-1, os.WNOHANG)
                if not wpid:
                    break
                else:
                    exitcode = status >> 8
                    worker = self.WORKERS.pop(wpid, None)
                    if not worker:
                        continue
        except OSError as e:
            # errno.ECHILD means there are no child processes left
            if e.errno != errno.ECHILD:
                raise

    def manage_workers(self):
        '''
        Health check for the workers: keep the actual worker count in line
        with worker_nums.
        '''
        if len(self.WORKERS.keys()) < self.worker_nums:
            self.spawn_workers()
        workers = self.WORKERS.items()
        while len(workers) > self.worker_nums:
            (pid, _) = workers.pop(0)
            self.kill_worker(pid, signal.SIGTERM)

    def spawn_worker(self):
        worker = Worker()
        pid = os.fork()
        # master side of the fork
        if pid != 0:
            self.WORKERS[pid] = worker
            return pid
        # worker side of the fork
        worker_pid = os.getpid()
        try:
            worker.run()
            sys.exit(0)
        except SystemExit:
            raise
        except Exception as e:
            print "worker error %s" % str(e)
            sys.exit(-1)

    def spawn_workers(self):
        for i in range(self.worker_nums - len(self.WORKERS.keys())):
            self.spawn_worker()
            # a short random sleep spreads the fork calls out a little
            time.sleep(0.1 * random.random())

    def kill_workers(self, sig):
        worker_pids = list(self.WORKERS.keys())
        for pid in worker_pids:
            self.kill_worker(pid, sig)

    def kill_worker(self, pid, sig):
        try:
            os.kill(pid, sig)
        except OSError as e:
            print "kill worker error: %s" % str(e)


if __name__ == "__main__":
    Master().run()
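To try the demo (assuming it is saved as master-slaves.py), run it under python2.7 and poke the master from another terminal:

python master-slaves.py    # starts the master and 2 workers
kill -TTIN <master_pid>    # master spawns one more worker
kill -TTOU <master_pid>    # master drops one worker
# ctrl+c (SIGINT) stops the workers and then the master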
Gunicorn worker types
- sync
- gthread
- eventlet
- gevent
- tornado
Based on how they work under the hood, the workers fall into three categories:
- sync: each request is actually handled by its own process
- gthread: each request is actually handled by its own thread
- eventlet/gevent/tornado: asynchronous IO lets a single process move on to the next request while it waits for an IO response
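The worker type is picked with gunicorn's -k option; roughly like this (gdemo.wsgi and the bind address are the ones used in the tests that follow):

gunicorn -w 1 -k sync gdemo.wsgi -b 192.168.37.145:80                 # one request per process
gunicorn -w 1 -k gthread --threads 2 gdemo.wsgi -b 192.168.37.145:80  # threads inside each process
gunicorn -w 1 -k gevent gdemo.wsgi -b 192.168.37.145:80               # async IO inside one process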
Handling requests with processes
How a sync worker performs when running CPU bound and IO bound tasks:
# views.py
from django.shortcuts import render
from django.http import HttpResponse

# Create your views here.
import time


def ioTask(request):
    time.sleep(2)
    return HttpResponse("IO bound task finish!\n")


def cpuTask(request):
    for i in range(10000000):
        n = i * i * i
    return HttpResponse("CPU bound task finish!\n")
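The URL configuration is not shown here; the routing presumably looks something like the sketch below (names guessed to match the /worker/io/ and /worker/cpu/ paths that siege hits), and the sync runs below were presumably started with something like gunicorn -w 1 -k sync gdemo.wsgi -b 192.168.37.145:80.

# urls.py -- assumed routing for the demo app (not shown in the original),
# mapping the two views onto the paths used in the siege tests
from django.conf.urls import url

from . import views

urlpatterns = [
    url(r'^worker/io/$', views.ioTask),
    url(r'^worker/cpu/$', views.cpuTask),
]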
Output
08:31:40 (gunicorn_demo-bLt-GVNF) root@arch gdemo → siege -c 2 -r 1 http://192.168.37.145/worker/io/ -v
** SIEGE 4.0.4
** Preparing 2 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 2.00 secs: 22 bytes ==> GET /worker/io/
# the request below is blocked
HTTP/1.1 200 4.00 secs: 22 bytes ==> GET /worker/io/
Transactions: 2 hits
Availability: 100.00 %
Elapsed time: 4.00 secs
Data transferred: 0.00 MB
Response time: 3.00 secs
Transaction rate: 0.50 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 1.50
Successful transactions: 2
Failed transactions: 0
Longest transaction: 4.00
Shortest transaction: 2.00
08:40:51 root@arch ~ → siege -c 2 -r 1 http://192.168.37.145/worker/cpu/ -v
** SIEGE 4.0.4
** Preparing 2 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 0.97 secs: 23 bytes ==> GET /worker/cpu/
# the request below is blocked
HTTP/1.1 200 2.12 secs: 23 bytes ==> GET /worker/cpu/
Transactions: 2 hits
Availability: 100.00 %
Elapsed time: 2.12 secs
Data transferred: 0.00 MB
Response time: 1.54 secs
Transaction rate: 0.94 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 1.46
Successful transactions: 2
Failed transactions: 0
Longest transaction: 2.12
Shortest transaction: 0.97
The advantage of this type is strong fault isolation: if a process dies, it only affects the request that process was serving at that moment, not any other requests.
The downside is that processes are expensive; running too many workers has a big impact on memory and CPU, so the theoretical concurrency ceiling is very low.
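As a reference point, the gunicorn documentation suggests starting with roughly (2 x CPU cores) + 1 workers rather than trying to match the expected concurrency; a tiny helper along those lines (my own sketch, not from the original):

# Suggested starting worker count from the gunicorn docs: (2 x cores) + 1.
import multiprocessing

def suggested_workers():
    return multiprocessing.cpu_count() * 2 + 1

print(suggested_workers())  # e.g. 9 on a 4-core machine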
Handling requests with threads
When the gunicorn worker type is gthread, the extra --threads option sets how many threads each process may run; the concurrency ceiling is then the number of workers multiplied by the number of threads each worker can run.
Below, gunicorn starts a single worker process (pid 18467 in the log) with 2 threads to handle requests, so in theory it can only serve two requests at a time:
09:05:31 (gunicorn_demo-bLt-GVNF) root@arch gdemo → gunicorn -w 1 -k sync --thread=2 gdemo.wsgi -b 192.168.37.145:80
[2018-06-24 09:05:41 +0800] [18464] [INFO] Starting gunicorn 19.8.1
[2018-06-24 09:05:41 +0800] [18464] [INFO] Listening at: http://192.168.37.145:80 (18464)
[2018-06-24 09:05:41 +0800] [18464] [INFO] Using worker: threads
[2018-06-24 09:05:41 +0800] [18467] [INFO] Booting worker with pid: 18467
[2018-06-24 09:19:59 +0800] [18464] [INFO] Handling signal: winch
[2018-06-24 09:20:05 +0800] [18464] [INFO] Handling signal: winch
Firing 4 requests each at the IO bound task and the CPU bound task with siege, you can clearly see that requests only start to be blocked from the third one onward:
09:22:34 root@arch ~ → siege -c 4 -r 1 http://192.168.37.145/worker/io/ -v
** SIEGE 4.0.4
** Preparing 4 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
# the requests below are blocked
HTTP/1.1 200 4.02 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 4.01 secs: 22 bytes ==> GET /worker/io/
Transactions: 4 hits
Availability: 100.00 %
Elapsed time: 4.02 secs
Data transferred: 0.00 MB
Response time: 3.01 secs
Transaction rate: 1.00 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 3.00
Successful transactions: 4
Failed transactions: 0
Longest transaction: 4.02
Shortest transaction: 2.01
09:23:39 root@arch ~ → siege -c 4 -r 1 http://192.168.37.145/worker/cpu/ -v
** SIEGE 4.0.4
** Preparing 4 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 2.00 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 2.00 secs: 23 bytes ==> GET /worker/cpu/
# the requests below are blocked
HTTP/1.1 200 3.97 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 3.97 secs: 23 bytes ==> GET /worker/cpu/
Transactions: 4 hits
Availability: 100.00 %
Elapsed time: 3.97 secs
Data transferred: 0.00 MB
Response time: 2.99 secs
Transaction rate: 1.01 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 3.01
Successful transactions: 4
Failed transactions: 0
Longest transaction: 3.97
Shortest transaction: 2.00
The advantage of this worker type is that the theoretical concurrency ceiling is higher than with processes. The downside is still the thread count: an OS can only hold so many threads, and too many of them still burden the system.
Handling requests with asynchronous IO
When the gunicorn worker type is eventlet, gevent, tornado and the like, every request is handled by the same process; when a request hits IO, the process does not wait for the IO to respond but moves on to the next request until that IO completes, so in theory there is no concurrency ceiling.
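Before looking at the numbers, here is a rough sketch of the mechanism, using gevent directly rather than the demo project (my own illustration, not part of the original setup); gevent.sleep stands in for a blocking IO wait and yields to the other greenlets:

# Ten 2-second "IO waits" run concurrently inside one process, because
# gevent.sleep yields control to the other greenlets while waiting.
import time
import gevent

def io_task(n):
    gevent.sleep(2)  # cooperative wait, standing in for a real IO call
    print("io task %d done" % n)

start = time.time()
gevent.joinall([gevent.spawn(io_task, i) for i in range(10)])
print("10 io tasks finished in %.2fs" % (time.time() - start))  # ~2s, not ~20s

Swap gevent.sleep for a pure CPU loop and the greenlets stop yielding, which is exactly the degradation the CPU bound siege run shows further down.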
Take gevent as an example: gunicorn starts a single worker process (pid 36304 in the log) to handle requests:
09:44:31 (gunicorn_demo-bLt-GVNF) root@arch gdemo → gunicorn -w 1 -k gevent gdemo.wsgi -b 192.168.37.145:80
[2018-06-24 09:47:35 +0800] [36301] [INFO] Starting gunicorn 19.8.1
[2018-06-24 09:47:35 +0800] [36301] [INFO] Listening at: http://192.168.37.145:80 (36301)
[2018-06-24 09:47:35 +0800] [36301] [INFO] Using worker: gevent
[2018-06-24 09:47:35 +0800] [36304] [INFO] Booting worker with pid: 36304
Firing 10 requests at the IO bound task with siege, you can clearly see that no request is blocked at all:
10:06:55 root@arch ~ → siege -c 10 -r 1 http://192.168.37.145/worker/io/ -v
** SIEGE 4.0.4
** Preparing 10 concurrent users for battle.
The server is now under siege...
# clearly, no request is blocked
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.00 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.00 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.00 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
HTTP/1.1 200 2.01 secs: 22 bytes ==> GET /worker/io/
Transactions: 10 hits
Availability: 100.00 %
Elapsed time: 2.02 secs
Data transferred: 0.00 MB
Response time: 2.01 secs
Transaction rate: 4.95 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 9.94
Successful transactions: 10
Failed transactions: 0
Longest transaction: 2.01
Shortest transaction: 2.00
But when faced with CPU bound requests, it degenerates to the same behavior as handling requests with processes: the concurrency ceiling is the number of workers. Firing 10 requests at the CPU bound task with siege below, you can see that requests are blocked from the second one onward:
10:07:38 root@arch ~ → siege -c 10 -r 1 http://192.168.37.145/worker/cpu/ -v
** SIEGE 4.0.4
** Preparing 10 concurrent users for battle.
The server is now under siege...
HTTP/1.1 200 0.96 secs: 23 bytes ==> GET /worker/cpu/
# the requests below are blocked
HTTP/1.1 200 1.90 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 2.89 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 4.16 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 5.40 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 6.38 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 7.34 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 8.30 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 9.51 secs: 23 bytes ==> GET /worker/cpu/
HTTP/1.1 200 10.52 secs: 23 bytes ==> GET /worker/cpu/
Transactions: 10 hits
Availability: 100.00 %
Elapsed time: 10.52 secs
Data transferred: 0.00 MB
Response time: 5.74 secs
Transaction rate: 0.95 trans/sec
Throughput: 0.00 MB/sec
Concurrency: 5.45
Successful transactions: 10
Failed transactions: 0
Longest transaction: 10.52
Shortest transaction: 0.96
So the pros and cons of the asynchronous worker types are very clear: excellent performance on IO bound tasks, but on CPU bound tasks they fare worse than threads.
Conclusion
When it comes to performance, the usage scenario has to be taken into account; the claim that gunicorn plus asynchronous IO always performs better does not necessarily hold.
From the numbers above, each of the three worker types has scenarios it is relatively well suited to:
- When a stable system is needed, handling requests with processes guarantees that one request crashing the program will not affect the other requests.
- When a web service is mostly CPU computation, threads can deliver decent performance.
- When a web service is mostly IO, asynchronous IO can reach a very high concurrency count.
Appendix
Glossary
WebSocket
WebSocket is a new protocol. It has essentially nothing to do with HTTP except that it reuses HTTP's handshake for compatibility with existing browsers; in other words, it is a complement to the HTTP protocol.
WebSocket is a persistent protocol, in contrast to HTTP, which is non-persistent.
A simple example, explained with the widely used PHP request lifecycle:
- The HTTP lifecycle is bounded by the Request: one Request, one Response, and in HTTP/1.0 the HTTP request ends right there.
- HTTP/1.1 improved on this with keep-alive, meaning multiple Requests can be sent and multiple Responses received over a single HTTP connection.
But remember that Request = Response always holds in HTTP: one request gets exactly one response, and that response is passive; the server cannot initiate one on its own.
So what does this have to do with WebSocket?
WebSocket is based on HTTP, or rather it borrows the HTTP protocol to complete part of its handshake.
The handshake phase looks the same as HTTP.
First, let's look at a typical WebSocket handshake (borrowed from Wikipedia):
GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
Origin: http://example.com
Those familiar with HTTP may have noticed that this HTTP-like handshake request carries a few extra headers.
I'll explain what they do as we go.
Upgrade: websocket
Connection: Upgrade
This is the core of WebSocket. It tells servers such as Apache and Nginx:
heads up, what I'm initiating is the WebSocket protocol, quick, hand me over to the right handler, not that old-fashioned HTTP.
Sec-WebSocket-Key: x3JJHMbDL1EzLkh9GBhXDw==
Sec-WebSocket-Protocol: chat, superchat
Sec-WebSocket-Version: 13
First, Sec-WebSocket-Key is a Base64-encoded value randomly generated by the browser, telling the server: don't try to fool me, I want to verify that you really are a WebSocket handler.
Next, Sec-WebSocket-Protocol is a user-defined string used to distinguish the protocols needed by different services under the same URL. Put simply: tonight I want service A, don't mix it up.
Finally, Sec-WebSocket-Version tells the server which WebSocket draft (protocol version) is in use. In the early days, when the WebSocket protocol was still in its draft stage, there were all kinds of strange variants, and Firefox and Chrome did not even use the same version; the proliferation of WebSocket drafts was a real headache. Fortunately things have since settled down and everyone now uses the same version.
The server then returns the following, indicating that it has accepted the request and the WebSocket connection has been established successfully:
HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: HSmrc0sMlYUkAGmm5OPpG2HaGWk=
Sec-WebSocket-Protocol: chat
This is the last area HTTP is responsible for: telling the client that the protocol switch has succeeded.
Upgrade: websocket
Connection: Upgrade
These are still fixed, telling the client that the protocol being upgraded to is WebSocket, not mozillasocket, lurnarsocket or some other made-up socket.
Next, Sec-WebSocket-Accept is the Sec-WebSocket-Key after the server has confirmed and transformed it (hashed, rather than encrypted); the server is saying: fine, fine, I get it, here's my ID card to prove who I am.
The Sec-WebSocket-Protocol that follows indicates the protocol that was finally chosen.
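As an aside, the Accept value can be reproduced: per RFC 6455 the server appends a fixed GUID to the client's Sec-WebSocket-Key, SHA-1 hashes the result, and Base64-encodes the digest. A minimal sketch using the key from the request above:

# Deriving Sec-WebSocket-Accept from Sec-WebSocket-Key (RFC 6455):
# append the fixed GUID, SHA-1 the result, Base64-encode the digest.
import base64
import hashlib

GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"
key = "x3JJHMbDL1EzLkh9GBhXDw=="  # Sec-WebSocket-Key from the handshake above

accept = base64.b64encode(hashlib.sha1((key + GUID).encode()).digest()).decode()
print(accept)  # HSmrc0sMlYUkAGmm5OPpG2HaGWk= -- matches the response above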
At this point HTTP has finished all of its work; everything from here on proceeds entirely according to the WebSocket protocol.
The details of that protocol won't be covered here.