References: "Debugging a running Python process with gdb" (使用 gdb 调试运行中的 Python 进程); DebuggingWithGdb - Python Wiki

Suppose a server is running the test.py program below. How can we tell whether it is running normally, and which step it has reached?

    # -*- coding: utf-8 -*-
    import time


    def do(x):
        time.sleep(10)


    def main():
        for x in range(10000):
            do(x)


    if __name__ == '__main__':
        main()

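This program writes no logs and prints nothing, so log files, standard output, and standard error tell us nothing about its state. One workable approach is to use gdb to inspect what the program is doing right now.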

Test environment

  • OS: Ubuntu 16.04.1 LTS
  • Python: 2.7.12

Preparation

Install gdb and python2.7-dbg:

    $ sudo apt-get install gdb python2.7-dbg

Set /proc/sys/kernel/yama/ptrace_scope to 0 so that gdb is allowed to attach to a process it did not start:

    $ echo 0 | sudo tee /proc/sys/kernel/yama/ptrace_scope
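
To check the setting before changing it, the value can be read back and interpreted. The following is a minimal sketch, not part of the original setup: the helper names `read_ptrace_scope` and `describe_ptrace_scope` are made up for illustration, and the descriptions paraphrase the Linux kernel's Yama documentation.

```python
# Sketch: report the current Yama ptrace_scope setting.
# Helper names are illustrative; value meanings paraphrase the kernel's Yama docs.
PTRACE_SCOPE_PATH = "/proc/sys/kernel/yama/ptrace_scope"

MEANINGS = {
    0: "classic ptrace permissions: a process can attach to any other process running under the same uid",
    1: "restricted ptrace: a process can only attach to its own descendants",
    2: "admin-only attach: only processes with CAP_SYS_PTRACE can attach",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(value):
    """Return a human-readable description of a ptrace_scope value."""
    return MEANINGS.get(value, "unknown value %r" % (value,))


def read_ptrace_scope(path=PTRACE_SCOPE_PATH):
    """Return the current setting as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (IOError, OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_ptrace_scope()
    if current is None:
        print("Yama ptrace_scope is not available on this system")
    else:
        print("ptrace_scope = %d: %s" % (current, describe_ptrace_scope(current)))
```

With a value of 1, attaching to an already running process as a regular user is refused, which is why the value is set to 0 here.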

Run test.py:

    $ python test.py &
    [1] 6489

Attach to the running process with gdb python PID:

    $ gdb python 6489
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    ...
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/90/d1300febaeb0a626baa2540d19df2416cd3361.debug...done.
    done.
    ...
    Reading symbols from /lib/ld-linux.so.2...Reading symbols from /usr/lib/debug//lib/i386-linux-gnu/ld-2.23.so...done.
    done.
    0xb778fc31 in __kernel_vsyscall ()
    (gdb)

Generating a core file

The process stays paused for as long as gdb is attached. To avoid disturbing it, save its current state to a core file and then detach:

    (gdb) generate-core-file
    warning: target file /proc/6489/cmdline contained unexpected null characters
    Saved corefile core.6489
    (gdb) quit
    A debugging session is active.
    Inferior 1 [process 6489] will be detached.
    Quit anyway? (y or n) y
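
The attach, dump, and detach steps above can also be scripted so that the process is paused as briefly as possible. The sketch below is one possible way to do that and is not from the original article: `build_gdb_command` and `save_core` are hypothetical helper names, and it relies on gdb's `-batch` and `-ex` options and on `generate-core-file` accepting an output file name.

```python
import subprocess


def build_gdb_command(pid, gdb_commands, executable="python"):
    """Build a non-interactive `gdb python PID` invocation.

    -batch makes gdb exit once the commands have run, and each -ex
    supplies one gdb command to execute, in order.
    """
    argv = ["gdb", "-batch"]
    for command in gdb_commands:
        argv += ["-ex", command]
    argv += [executable, str(pid)]
    return argv


def save_core(pid):
    """Attach to `pid`, write core.<pid>, detach, and return gdb's exit status."""
    argv = build_gdb_command(
        pid,
        ["generate-core-file core.%d" % pid, "detach"],
    )
    return subprocess.call(argv)


if __name__ == "__main__":
    # Only prints the command line; call save_core(pid) to actually run gdb.
    print(" ".join(build_gdb_command(6489, ["generate-core-file core.6489", "detach"])))
```

Building the argument list separately from running it keeps the command easy to inspect and to reuse with other gdb commands such as py-bt.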

The core file can then be loaded with gdb python core.PID:

    $ gdb python core.6489
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    ...
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from python...Reading symbols from /usr/lib/debug/.build-id/90/d1300febaeb0a626baa2540d19df2416cd3361.debug...done.
    done.
    warning: core file may not match specified executable file.
    [New LWP 6489]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
    Core was generated by `python'.
    #0 0xb778fc31 in __kernel_vsyscall ()
    (gdb)

Available Python-related commands

Type py and press Tab to list the available commands:

    (gdb) py
    py-bt               py-down             py-locals           py-up               python-interactive
    py-bt-full          py-list             py-print            python

Use help cmd to see the description of each command:

    (gdb) help py-bt
    Display the current python frame and all the frames within its call stack (if any)

Source code at the current execution position

    (gdb) py-list
       1    # -*- coding: utf-8 -*-
       2    import time
       3
       4
       5    def do(x):
      >6        time.sleep(10)
       7
       8
       9    def main():
      10        for x in range(10000):
      11            do(x)
    (gdb)

The > marker shows that the program is currently executing time.sleep(10).

Call stack at the current position

    (gdb) py-bt
    Traceback (most recent call first):
      <built-in function sleep>
      File "test.py", line 6, in do
        time.sleep(10)
      File "test.py", line 11, in main
        do(x)
      File "test.py", line 15, in <module>
        main()
    (gdb)

Read from the bottom up, the call chain is main() -> do(x) -> time.sleep(10).
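
When py-bt output is captured from a script, the call chain can be extracted mechanically. The following is a small sketch under the assumption that the output has the shape shown above; `parse_py_bt` and `FRAME_RE` are illustrative names, not part of gdb.

```python
import re

# Matches lines such as:   File "test.py", line 6, in do
FRAME_RE = re.compile(r'File "(?P<file>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def parse_py_bt(text):
    """Return (file, line, function) tuples, innermost frame first."""
    frames = []
    for raw in text.splitlines():
        match = FRAME_RE.search(raw)
        if match:
            frames.append(
                (match.group("file"), int(match.group("line")), match.group("func"))
            )
    return frames


SAMPLE = '''Traceback (most recent call first):
  <built-in function sleep>
  File "test.py", line 6, in do
    time.sleep(10)
  File "test.py", line 11, in main
    do(x)
  File "test.py", line 15, in <module>
    main()
'''

frames = parse_py_bt(SAMPLE)
# Reverse the list to read the chain from the outermost caller inwards.
print(" -> ".join(func for _, _, func in reversed(frames)))
```

Built-in frames such as `<built-in function sleep>` carry no file or line, so this sketch skips them.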

Inspecting variable values

    (gdb) py-list
       1    # -*- coding: utf-8 -*-
       2    import time
       3
       4
       5    def do(x):
      >6        time.sleep(10)
       7
       8
       9    def main():
      10        for x in range(10000):
      11            do(x)
    (gdb) py-print x
    local 'x' = 12
    (gdb)
    (gdb) py-locals
    x = 12
    (gdb)
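
The "local" prefix in the py-print output indicates the scope in which the name was found. As a rough in-process analogy, and assuming the lookup order is local, then global, then builtin, the same resolution can be sketched in plain Python with frame objects. The `lookup` helper is hypothetical and only illustrates the idea; it is not how gdb reads the values.

```python
import sys


def lookup(frame, name):
    """Return (scope, value) for `name` as seen from `frame`.

    Scopes are tried in the order local, global, builtin.
    """
    if name in frame.f_locals:
        return "local", frame.f_locals[name]
    if name in frame.f_globals:
        return "global", frame.f_globals[name]
    if name in frame.f_builtins:
        return "builtin", frame.f_builtins[name]
    return None, None


def do(x):
    frame = sys._getframe()  # the frame of this call to do()
    # Report where 'x' and 'len' are found from inside do().
    return lookup(frame, "x"), lookup(frame, "len")[0]


print(do(12))
```

gdb obtains the same information from outside the process by reading the interpreter's frame structures from memory, which is why it also works on a core file.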

Inspecting the caller's frame

    (gdb) py-up
    #9 Frame 0xb74c0994, for file test.py, line 11, in main (x=12)
        do(x)
    (gdb) py-list
       6        time.sleep(10)
       7
       8
       9    def main():
      10        for x in range(10000):
     >11            do(x)
      12
      13
      14    if __name__ == '__main__':
      15        main()
    (gdb) py-print x
    local 'x' = 12
    (gdb)

Use py-down to move back down to the callee:

    (gdb) py-down
    #6 Frame 0xb74926e4, for file test.py, line 6, in do (x=12)
        time.sleep(10)
    (gdb) py-list
       1    # -*- coding: utf-8 -*-
       2    import time
       3
       4
       5    def do(x):
      >6        time.sleep(10)
       7
       8
       9    def main():
      10        for x in range(10000):
      11            do(x)
    (gdb)
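
py-up and py-down move along the chain of Python frames, one caller or callee at a time. For readers more familiar with Python than with gdb, the sketch below walks the same kind of chain from inside a process through `f_back`. It is only an analogy; `frame_chain` is a made-up helper name.

```python
import sys


def frame_chain():
    """Return function names from the caller's frame outwards."""
    names = []
    frame = sys._getframe(1)  # start at whoever called frame_chain()
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # one step towards the caller, like py-up
    return names


def do(x):
    return frame_chain()


def main():
    return do(12)


chain = main()
print(chain[:2])
```

The first two entries correspond to the do and main frames that py-up and py-down switch between in the session above.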

Debugging a multithreaded program

Test program test2.py:

    # -*- coding: utf-8 -*-
    from threading import Thread
    import time


    def do(x):
        x = x * 3
        time.sleep(x * 60)


    def main():
        threads = []
        for x in range(1, 3):
            t = Thread(target=do, args=(x,))
            t.start()
            threads.append(t)  # keep a reference so the loop below can join it
        for x in threads:
            x.join()


    if __name__ == '__main__':
        main()

Run test2.py:

    $ python test2.py &
    [2] 12281

Listing all threads

info threads

    $ gdb python core.12281
    (gdb) info threads
      Id   Target Id         Frame
    * 1    Thread 0xb74b9700 (LWP 11039) 0xb7711c31 in __kernel_vsyscall ()
      2    Thread 0xb73b8b40 (LWP 11040) 0xb7711c31 in __kernel_vsyscall ()
      3    Thread 0xb69ffb40 (LWP 11041) 0xb7711c31 in __kernel_vsyscall ()
    (gdb)

The program currently has 3 threads, and the * marks thread 1 as the currently selected one.

Switching threads

thread ID

    (gdb) thread 3
    [Switching to thread 3 (Thread 0xb69ffb40 (LWP 11041))]
    #0 0xb7711c31 in __kernel_vsyscall ()
    (gdb) info threads
      Id   Target Id         Frame
      1    Thread 0xb74b9700 (LWP 11039) 0xb7711c31 in __kernel_vsyscall ()
      2    Thread 0xb73b8b40 (LWP 11040) 0xb7711c31 in __kernel_vsyscall ()
    * 3    Thread 0xb69ffb40 (LWP 11041) 0xb7711c31 in __kernel_vsyscall ()
    (gdb)

Thread 3 is now the selected thread.

The py- commands described earlier show further information about whichever thread is currently selected:

    [Current thread is 1 (Thread 0xb74b9700 (LWP 11039))]
    (gdb) py-list
     335            waiter.acquire()
     336            self.__waiters.append(waiter)
     337            saved_state = self._release_save()
     338            try:    # restore state no matter what (e.g., KeyboardInterrupt)
     339                if timeout is None:
    >340                    waiter.acquire()
     341                    if __debug__:
     342                        self._note("%s.wait(): got it", self)
     343                else:
     344                    # Balancing act:  We can't afford a pure busy loop, so we
     345                    # have to sleep; but if we sleep the whole timeout time,
    (gdb) thread 2
    [Switching to thread 2 (Thread 0xb73b8b40 (LWP 11040))]
    #0 0xb7711c31 in __kernel_vsyscall ()
    (gdb) py-list
       3    import time
       4
       5
       6    def do(x):
       7        x = x * 3
      >8        time.sleep(x * 60)
       9
      10
      11    def main():
      12        threads = []
      13        for x in range(1, 3):
    (gdb)

Running a command on all threads at once

thread apply all CMD, or t a a CMD for short

    (gdb) thread apply all py-list

    Thread 3 (Thread 0xb69ffb40 (LWP 11041)):
       3    import time
       4
       5
       6    def do(x):
       7        x = x * 3
      >8        time.sleep(x * 60)
       9
      10
      11    def main():
      12        threads = []
      13        for x in range(1, 3):

    Thread 2 (Thread 0xb73b8b40 (LWP 11040)):
       3    import time
       4
       5
       6    def do(x):
       7        x = x * 3
      >8        time.sleep(x * 60)
       9
      10
      11    def main():
      12        threads = []
      13        for x in range(1, 3):
    ---Type <return> to continue, or q <return> to quit---

    Thread 1 (Thread 0xb74b9700 (LWP 11039)):
     335            waiter.acquire()
     336            self.__waiters.append(waiter)
     337            saved_state = self._release_save()
     338            try:    # restore state no matter what (e.g., KeyboardInterrupt)
     339                if timeout is None:
    >340                    waiter.acquire()
     341                    if __debug__:
     342                        self._note("%s.wait(): got it", self)
     343                else:
     344                    # Balancing act:  We can't afford a pure busy loop, so we
     345                    # have to sleep; but if we sleep the whole timeout time,
    (gdb)
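
With many threads, this output gets long, and it can help to group it by thread and ask which threads are sitting on a given line. The following sketch assumes the header and `>` marker formats shown above; `split_by_thread` and `threads_at` are illustrative names, and the sample is an abbreviated copy of the output.

```python
import re

# Matches headers such as: Thread 3 (Thread 0xb69ffb40 (LWP 11041)):
HEADER_RE = re.compile(r"^Thread (\d+) \(Thread (0x[0-9a-f]+) \(LWP (\d+)\)\):$")


def split_by_thread(text):
    """Map gdb thread id -> list of output lines belonging to that thread."""
    sections = {}
    current = None
    for raw in text.splitlines():
        match = HEADER_RE.match(raw.strip())
        if match:
            current = int(match.group(1))
            sections[current] = []
        elif current is not None and not raw.startswith("---Type <return>"):
            # Skip gdb's pager prompt, keep everything else.
            sections[current].append(raw)
    return sections


def threads_at(sections, needle):
    """Return ids of threads whose current line (marked '>') contains needle."""
    return sorted(
        tid
        for tid, lines in sections.items()
        if any(l.lstrip().startswith(">") and needle in l for l in lines)
    )


SAMPLE = """Thread 3 (Thread 0xb69ffb40 (LWP 11041)):
   7        x = x * 3
  >8        time.sleep(x * 60)
Thread 2 (Thread 0xb73b8b40 (LWP 11040)):
   7        x = x * 3
  >8        time.sleep(x * 60)
---Type <return> to continue, or q <return> to quit---
Thread 1 (Thread 0xb74b9700 (LWP 11039)):
 339                if timeout is None:
>340                    waiter.acquire()
"""

sections = split_by_thread(SAMPLE)
print(threads_at(sections, "time.sleep"))
```

In this sample, the two worker threads are reported as sleeping in do(), while the main thread is the one waiting in waiter.acquire().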

These are the most commonly used Python-related gdb operations. Keep in mind that all the regular gdb commands remain available as well.

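Installing gdb and the Python debug extensions on Linux

You need to install gdb and the Python debug extensions on your system. The debug extensions include the debugging symbols and add Python-specific commands to gdb. Both can be installed with the following commands: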

  • Fedora

    sudo yum install gdb python-debuginfo
    
  • Ubuntu

    sudo apt-get install gdb python2.7-dbg
    
  • CentOS

    sudo yum install yum-utils
    sudo debuginfo-install glibc
    sudo yum install gdb python-debuginfo
    

    On CentOS 7, the first two commands must be run before python-debuginfo can be installed.