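Are you still debugging your programs with GDB?

If so, we are kindred spirits. But did you know that GDB has a very powerful feature, Python scripting?

If you did, congratulations, you are already an expert.

This article shows how to use Python to sharpen your GDB debugging skills, so you can break free from tedious, repetitive work and get some fresh air.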

First things first: use GDB 7.x or later, preferably 9.x, because Python support was introduced in GDB 7.0 (released in 2009).
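A quick way to check both requirements is to ask GDB's embedded interpreter for its version. The following is a minimal sketch: `gdb.VERSION` is part of GDB's Python API, while the helper name `gdb_version_tuple` is made up for this illustration, and the code assumes the version string contains a dotted number such as `9.2` somewhere in it.

```python
import re

try:
    import gdb  # only importable inside GDB's embedded interpreter
except ImportError:
    gdb = None  # lets the helper below be tried in a plain Python shell


def gdb_version_tuple(version_string):
    """Return (major, minor) from a string such as '9.2' or '7.11.1'."""
    match = re.search(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError("no version number in {!r}".format(version_string))
    return int(match.group(1)), int(match.group(2))


if gdb is not None:
    major, minor = gdb_version_tuple(gdb.VERSION)
    if (major, minor) < (9, 0):
        print("GDB {}.{}: Python works, but 9.x is recommended".format(major, minor))
    else:
        print("GDB {}.{}: good to go".format(major, minor))
```

Save it to a file and load it with `source` inside GDB. If your GDB was built without Python, the `source` command itself fails and GDB reports that Python scripting is not supported.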

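Getting down to business

GDB already supports custom scripts to assist debugging, so why bother with Python? Because the syntax of GDB's own scripting language is dated, and it is nowhere near as pleasant to write as Python. If you prefer the original scripting approach, that works too.

With Python, you can make ugly data look good,

With Python, you can turn repetitive work into a single command,

With Python, you can debug faster,

With Python, you can show off, hahaha

...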

Making ugly data look good

Take the following code as an example:

  #include <map>
  #include <iostream>
  #include <string>
  using namespace std;
  int main() {
      std::map<string, string> lm;
      lm["good"] = "heart";
      // inspect the contents of the map here
      std::cout << lm["good"];
  }

Suppose execution has reached the std::cout line and you want to inspect the contents of the map. Without Python or custom scripts, print lm shows this:

  $2 = {_M_t = {
  _M_impl = {<std::allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >> = {<__gnu_cxx::new_allocator<std::_Rb_tree_node<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > >> = {<No data fields>}, <No data fields>}, <std::_Rb_tree_key_compare<std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >> = {
  _M_key_compare = {<std::binary_function<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool>> = {<No data fields>}, <No data fields>}}, <std::_Rb_tree_header> = {_M_header = {
  _M_color = std::_S_red, _M_parent = 0x55555556eeb0,
  _M_left = 0x55555556eeb0, _M_right = 0x55555556eeb0},
  _M_node_count = 1}, <No data fields>}}}

But when you type print lm in GDB 9.2, you see this instead:

  (gdb) p lm
  $3 = std::map with 1 element = {["good"] = "heart"}

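The contents of the map are now obvious at a glance. This works because GDB 9.x comes with a set of Python pretty printers for the standard library. If you are on GDB 7.x, you can load these pretty printers by hand and get the same result. The steps are: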

  1. Download the pretty printers: svn co svn://gcc.gnu.org/svn/gcc/trunk/libstdc++-v3/python
  2. In GDB, enter the following (change the path to wherever you downloaded them):

       python
       import sys
       sys.path.insert(0, '/home/maude/gdb_printers/python')
       from libstdcxx.v6.printers import register_libstdcxx_printers
       register_libstdcxx_printers (None)
       end

Now you can use them with confidence~

For details, see:

https://sourceware.org/gdb/wiki/STLSupport

https://codeyarns.com/2014/07/17/how-to-enable-pretty-printing-for-stl-in-gdb/
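The same mechanism works for your own types. Below is a minimal sketch of a custom pretty printer, assuming a hypothetical `struct Point { int x; int y; };` in the program being debugged. `gdb.printing.RegexpCollectionPrettyPrinter` and `gdb.printing.register_pretty_printer` are part of GDB's Python API; the names `PointPrinter`, `build_pretty_printer`, and `my_project` are made up for this example.

```python
try:
    import gdb
    import gdb.printing  # only available inside GDB
except ImportError:
    gdb = None  # lets PointPrinter be tried in a plain Python shell


class PointPrinter:
    """Pretty printer for a hypothetical `struct Point { int x; int y; };`."""

    def __init__(self, val):
        # Inside GDB, val is a gdb.Value; fields are read with val['name'].
        self.val = val

    def to_string(self):
        return "Point(x={}, y={})".format(self.val["x"], self.val["y"])


def build_pretty_printer():
    # The collection maps type-name regexes to printer classes.
    printers = gdb.printing.RegexpCollectionPrettyPrinter("my_project")
    printers.add_printer("Point", "^Point$", PointPrinter)
    return printers


if gdb is not None:
    # current_objfile() is None unless the script is being auto-loaded,
    # in which case the printer is registered globally.
    gdb.printing.register_pretty_printer(
        gdb.current_objfile(), build_pretty_printer())
```

After loading the file with `source`, printing a `Point` variable should show the compact `Point(x=..., y=...)` form instead of the raw struct dump.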

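Turning repetitive work into a single command

For example, while debugging you know that some slot on the current stack points to a string, but you don't know exactly which one, and you want to walk the stack to find it. With Python you can define a custom command, "stackwalk", that walks the stack directly in Python code and finds the string.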

  #####################################################
  # Usage: to load this to gdb run:
  # (gdb) source ..../path/to/<script_file>.py
  import gdb

  class StackWalk(gdb.Command):
      def __init__(self):
          # This registers our class as the gdb command "stackwalk"
          super(StackWalk, self).__init__("stackwalk", gdb.COMMAND_DATA)

      def invoke(self, arg, from_tty):
          # When we call "stackwalk" from gdb, this is the method
          # that will be called.
          print("Hello from StackWalk!")
          # get the frame and stack pointer registers (x86-64)
          rbp = gdb.parse_and_eval('$rbp')
          rsp = gdb.parse_and_eval('$rsp')
          ptr = rsp
          ppwc = gdb.lookup_type('wchar_t').pointer().pointer()
          while ptr < rbp:
              try:
                  print('pointer is {}'.format(ptr))
                  # wc_print writes its own output; gdb.execute returns None here
                  gdb.execute('wc_print {}'.format(ptr.cast(ppwc).dereference()))
                  print('===')
              except gdb.error:
                  # this slot does not hold a readable pointer; skip it
                  pass
              # move on to the next 8-byte stack slot
              ptr += 8

  # This registers our class to the gdb runtime at "source" time.
  StackWalk()

Note: wc_print is another simple Python command I wrote that prints the wide string at a given address. Its implementation is left as an exercise~
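For readers who want a starting point for that exercise, here is one possible sketch, not the author's implementation. It assumes a 4-byte little-endian wchar_t (as on x86-64 Linux), caps the string at 256 characters, and expects the command argument to be an expression that evaluates to an address. The helper names `decode_wide` and `read_wide_string` are made up; `gdb.selected_inferior().read_memory()` is GDB's API for reading raw memory.

```python
try:
    import gdb  # only importable inside GDB
except ImportError:
    gdb = None  # lets decode_wide be tried in a plain Python shell

WCHAR_SIZE = 4   # assumption: 4-byte wchar_t, as on x86-64 Linux
MAX_CHARS = 256  # assumption: give up after this many characters


def decode_wide(raw):
    """Decode little-endian UTF-32 bytes up to the first NUL character."""
    usable = len(raw) - len(raw) % WCHAR_SIZE  # drop a trailing partial char
    text = raw[:usable].decode("utf-32-le", errors="replace")
    return text.split("\x00", 1)[0]


if gdb is not None:

    def read_wide_string(address):
        # Read one character at a time so we never touch memory
        # beyond the terminating NUL.
        inferior = gdb.selected_inferior()
        raw = b""
        for index in range(MAX_CHARS):
            chunk = bytes(inferior.read_memory(
                address + index * WCHAR_SIZE, WCHAR_SIZE))
            if chunk == b"\x00" * WCHAR_SIZE:
                break
            raw += chunk
        return decode_wide(raw)

    class WcPrint(gdb.Command):
        """wc_print ADDRESS: print the wide string stored at ADDRESS."""

        def __init__(self):
            super(WcPrint, self).__init__("wc_print", gdb.COMMAND_DATA)

        def invoke(self, arg, from_tty):
            address = int(gdb.parse_and_eval(arg))
            print(read_wide_string(address))

    WcPrint()
```

Reading an unmapped address raises `gdb.MemoryError`, which is a subclass of `gdb.error`, so a caller that walks arbitrary stack slots can catch it and move on.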

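Debugging bugs faster

When you debug a multithreaded program, you end up with a pile of call stacks, and many of them are duplicates. It would be great if they could be de-duplicated or folded automatically, so that you only need to look at a small subset. Good news: Python lets you do this with a single command. Better yet, someone has already written the code, so all you have to do is load it. For a detailed walkthrough see https://fy.blackhats.net.au/blog/html/2017/08/04/so_you_want_to_script_gdb_with_python.html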

  # From https://fy.blackhats.net.au/blog/html/2017/08/04/so_you_want_to_script_gdb_with_python.html
  #####################################################
  #
  # Usage: to load this to gdb run:
  # (gdb) source ..../path/to/debug_naughty.py
  #
  # To have this automatically load, you need to put the script
  # in a path related to your binary. If you make /usr/sbin/foo,
  # You can ship this script as:
  # /usr/share/gdb/auto-load/ <PATH TO BINARY>
  # /usr/share/gdb/auto-load/usr/sbin/foo
  #
  # This will trigger gdb to autoload the script when you start
  # to access a core or the live binary from this location.
  #
  import gdb

  class StackFold(gdb.Command):
      def __init__(self):
          super(StackFold, self).__init__("stackfold", gdb.COMMAND_DATA)

      def invoke(self, arg, from_tty):
          # An inferior is the 'currently running applications'. In this case we only
          # have one.
          stack_maps = {}
          # This creates a dict where each element is keyed by backtrace.
          # Then each backtrace contains an array of "frames"
          #
          inferiors = gdb.inferiors()
          for inferior in inferiors:
              for thread in inferior.threads():
                  try:
                      # Change to our threads context
                      thread.switch()
                      # Get the thread IDS
                      (tpid, lwpid, tid) = thread.ptid
                      gtid = thread.num
                      # Take a human readable copy of the backtrace, we'll need this for display later.
                      o = gdb.execute('bt', to_string=True)
                      # Build the backtrace for comparison
                      backtrace = []
                      # Start from the innermost frame of the thread we just switched to
                      cur_frame = gdb.newest_frame()
                      while cur_frame is not None:
                          if cur_frame.name() is not None:
                              backtrace.append(cur_frame.name())
                          cur_frame = cur_frame.older()
                      # Now we have a backtrace like ['pthread_cond_wait@@GLIBC_2.3.2', 'lazy_thread', 'start_thread', 'clone']
                      # dicts can't use lists as keys because they are non-hashable, so we turn this into a string.
                      # Remember, C functions can't have spaces in them ...
                      s_backtrace = ' '.join(backtrace)
                      # Let's see if it exists in the stack_maps
                      if s_backtrace not in stack_maps:
                          stack_maps[s_backtrace] = []
                      # Now lets add this thread to the map.
                      stack_maps[s_backtrace].append({'gtid': gtid, 'lwpid': lwpid, 'bt': o})
                  except Exception as e:
                      print(e)
          # Now at this point we have a dict of traces, and each trace has a "list" of threads that match. Let's display them
          for smap in stack_maps:
              # Get our human readable form out.
              o = stack_maps[smap][0]['bt']
              for t in stack_maps[smap]:
                  # For each thread we recorded
                  print("Thread %s (LWP %s)" % (t['gtid'], t['lwpid']))
              print(o)

  # This registers our class to the gdb runtime at "source" time.
  StackFold()
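The folding step at the heart of that script can be tried without GDB at all. The sketch below isolates it; the function name `fold_stacks` and the sample data are made up for illustration. It keys the dict by a tuple of frame names, an alternative to the joined string used above, since tuples are hashable and avoid any worry about separators.

```python
from collections import defaultdict


def fold_stacks(backtraces):
    """Group thread ids by identical backtrace.

    backtraces maps a thread id to its list of frame names, newest first.
    Returns a dict mapping each distinct backtrace (as a tuple) to the
    sorted list of thread ids that share it.
    """
    folded = defaultdict(list)
    for thread_id, frames in backtraces.items():
        folded[tuple(frames)].append(thread_id)
    return {frames: sorted(ids) for frames, ids in folded.items()}


# Made-up sample data standing in for what GDB would report.
sample = {
    1: ["main"],
    2: ["pthread_cond_wait", "lazy_thread", "start_thread", "clone"],
    3: ["pthread_cond_wait", "lazy_thread", "start_thread", "clone"],
}

for frames, thread_ids in fold_stacks(sample).items():
    print("Threads {}: {}".format(thread_ids, " <- ".join(frames)))
```

Threads 2 and 3 share a backtrace, so they collapse into a single entry and only two distinct stacks are left to read.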

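And there is much more! Python is Turing complete, after all, and GDB exposes a rich API, so you can implement pretty much anything you want.

Once you have these down, you can go show off to the newbies~

If you want more tricks to show off with, see "Another debugging show-off skill: record, replay"

Or follow me~

Note: lldb also supports Python extensions, so the same ideas apply to lldb.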

References:

1 https://undo.io/resources/gdb-watchpoint/python-gdb/

2 https://codeyarns.com/2014/07/17/how-to-enable-pretty-printing-for-stl-in-gdb/
https://zhuanlan.zhihu.com/p/152274203