5.4. Identifying Contended User-Space Locks

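This section shows how to identify contended user-space locks across the system during a specific time period. With a picture of lock contention in hand, you can judge whether a performance problem is caused by contention on a futex.

Simply put, futex contention occurs when multiple processes try to acquire the same lock at the same time. Because only one process can hold the lock, the others must wait until it becomes available again, so contention serializes execution and degrades performance.

The futexes.stp script below probes the futex system call to show lock contention: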

futexes.stp

  #! /usr/bin/env stap

  # This script tries to identify contended user-space locks by hooking
  # into the futex system call.

  global FUTEX_WAIT = 0 /*, FUTEX_WAKE = 1 */
  global FUTEX_PRIVATE_FLAG = 128 /* linux 2.6.22+ */
  global FUTEX_CLOCK_REALTIME = 256 /* linux 2.6.29+ */

  global lock_waits # long-lived stats on (pid,lock) blockage elapsed time
  global process_names # long-lived pid-to-execname mapping

  probe syscall.futex.return {
    # Strip the modifier flags; skip every operation other than FUTEX_WAIT
    if (($op & ~(FUTEX_PRIVATE_FLAG|FUTEX_CLOCK_REALTIME)) != FUTEX_WAIT) next
    process_names[pid()] = execname()
    # Time spent blocked: now minus the timestamp saved at syscall entry
    elapsed = gettimeofday_us() - @entry(gettimeofday_us())
    lock_waits[pid(), $uaddr] <<< elapsed
  }

  probe end {
    # Print one line per (pid, lock) pair, sorted by ascending pid
    foreach ([pid+, lock] in lock_waits)
      printf ("%s[%d] lock %p contended %d times, %d avg us\n",
              process_names[pid], pid, lock, @count(lock_waits[pid,lock]),
              @avg(lock_waits[pid,lock]))
  }
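
The filter on $op in the return probe can be tried out on its own. The following is a minimal Python sketch of the same bit arithmetic, not part of futexes.stp: is_futex_wait is a hypothetical helper name, the first four constants mirror the values in the script, and FUTEX_WAIT_BITSET = 9 is taken from linux/futex.h as an assumption for illustration.

```python
# Constants mirroring the globals in futexes.stp.
FUTEX_WAIT = 0
FUTEX_WAKE = 1
FUTEX_PRIVATE_FLAG = 128
FUTEX_CLOCK_REALTIME = 256

# Assumption: value from linux/futex.h; it does not appear in the script.
FUTEX_WAIT_BITSET = 9


def is_futex_wait(op: int) -> bool:
    """Return True if op is FUTEX_WAIT once the modifier flags are cleared.

    Hypothetical helper that reproduces the script's test:
    ($op & ~(FUTEX_PRIVATE_FLAG|FUTEX_CLOCK_REALTIME)) == FUTEX_WAIT
    """
    modifiers = FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME
    return (op & ~modifiers) == FUTEX_WAIT


if __name__ == "__main__":
    samples = {
        "FUTEX_WAIT": FUTEX_WAIT,
        "FUTEX_WAIT | FUTEX_PRIVATE_FLAG": FUTEX_WAIT | FUTEX_PRIVATE_FLAG,
        "FUTEX_WAKE | FUTEX_PRIVATE_FLAG": FUTEX_WAKE | FUTEX_PRIVATE_FLAG,
        "FUTEX_WAIT_BITSET": FUTEX_WAIT_BITSET,
    }
    for label, op in samples.items():
        print(f"{label:35s} op={op:3d} counted={is_futex_wait(op)}")
```

Under these assumptions, a private FUTEX_WAIT (op 128) is counted, while a wake-up or a FUTEX_WAIT_BITSET call is skipped, so waits issued through other futex operations would not show up in the report.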

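futexes.stp must be stopped manually with Ctrl+C. On exit, it prints the following information:

  • The name and ID of each process involved in lock contention
  • The address of the contended lock variable
  • The number of times the lock was contended
  • The average time spent waiting for the lock, in microseconds

The following is an excerpt of the output of futexes.stp when stopped after running for about 20 seconds: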

  [...]
  automount[2825] lock 0x00bc7784 contended 18 times, 999931 avg us
  synergyc[3686] lock 0x0861e96c contended 192 times, 101991 avg us
  synergyc[3758] lock 0x08d98744 contended 192 times, 101990 avg us
  synergyc[3938] lock 0x0982a8b4 contended 192 times, 101997 avg us
  [...]
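
The count and the average on each line can be multiplied to estimate the total time a process spent waiting on a lock. The following Python sketch is one way to do this, assuming the output format produced by the printf call in futexes.stp; parse_report and LockWait are hypothetical names introduced only for this illustration.

```python
import re
from dataclasses import dataclass
from typing import List

# Matches lines such as:
#   automount[2825] lock 0x00bc7784 contended 18 times, 999931 avg us
LINE_RE = re.compile(
    r"^(?P<name>.+)\[(?P<pid>\d+)\] lock (?P<lock>0x[0-9a-fA-F]+) "
    r"contended (?P<count>\d+) times, (?P<avg_us>\d+) avg us$"
)


@dataclass(frozen=True)
class LockWait:
    name: str
    pid: int
    lock: str
    count: int
    avg_us: int

    @property
    def total_us(self) -> int:
        # Approximate total wait: number of waits times the (rounded) average.
        return self.count * self.avg_us


def parse_report(text: str) -> List[LockWait]:
    """Parse futexes.stp output; lines that do not match are ignored."""
    entries = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # e.g. the "[...]" elision markers
        entries.append(LockWait(m["name"], int(m["pid"]), m["lock"],
                                int(m["count"]), int(m["avg_us"])))
    return entries


SAMPLE = """\
[...]
automount[2825] lock 0x00bc7784 contended 18 times, 999931 avg us
synergyc[3686] lock 0x0861e96c contended 192 times, 101991 avg us
synergyc[3758] lock 0x08d98744 contended 192 times, 101990 avg us
synergyc[3938] lock 0x0982a8b4 contended 192 times, 101997 avg us
[...]
"""

if __name__ == "__main__":
    # Longest total wait first
    for e in sorted(parse_report(SAMPLE), key=lambda e: e.total_us, reverse=True):
        print(f"{e.name}[{e.pid}] {e.lock}: {e.total_us / 1e6:.2f} s total")
```

For the excerpt above, automount comes to roughly 18.0 seconds and each synergyc entry to roughly 19.6 seconds of waiting. Totals this close to the 20-second run suggest that each of these entries spent most of the run blocked in FUTEX_WAIT.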