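5.1. Network

The scripts in the following sections show how to trace network-related functions and how to profile network activity.

Network Profiling

This section describes how to profile network activity with SystemTap. The nettop.stp script below shows how much network traffic each process on the machine generates.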

nettop.stp

  #! /usr/bin/env stap

  # Packet-length aggregates keyed by (pid, device, command, uid)
  global ifxmit, ifrecv
  # Combined packet counts, used only to order the report
  global ifmerged

  probe netdev.transmit
  {
    ifxmit[pid(), dev_name, execname(), uid()] <<< length
  }

  probe netdev.receive
  {
    ifrecv[pid(), dev_name, execname(), uid()] <<< length
  }

  function print_activity()
  {
    printf("%5s %5s %-7s %7s %7s %7s %7s %-15s\n",
           "PID", "UID", "DEV", "XMIT_PK", "RECV_PK",
           "XMIT_KB", "RECV_KB", "COMMAND")

    # Merge receive and transmit packet counts per key
    foreach ([pid, dev, exec, uid] in ifrecv) {
      ifmerged[pid, dev, exec, uid] += @count(ifrecv[pid,dev,exec,uid]);
    }
    foreach ([pid, dev, exec, uid] in ifxmit) {
      ifmerged[pid, dev, exec, uid] += @count(ifxmit[pid,dev,exec,uid]);
    }

    # Report the busiest keys first (descending merged count)
    foreach ([pid, dev, exec, uid] in ifmerged-) {
      n_xmit = @count(ifxmit[pid, dev, exec, uid])
      n_recv = @count(ifrecv[pid, dev, exec, uid])
      printf("%5d %5d %-7s %7d %7d %7d %7d %-15s\n",
             pid, uid, dev, n_xmit, n_recv,
             n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0,
             n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0,
             exec)
    }

    print("\n")

    # Reset all counters for the next interval
    delete ifxmit
    delete ifrecv
    delete ifmerged
  }

  probe timer.ms(5000), end, error
  {
    print_activity()
  }

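Note these two expressions in print_activity():

  n_xmit ? @sum(ifxmit[pid, dev, exec, uid])/1024 : 0
  n_recv ? @sum(ifrecv[pid, dev, exec, uid])/1024 : 0

Both are conditional expressions that behave like if/else statements. The second one, for example, is a compact way of writing the following pseudocode:

  if n_recv != 0 then
    @sum(ifrecv[pid, dev, exec, uid])/1024
  else
    0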

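nettop.stp tracks which processes generate network traffic and reports the following information for each of them:

  • PID - the ID of the process.
  • UID - the user ID of the process owner.
  • DEV - the network device the process used to send or receive data (for example, eth0 or eth1).
  • XMIT_PK - the number of packets transmitted.
  • RECV_PK - the number of packets received.
  • XMIT_KB - the amount of data sent, in kilobytes.
  • RECV_KB - the amount of data received, in kilobytes.
  • COMMAND - the name of the process executable, as returned by execname().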
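
To make the link between the script's aggregates and these columns concrete, here is a minimal Python sketch of the bookkeeping done in print_activity(). It is an illustration only, not SystemTap code: the record() and report() helpers and the sample traffic are hypothetical, and plain lists stand in for the statistical aggregates that the `<<<` operator fills.

```python
from collections import defaultdict

# Hypothetical stand-ins for the ifxmit and ifrecv aggregates:
# each key maps to the list of packet lengths recorded for it.
ifxmit = defaultdict(list)
ifrecv = defaultdict(list)


def record(table, pid, dev, execname, uid, length):
    """Rough analogue of `table[pid, dev, execname, uid] <<< length`."""
    table[(pid, dev, execname, uid)].append(length)


def report():
    """Return rows of (pid, uid, dev, xmit_pk, recv_pk, xmit_kb, recv_kb, command)."""
    # Like ifmerged: total packet count per key, used only for ordering.
    merged = defaultdict(int)
    for table in (ifrecv, ifxmit):
        for key, lengths in table.items():
            merged[key] += len(lengths)

    rows = []
    # `foreach (... in ifmerged-)` visits keys by descending value.
    for key in sorted(merged, key=merged.get, reverse=True):
        pid, dev, execname, uid = key
        n_xmit = len(ifxmit.get(key, []))
        n_recv = len(ifrecv.get(key, []))
        # Same guard as the script's conditional expressions;
        # // mirrors the script's integer division by 1024.
        xmit_kb = sum(ifxmit[key]) // 1024 if n_xmit else 0
        recv_kb = sum(ifrecv[key]) // 1024 if n_recv else 0
        rows.append((pid, uid, dev, n_xmit, n_recv, xmit_kb, recv_kb, execname))
    return rows


# Made-up sample traffic, for illustration only.
record(ifxmit, 2886, "eth0", "cups-polld", 4, 1500)
record(ifxmit, 2886, "eth0", "cups-polld", 4, 600)
record(ifrecv, 11362, "eth0", "firefox", 0, 900)
rows = report()
for row in rows:
    print(row)
```

Because the division is integer division, a process that moved fewer than 1024 bytes in an interval shows a nonzero packet count next to 0 KB, which is the pattern visible in many rows of the script's output.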

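nettop.stp prints a report every 5 seconds, covering the traffic seen since the previous report. You can change the interval by editing probe timer.ms(5000). The following is an excerpt of the output of nettop.stp over a 20-second period: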

  [...]
    PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
      0     0 eth0          0       5       0       0 swapper
  11178     0 eth0          2       0       0       0 synergyc

    PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
   2886     4 eth0         79       0       5       0 cups-polld
  11362     0 eth0          0      61       0       5 firefox
      0     0 eth0          3      32       0       3 swapper
   2886     4 lo            4       4       0       0 cups-polld
  11178     0 eth0          3       0       0       0 synergyc

    PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
      0     0 eth0          0       6       0       0 swapper
   2886     4 lo            2       2       0       0 cups-polld
  11178     0 eth0          3       0       0       0 synergyc
   3611     0 eth0          0       1       0       0 Xorg

    PID   UID DEV     XMIT_PK RECV_PK XMIT_KB RECV_KB COMMAND
      0     0 eth0          3      42       0       2 swapper
  11178     0 eth0         43       1       3       0 synergyc
  11362     0 eth0          0       7       0       0 firefox
   3897     0 eth0          0       1       0       0 multiload-apple
  [...]

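Tracing Functions Called in Network Socket Code

This section describes how to trace calls to the functions in the kernel's net/socket.c file. This helps you see, in fine detail, how each process interacts with the network at the kernel level.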

socket-trace.stp

  #! /usr/bin/env stap

  probe kernel.function("*@net/socket.c").call {
    printf ("%s -> %s\n", thread_indent(1), ppfunc())
  }
  probe kernel.function("*@net/socket.c").return {
    printf ("%s <- %s\n", thread_indent(-1), ppfunc())
  }

socket-trace.stp is the same script that was used in Chapter 3 to introduce thread_indent(). The following is an excerpt of its output over a 3-second period:

  [...]
  0 Xorg(3611): -> sock_poll
  3 Xorg(3611): <- sock_poll
  0 Xorg(3611): -> sock_poll
  3 Xorg(3611): <- sock_poll
  0 gnome-terminal(11106): -> sock_poll
  5 gnome-terminal(11106): <- sock_poll
  0 scim-bridge(3883): -> sock_poll
  3 scim-bridge(3883): <- sock_poll
  0 scim-bridge(3883): -> sys_socketcall
  4 scim-bridge(3883): -> sys_recv
  8 scim-bridge(3883): -> sys_recvfrom
  12 scim-bridge(3883):-> sock_from_file
  16 scim-bridge(3883):<- sock_from_file
  20 scim-bridge(3883):-> sock_recvmsg
  24 scim-bridge(3883):<- sock_recvmsg
  28 scim-bridge(3883): <- sys_recvfrom
  31 scim-bridge(3883): <- sys_recv
  35 scim-bridge(3883): <- sys_socketcall
  [...]

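Monitoring Incoming TCP Connections

This section shows how to monitor incoming TCP connections as they are accepted. This helps you identify unauthorized, suspicious, or otherwise unwanted network connections in real time.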

tcp_connections.stp

  #! /usr/bin/env stap

  # Print the column headings once at startup
  probe begin {
    printf("%6s %16s %6s %6s %16s\n",
           "UID", "CMD", "PID", "PORT", "IP_SOURCE")
  }

  # The trailing "?" makes each probe point optional, so the script
  # still runs on kernels that lack one of the two functions
  probe kernel.function("tcp_accept").return?,
        kernel.function("inet_csk_accept").return? {
    sock = $return
    if (sock != 0)
      printf("%6d %16s %6d %6d %16s\n", uid(), execname(), pid(),
             inet_get_local_port(sock), inet_get_ip_source(sock))
  }

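While tcp_connections.stp is running, it prints the following information about each newly accepted TCP connection in real time:

  • UID - the current user ID.
  • CMD - the name of the command accepting the connection.
  • PID - the process ID of the command accepting the connection.
  • PORT - the local port on which the connection was accepted.
  • IP_SOURCE - the remote IP address from which the connection originated.

The output looks like this:

     UID              CMD    PID   PORT        IP_SOURCE
       0             sshd   3165     22      10.64.0.227
       0             sshd   3165     22      10.64.0.227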

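Monitoring TCP Packets

This section shows how to monitor TCP packets received by the system. This is useful for analyzing the network traffic generated by applications.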

tcpdumplike.stp

  #! /usr/bin/env stap

  // A TCP dump like example

  probe begin, timer.s(1) {
    printf("-----------------------------------------------------------------\n")
    printf("       Source IP         Dest IP  SPort  DPort  U  A  P  R  S  F \n")
    printf("-----------------------------------------------------------------\n")
  }

  probe udp.recvmsg /* ,udp.sendmsg */ {
    printf(" %15s %15s  %5d  %5d  UDP\n",
           saddr, daddr, sport, dport)
  }

  probe tcp.receive {
    printf(" %15s %15s  %5d  %5d  %d  %d  %d  %d  %d  %d\n",
           saddr, daddr, sport, dport, urg, ack, psh, rst, syn, fin)
  }

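While tcpdumplike.stp is running, it prints the following information about each received TCP packet in real time:

  • Source and destination IP addresses (saddr and daddr)
  • Source and destination ports (sport and dport)
  • Packet flags

tcpdumplike.stp reads the packet flags from the following variables, which are available in the tcp.receive probe:

  • urg - urgent
  • ack - acknowledgement
  • psh - push
  • rst - reset
  • syn - synchronize
  • fin - finished

Each variable is 1 if the corresponding flag is set in the packet and 0 otherwise. The script also prints a line ending in UDP, without flags, for each UDP message received through the udp.recvmsg probe.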

  -----------------------------------------------------------------
         Source IP         Dest IP  SPort  DPort  U  A  P  R  S  F
  -----------------------------------------------------------------
    209.85.229.147       10.0.2.15     80  20373  0  1  1  0  0  0
    92.122.126.240       10.0.2.15     80  53214  0  1  0  0  1  0
    92.122.126.240       10.0.2.15     80  53214  0  1  0  0  0  0
    209.85.229.118       10.0.2.15     80  63433  0  1  0  0  1  0
    209.85.229.118       10.0.2.15     80  63433  0  1  0  0  0  0
    209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
    209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
    209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
    209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
    209.85.229.147       10.0.2.15     80  21141  0  1  1  0  0  0
    209.85.229.118       10.0.2.15     80  63433  0  1  1  0  0  0
  [...]
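
Each of the U, A, P, R, S and F columns corresponds to one bit of the flags field in the TCP header. The Python sketch below shows how such 0/1 values can be derived from a raw flags byte. It illustrates the bit layout only and is not how the SystemTap tapset is implemented; decode_flags() and flag_columns() are hypothetical helper names, and the two sample bytes were chosen for illustration.

```python
# Bit masks of the six classic control flags in the TCP header.
TCP_FLAG_MASKS = {
    "urg": 0x20,
    "ack": 0x10,
    "psh": 0x08,
    "rst": 0x04,
    "syn": 0x02,
    "fin": 0x01,
}

# Column order used by the script's header line: U A P R S F.
COLUMN_ORDER = ("urg", "ack", "psh", "rst", "syn", "fin")


def decode_flags(flags_byte):
    """Hypothetical helper: map a TCP flags byte to a 0/1 value per flag."""
    return {name: int(bool(flags_byte & mask))
            for name, mask in TCP_FLAG_MASKS.items()}


def flag_columns(flags_byte):
    """Render the six flag values in U A P R S F order."""
    flags = decode_flags(flags_byte)
    return " ".join(str(flags[name]) for name in COLUMN_ORDER)


# 0x18 has PSH and ACK set; 0x12 has SYN and ACK set.
psh_ack = decode_flags(0x18)
syn_ack = decode_flags(0x12)
print(flag_columns(0x18))  # 0 1 1 0 0 0
print(flag_columns(0x12))  # 0 1 0 0 1 0
```

The two printed patterns are the ones that appear in the sample output: `0 1 1 0 0 0` is a data segment with PSH and ACK set, and `0 1 0 0 1 0` is a SYN/ACK reply from a remote server.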

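Monitoring Network Packet Drops in the Kernel

The Linux network stack can discard packets for various reasons. Some Linux kernels include a tracepoint, kernel.trace("kfree_skb"), that makes it easy to track where packets are discarded. dropwatch.stp uses this tracepoint to trace packet drops; every five seconds the script reports the locations at which packets were dropped.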

dropwatch.stp

  #! /usr/bin/env stap

  ############################################################
  # Dropwatch.stp
  # Author: Neil Horman <nhorman@redhat.com>
  # An example script to mimic the behavior of the dropwatch utility
  # http://fedorahosted.org/dropwatch
  ############################################################

  # Array to hold the list of drop points we find
  global locations

  # Note when we turn the monitor on and off
  probe begin { printf("Monitoring for dropped packets\n") }
  probe end { printf("Stopping dropped packet monitor\n") }

  # increment a drop counter for every location we drop at
  probe kernel.trace("kfree_skb") { locations[$location] <<< 1 }

  # Every 5 seconds report our drop locations
  probe timer.sec(5)
  {
    printf("\n")
    foreach (l in locations-) {
      printf("%d packets dropped at %s\n",
             @count(locations[l]), symname(l))
    }
    delete locations
  }

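kernel.trace("kfree_skb") traces the places in the kernel where network packets are dropped. It has two arguments: $skb, a pointer to the buffer being freed, and $location, the location in kernel code where the buffer is being freed. When a function name is available for the kernel address stored in $location, dropwatch.stp maps the address to that name. This mapping is not enabled by default. With SystemTap 1.4 and later, you can enable it with the --all-modules option:

  stap --all-modules dropwatch.stp

With older versions of SystemTap, you can emulate the --all-modules option with the following command:

  stap -dkernel \
  `cat /proc/modules | awk 'BEGIN { ORS = " " } {print "-d"$1}'` \
  dropwatch.stp

Running dropwatch.stp for 15 seconds produces output similar to the following. Each report lists the number of drops per location, identified by function name or by address.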

  Monitoring for dropped packets

  1762 packets dropped at unix_stream_recvmsg
  4 packets dropped at tun_do_read
  2 packets dropped at nf_hook_slow

  467 packets dropped at unix_stream_recvmsg
  20 packets dropped at nf_hook_slow
  6 packets dropped at tun_do_read

  446 packets dropped at unix_stream_recvmsg
  4 packets dropped at tun_do_read
  4 packets dropped at nf_hook_slow
  Stopping dropped packet monitor

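When the --all-modules option and /proc/modules are not available on the machine that runs the script, symname prints only the raw address. You can map such an address to a function by consulting /boot/System.map-$(uname -r), which lists the starting address of each symbol: an address belongs to the last function that starts at or below it. In the following excerpt of /boot/System.map-$(uname -r), the address 0xffffffff8149a8ed falls between the start of unix_stream_recvmsg (0xffffffff8149a5e0) and the start of the next symbol (0xffffffff8149ad00), so it maps to unix_stream_recvmsg: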

  [...]
  ffffffff8149a420 t unix_dgram_poll
  ffffffff8149a5e0 t unix_stream_recvmsg
  ffffffff8149ad00 t unix_find_other
  [...]
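
The manual lookup can also be automated. The following Python sketch assumes that the map text consists of "address type name" lines, as in the excerpt, and finds the symbol with the closest starting address at or below a given address. The parse_system_map() and lookup() helpers are hypothetical names introduced here, and the embedded text is only the three-line excerpt, not a full System.map file.

```python
import bisect

# The three symbols from the excerpt, in System.map's "address type name" form.
SYSTEM_MAP_SNIPPET = """\
ffffffff8149a420 t unix_dgram_poll
ffffffff8149a5e0 t unix_stream_recvmsg
ffffffff8149ad00 t unix_find_other
"""


def parse_system_map(text):
    """Parse 'address type name' lines into (address, name) pairs sorted by address."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def lookup(symbols, address):
    """Return the symbol whose start address is the closest one at or
    below `address`, or None if every symbol starts above it."""
    starts = [start for start, _ in symbols]
    index = bisect.bisect_right(starts, address) - 1
    return symbols[index][1] if index >= 0 else None


symbols = parse_system_map(SYSTEM_MAP_SNIPPET)
print(lookup(symbols, 0xFFFFFFFF8149A8ED))  # unix_stream_recvmsg
```

On a real system the text would be read from /boot/System.map-$(uname -r) instead of the embedded excerpt; reading that file may require root privileges. With only a partial excerpt, any address above the last listed symbol is attributed to that symbol, so a complete map is needed for reliable results.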