gops (Go Process Status) is a command-line tool, hosted at github.com/google/gops, for inspecting the runtime state of Go processes.
With it you can view:

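  • which Go processes are currently running, and which of them run the gops agent
  • a summary of a process
  • the call stack of a process
  • the memory usage of a process
  • the Go version a program was built with
  • runtime statistics

You can capture:

  • a trace
  • a CPU profile and a memory profile

You can also:

  • make a process run one GC cycle
  • set the GC percentage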

Example code

Configure the agent with Options.

```go
package main

import (
	"log"
	"runtime"
	"time"

	"github.com/google/gops/agent"
)

func main() {
	if err := agent.Listen(agent.Options{
		Addr: "0.0.0.0:8848",
		// ConfigDir: "/home/centos/gopsconfig", // best left at the default
		ShutdownCleanup: true,
	}); err != nil {
		log.Fatal(err)
	}

	// Test code: allocate a little and force two GC cycles.
	_ = make([]int, 1000, 1000)
	runtime.GC()
	_ = make([]int, 1000, 2000)
	runtime.GC()

	// Keep the process alive so gops can connect to it.
	time.Sleep(time.Hour)
}
```
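## agent Options

The agent has three options:

- Addr: the IP and port the agent listens on. By default the IP is the loopback address and the port is assigned at random.
- ConfigDir: this directory does not hold the agent's configuration. It holds one file per Go process that uses the agent. Each file is named after the pid and contains the port that process listens on, so together the files form a pid-to-port mapping. The default is ~/.config/gops.
- ShutdownCleanup: whether to remove the process's file from ConfigDir when the process exits. The default is false (no cleanup).

In practice, set Addr to the address you want to listen on and set ShutdownCleanup to true: once the process has exited, its leftover file in ConfigDir is useless and is best removed.

ConfigDir example:

```bash
# gopsconfig is the configured ConfigDir, 2051 is the pid, 8848 is the port.
➜ ~ cat gopsconfig/2051
8848%
➜ ~ netstat -nap | grep `pgrep gopsexample`
tcp6 0 0 :::8848 :::* LISTEN 2051/./gopsexample
```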

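How gops works

gops works as follows: your program imports gops/agent and starts an agent service, and the gops command connects to that agent to read the process information.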
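For how the agent is implemented, see the handle function in the agent package.
It builds each feature on the native interfaces of the Go standard library, much as you would when enabling pprof in your own program, except that gops/agent does the work for you:

  • runtime.MemStats for memory usage
  • runtime/pprof for call stacks, the CPU profile and the memory profile
  • runtime/trace for the trace
  • runtime for the stats
  • runtime/debug and runtime.GC to set the GC percentage and trigger a GC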
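To make the GC item in the list above concrete, here is a minimal sketch that uses only the standard library. The handleCommand function and its string command names are hypothetical and exist only for illustration; the agent's real wire protocol is not described in this article and is not reproduced here.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// handleCommand is a hypothetical dispatcher showing which standard-library
// call could back each GC-related subcommand. It is not the agent's code.
func handleCommand(cmd string, gcPercent int) string {
	switch cmd {
	case "gc":
		// runtime.GC blocks the caller until the collection is complete.
		runtime.GC()
		return "gc done"
	case "setgc":
		// debug.SetGCPercent returns the previous setting.
		old := debug.SetGCPercent(gcPercent)
		return fmt.Sprintf("gc percent: %d -> %d", old, gcPercent)
	case "version":
		// runtime.Version reports the Go version of the running binary.
		return runtime.Version()
	default:
		return "unknown command: " + cmd
	}
}

func main() {
	fmt.Println(handleCommand("version", 0))
	fmt.Println(handleCommand("gc", 0))
	fmt.Println(handleCommand("setgc", 200))
}
```

The gcPercent argument is ignored by every command except setgc; a real handler would parse it from the request instead.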

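ConfigDir revisited. Judging from the source code, the agent only writes to ConfigDir and never reads from it; the consumer is the gops command. When gops runs on the same machine as ConfigDir, that is, when it inspects Go processes on the local machine, it can use the files there to find the agent's port quickly. This is what turns gops <pid> into gops 127.0.0.1:<port>.
If your code points ConfigDir at a different directory, set the GOPS_CONFIG_DIR environment variable to that directory when you run gops.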
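The pid-to-port lookup described above can be sketched in a few lines. This is an illustration under stated assumptions, not the gops source: the function names configDir, addrFromDir and demo are made up for this example, and the fallback directory is simply the ~/.config/gops default named earlier in this article.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// configDir returns the directory to search for pid files. It honors
// GOPS_CONFIG_DIR and otherwise falls back to ~/.config/gops (an assumption
// taken from the default mentioned in the article).
func configDir() (string, error) {
	if dir := os.Getenv("GOPS_CONFIG_DIR"); dir != "" {
		return dir, nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", err
	}
	return filepath.Join(home, ".config", "gops"), nil
}

// addrFromDir reads the file named after pid in dir and turns the port it
// contains into a loopback address.
func addrFromDir(dir string, pid int) (string, error) {
	data, err := os.ReadFile(filepath.Join(dir, strconv.Itoa(pid)))
	if err != nil {
		return "", err
	}
	port := strings.TrimSpace(string(data))
	if _, err := strconv.Atoi(port); err != nil {
		return "", fmt.Errorf("pid file for %d does not hold a port: %q", pid, port)
	}
	return "127.0.0.1:" + port, nil
}

// demo plays the agent's role by writing a pid file into a temporary
// directory, then resolves it the way a local client could.
func demo() (string, error) {
	dir, err := os.MkdirTemp("", "gopsconfig")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "2051"), []byte("8848"), 0o600); err != nil {
		return "", err
	}
	return addrFromDir(dir, 2051)
}

func main() {
	addr, err := demo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("resolved:", addr)

	if dir, err := configDir(); err == nil {
		fmt.Println("a local lookup would search:", dir)
	}
}
```

The pid 2051 and port 8848 reuse the values from the ConfigDir example so the two can be compared side by side.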

Subcommands

A gops invocation takes a subcommand followed by a pid or a remote address.
It can also take a bare pid to show information about a process on the local machine.

    ~ gops help memstats
    gops is a tool to list and diagnose Go processes.

    Usage:
      gops <cmd> <pid|addr> ...
      gops <pid> # displays process info
      gops help  # displays this help message

    Commands:
      stack      Prints the stack trace.
      gc         Runs the garbage collector and blocks until successful.
      setgc      Sets the garbage collection target percentage.
      memstats   Prints the allocation and garbage collection stats.
      version    Prints the Go version used to build the program.
      stats      Prints runtime stats.
      trace      Runs the runtime tracer for 5 secs and launches "go tool trace".
      pprof-heap Reads the heap profile and launches "go tool pprof".
      pprof-cpu  Reads the CPU profile and launches "go tool pprof".

    All commands require the agent running on the Go process.
    "*" indicates the process is running the agent.

Listing Go processes on the current machine

Running gops with no arguments lists the Go processes on the current machine. For each one it shows the pid, the ppid, the process name, the Go version the executable was built with, and the path of the executable.

    ~ gops
    67292 66333 gops                          * go1.13    /Users/shitaibin/Workspace/golang_step_by_step/gops/gops
    67434 65931 gops                            go1.13    /Users/shitaibin/go/bin/gops
    66551 1     gocode                          go1.11.2  /Users/shitaibin/go/bin/gocode
    137   1     com.docker.vmnetd               go1.12.7  /Library/PrivilegedHelperTools/com.docker.vmnetd
    811   807   com.docker.backend              go1.12.13 /Applications/Docker.app/Contents/MacOS/com.docker.backend
    807   746   com.docker.supervisor           go1.12.13 /Applications/Docker.app/Contents/MacOS/com.docker.supervisor
    810   807   com.docker.driver.amd64-linux   go1.12.13 /Applications/Docker.app/Contents/MacOS/com.docker.driver.amd64-linux

Processes marked with * use gops/agent; those without the mark are ordinary Go programs.

Go process tree

View the process tree:

    ~ gops tree
    ...
    ├── 66333
    │   └── [*] 67292 (gops) {go1.13}
    ├── 1
    │   ├── 66551 (gocode) {go1.11.2}
    │   └── 137 (com.docker.vmnetd) {go1.12.7}
    ├── 65931
    │   └── 67476 (gops) {go1.13}
    └── 746
        └── 807 (com.docker.supervisor) {go1.12.13}
            ├── 811 (com.docker.backend) {go1.12.13}
            └── 810 (com.docker.driver.amd64-linux) {go1.12.13}

pid: process summary

View the summary of a process. This also works for processes that do not run the gops agent:

    ~ gops 67292
    parent PID: 66333
    threads: 7
    memory usage: 0.018%
    cpu usage: 0.000%
    username: shitaibin
    cmd+args: ./gops
    elapsed time: 11:28
    local/remote: 127.0.0.1:54753 <-> :0 (LISTEN)
    ~
    ~ gops 807
    parent PID: 746
    threads: 28
    memory usage: 0.057%
    cpu usage: 0.003%
    username: shitaibin
    cmd+args: /Applications/Docker.app/Contents/MacOS/com.docker.supervisor -watchdog fd:0
    elapsed time: 27-23:36:35
    local/remote: 127.0.0.1:54832 <-> :0 ()
    local/remote: *:53849 <-> :0 ()
    local/remote: 127.0.0.1:49473 <-> :0 (LISTEN)

stack: current call stack

View the call stacks of a process that runs the gops agent:

    ~ gops stack 67292
    goroutine 19 [running]:
    runtime/pprof.writeGoroutineStacks(0x1197160, 0xc00009c028, 0x0, 0x0)
        /Users/shitaibin/goroot/src/runtime/pprof/pprof.go:679 +0x9d
    runtime/pprof.writeGoroutine(0x1197160, 0xc00009c028, 0x2, 0x0, 0x0)
        /Users/shitaibin/goroot/src/runtime/pprof/pprof.go:668 +0x44
    runtime/pprof.(*Profile).WriteTo(0x1275c60, 0x1197160, 0xc00009c028, 0x2, 0xc00009c028, 0x0)
        /Users/shitaibin/goroot/src/runtime/pprof/pprof.go:329 +0x3da
    github.com/google/gops/agent.handle(0x1665008, 0xc00009c028, 0xc000014068, 0x1, 0x1, 0x0, 0x0)
        /Users/shitaibin/go/src/github.com/google/gops/agent/agent.go:185 +0x1ab
    github.com/google/gops/agent.listen()
        /Users/shitaibin/go/src/github.com/google/gops/agent/agent.go:133 +0x2bf
    created by github.com/google/gops/agent.Listen
        /Users/shitaibin/go/src/github.com/google/gops/agent/agent.go:111 +0x364

    goroutine 1 [sleep]:
    runtime.goparkunlock(...)
        /Users/shitaibin/goroot/src/runtime/proc.go:310
    time.Sleep(0x34630b8a000)
        /Users/shitaibin/goroot/src/runtime/time.go:105 +0x157
    main.main()
        /Users/shitaibin/Workspace/golang_step_by_step/gops/example.go:15 +0xa3

    goroutine 18 [syscall]:
    os/signal.signal_recv(0x0)
        /Users/shitaibin/goroot/src/runtime/sigqueue.go:144 +0x96
    os/signal.loop()
        /Users/shitaibin/goroot/src/os/signal/signal_unix.go:23 +0x22
    created by os/signal.init.0
        /Users/shitaibin/goroot/src/os/signal/signal_unix.go:29 +0x41
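The trace above passes through runtime/pprof.(*Profile).WriteTo with a debug argument of 0x2. The same kind of dump can be produced in your own program with the goroutine profile, as in this minimal sketch. The helper name goroutineStacks is made up for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/pprof"
)

// goroutineStacks returns the stacks of all goroutines. Debug level 2 asks
// for the same human-readable form that an unrecovered panic prints.
func goroutineStacks() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 2); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	stacks, err := goroutineStacks()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Print(stacks)
}
```

In an agent the writer would be the network connection rather than an in-memory buffer.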

memstats: memory usage

View the memory usage of a process that runs the gops agent:

    ~ gops memstats 67944
    alloc: 136.80KB (140088 bytes) // memory currently allocated and not yet reclaimed
    total-alloc: 152.08KB (155728 bytes) // cumulative memory allocated so far
    sys: 67.25MB (70518784 bytes) // total memory the process has obtained from the OS
    lookups: 0
    mallocs: 418 // number of objects allocated
    frees: 82 // number of objects freed
    heap-alloc: 136.80KB (140088 bytes) // heap memory currently allocated and not yet reclaimed
    heap-sys: 63.56MB (66650112 bytes) // memory the heap has obtained from the OS
    heap-idle: 62.98MB (66035712 bytes) // heap memory that is currently idle
    heap-in-use: 600.00KB (614400 bytes) // heap memory that is currently in use
    heap-released: 62.89MB (65945600 bytes)
    heap-objects: 336 // number of objects on the heap
    stack-in-use: 448.00KB (458752 bytes) // stack memory in use
    stack-sys: 448.00KB (458752 bytes) // total stack memory obtained from the OS
    stack-mspan-inuse: 10.89KB (11152 bytes)
    stack-mspan-sys: 16.00KB (16384 bytes)
    stack-mcache-inuse: 13.56KB (13888 bytes)
    stack-mcache-sys: 16.00KB (16384 bytes)
    other-sys: 1.01MB (1062682 bytes)
    gc-sys: 2.21MB (2312192 bytes)
    next-gc: when heap-alloc >= 4.00MB (4194304 bytes) // condition that triggers the next GC
    last-gc: 2020-03-16 10:06:26.743193 +0800 CST // time of the last GC
    gc-pause-total: 83.84µs // total GC pause time
    gc-pause: 44891 // pause time of the last GC, in nanoseconds
    num-gc: 2 // number of GC cycles completed
    enable-gc: true // whether GC is enabled
    debug-gc: false
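Most of the fields above map directly onto runtime.MemStats. The sketch below reads a snapshot and prints a handful of them under the same labels. It is an approximation for illustration: the helper names snapshot and memSummary are made up, and the formatting is simpler than the real output.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// snapshot reads the current allocator and GC statistics.
func snapshot() *runtime.MemStats {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return &m
}

// memSummary formats a few MemStats fields under labels borrowed from the
// output shown above.
func memSummary(m *runtime.MemStats) string {
	return fmt.Sprintf(
		"alloc: %d bytes\n"+
			"total-alloc: %d bytes\n"+
			"sys: %d bytes\n"+
			"mallocs: %d\n"+
			"frees: %d\n"+
			"heap-objects: %d\n"+
			"next-gc: when heap-alloc >= %d bytes\n"+
			"gc-pause-total: %v\n"+
			"num-gc: %d\n",
		m.Alloc, m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapObjects,
		m.NextGC, time.Duration(m.PauseTotalNs), m.NumGC)
}

func main() {
	runtime.GC() // force one cycle so the GC counters are non-zero
	fmt.Print(memSummary(snapshot()))
}
```

Note that runtime.ReadMemStats briefly stops the world, so it is better suited to on-demand diagnostics like this than to a hot path.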

stats: runtime information

View runtime statistics:

    ~ gops stats 68125
    goroutines: 3
    OS threads: 12
    GOMAXPROCS: 8
    num CPU: 8
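The four numbers above are all available from the standard library. Here is one plausible way to collect them; whether the agent computes them in exactly this way is an assumption, and the helper name runtimeStats is made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
)

// runtimeStats gathers the same four counters that the stats output shows.
func runtimeStats() string {
	return fmt.Sprintf(
		"goroutines: %d\nOS threads: %d\nGOMAXPROCS: %d\nnum CPU: %d\n",
		runtime.NumGoroutine(),
		// The threadcreate profile counts the OS threads created so far.
		pprof.Lookup("threadcreate").Count(),
		// Passing 0 queries GOMAXPROCS without changing it.
		runtime.GOMAXPROCS(0),
		runtime.NumCPU(),
	)
}

func main() {
	fmt.Print(runtimeStats())
}
```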

trace

Capture a 5-second trace of the running process. gops then opens the trace viewer in a browser:

    ~ gops trace 68125
    Tracing now, will take 5 secs...
    Trace dump saved to: /var/folders/5g/rz16gqtx3nsdfs7k8sb80jth0000gn/T/trace116447431
    2020/03/16 10:23:37 Parsing trace...
    2020/03/16 10:23:37 Splitting trace...
    2020/03/16 10:23:37 Opening browser. Trace viewer is listening on http://127.0.0.1:55480
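On the process side, a fixed-duration trace like the one above can be captured with runtime/trace. This is a minimal sketch assuming an in-memory buffer as the destination; the helper name captureTrace is made up, and the duration is shortened so the example finishes quickly.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/trace"
	"time"
)

// captureTrace records an execution trace for duration d and returns the raw
// trace data, which "go tool trace" can read once it is saved to a file.
func captureTrace(d time.Duration) ([]byte, error) {
	var buf bytes.Buffer
	if err := trace.Start(&buf); err != nil {
		return nil, err // for example, tracing is already enabled
	}
	time.Sleep(d)
	trace.Stop() // returns only after all trace data has been written
	return buf.Bytes(), nil
}

func main() {
	data, err := captureTrace(100 * time.Millisecond)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("captured %d bytes of trace data\n", len(data))
}
```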

cpu profile

Capture a CPU profile and enter pprof's interactive mode:

    ~ gops pprof-cpu 68125
    Profiling CPU now, will take 30 secs...
    Profile dump saved to: /var/folders/5g/rz16gqtx3nsdfs7k8sb80jth0000gn/T/profile431166544
    Binary file saved to: /var/folders/5g/rz16gqtx3nsdfs7k8sb80jth0000gn/T/binary765361519
    File: binary765361519
    Type: cpu
    Time: Mar 16, 2020 at 10:25am (CST)
    Duration: 30s, Total samples = 0
    No samples were found with the default sample value type.
    Try "sample_index" command to analyze different sample values.
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof)
    (pprof) top
    Showing nodes accounting for 0, 0% of 0 total
          flat  flat%   sum%        cum   cum%
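For comparison, this is roughly how a program can record its own CPU profile with runtime/pprof. It is a sketch with a made-up helper name, cpuProfile, and a short duration. A process that only sleeps, like the example program in this article, produces a profile with no samples, which matches the "Total samples = 0" line above.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime/pprof"
	"time"
)

// cpuProfile samples CPU usage for duration d and returns the encoded
// profile, which "go tool pprof" can read once it is saved to a file.
func cpuProfile(d time.Duration) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err // for example, profiling is already enabled
	}
	time.Sleep(d)
	pprof.StopCPUProfile() // returns only after the profile has been written
	return buf.Bytes(), nil
}

func main() {
	data, err := cpuProfile(200 * time.Millisecond)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("captured %d bytes of CPU profile\n", len(data))
}
```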

memory profile

Capture a memory profile and enter pprof's interactive mode:

    ~ gops pprof-heap 68125
    Profile dump saved to: /var/folders/5g/rz16gqtx3nsdfs7k8sb80jth0000gn/T/profile292136242
    Binary file saved to: /var/folders/5g/rz16gqtx3nsdfs7k8sb80jth0000gn/T/binary693335273
    File: binary693335273
    Type: inuse_space
    Time: Mar 16, 2020 at 10:27am (CST)
    No samples were found with the default sample value type.
    Try "sample_index" command to analyze different sample values.
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof)
    (pprof) traces
    File: binary693335273
    Type: inuse_space
    Time: Mar 16, 2020 at 10:27am (CST)
    -----------+-------------------------------------------------------
         bytes:  256kB
             0   compress/flate.(*compressor).init
                 compress/flate.NewWriter
                 compress/gzip.(*Writer).Write
                 runtime/pprof.(*profileBuilder).build
                 runtime/pprof.profileWriter
    -----------+-------------------------------------------------------
         bytes:  64kB
             0   compress/flate.newDeflateFast
                 compress/flate.(*compressor).init
                 compress/flate.NewWriter
                 compress/gzip.(*Writer).Write
                 runtime/pprof.(*profileBuilder).build
                 runtime/pprof.profileWriter
    -----------+-------------------------------------------------------
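A heap profile can be written from inside a program in much the same way. The sketch below uses pprof.WriteHeapProfile with a made-up helper name, heapProfile. Calling runtime.GC first is a choice made for this example so that the statistics are up to date; it is not a claim about what the agent does.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"runtime"
	"runtime/pprof"
)

// heapProfile returns the encoded heap profile of the current process,
// which "go tool pprof" can read once it is saved to a file.
func heapProfile() ([]byte, error) {
	runtime.GC() // refresh allocation statistics before taking the profile
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := heapProfile()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("captured %d bytes of heap profile\n", len(data))
}
```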

Connecting remotely

With the default configuration, agent.Options{}, the agent listens on the loopback address.

    ~ sudo netstat -nap | grep 414
    ~ netstat -nap | grep `pgrep gopsexample`
    (Not all processes could be identified, non-owned process info
    will not be shown, you would have to be root to see it all.)
    tcp 0 0 127.0.0.1:36812 0.0.0.0:* LISTEN 414/./gopsexample

Change the program so that Options sets the address and port to listen on:

    agent.Listen(agent.Options{Addr: "0.0.0.0:8848"})

On the remote host, rebuild and restart the process, then confirm which port it listens on:

    ~ netstat -nap | grep `pgrep gopsexample`
    (Not all processes could be identified, non-owned process info
    will not be shown, you would have to be root to see it all.)
    tcp6 0 0 :::8848 :::* LISTEN 887/./gopsexample

On the local host, use gops to connect to the remote Go process and read its data:

    ~ gops stats 192.168.9.137:8848
    goroutines: 3
    OS threads: 9
    GOMAXPROCS: 4
    num CPU: 4

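A bare gops invocation accepts only a pid, so it can show the summary of a local process but not of a remote one addressed by ip and port. For a remote process, you can piece similar information together from the subcommands.

    ~ gops 192.168.9.137:8848
    gops: unknown subcommand
    ~
    ~ gops version 192.168.9.137:8848
    go1.13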