1. 性能分析工具

graphviz windows安装

graphviz-install-5.0.0-win64.zip 安装完成后,需要把文件夹中的文件拷贝到自己本地的GOPATH目录下。(我的:D:\GoPackage) go get github.com/uber/go-torch

image.png
image.png

go tool pprof .\cpu.prof

  1. Type: cpu
  2. Time: Aug 10, 2022 at 7:48pm (CST)
  3. Duration: 1.32s, Total samples = 750ms (56.65%)
  4. Entering interactive mode (type "help" for commands, "o" for options)

go tool pprof prof .\cpu.prof

  1. prof: open prof: The system cannot find the file specified.
  2. Fetched 1 source profiles out of 2
  3. Type: cpu
  4. Time: Aug 10, 2022 at 7:48pm (CST)
  5. Duration: 1.32s, Total samples = 750ms (56.65%)
  6. Entering interactive mode (type "help" for commands, "o" for options)

top

  1. # flat 函数执行时间及所占比例
  2. Showing top 10 nodes out of 13
  3. flat flat% sum% cum cum%
  4. 410ms 54.67% 54.67% 490ms 65.33% math/rand.(*Rand).Int31n
  5. 200ms 26.67% 81.33% 720ms 96.00% main.fillMatrix
  6. 60ms 8.00% 89.33% 60ms 8.00% math/rand.(*rngSource).Uint64 (inline)
  7. 30ms 4.00% 93.33% 520ms 69.33% math/rand.(*Rand).Intn
  8. 20ms 2.67% 96.00% 20ms 2.67% main.calculate (inline)
  9. 10ms 1.33% 97.33% 80ms 10.67% math/rand.(*Rand).Int31 (inline)
  10. 10ms 1.33% 98.67% 70ms 9.33% math/rand.(*rngSource).Int63
  11. 10ms 1.33% 100% 10ms 1.33% runtime/pprof.(*profMap).lookup
  12. 0 0% 100% 740ms 98.67% main.main
  13. 0 0% 100% 70ms 9.33% math/rand.(*Rand).Int63 (inline)

list fillMatrix

  1. Total: 750ms
  2. ROUTINE ======================== main.fillMatrix in D:\JetbrainsCode\Learn_Go\go-learning\src\ch40\tools\file\prof.go
  3. 200ms 720ms (flat, cum) 96.00% of Total
  4. . . 15:)
  5. . . 16:
  6. . . 17:func fillMatrix(m *[row][col]int) {
  7. . . 18: s := rand.New(rand.NewSource(time.Now().UnixNano()))
  8. . . 19:
  9. 10ms 10ms 20: for i := 0; i < row; i++ {
  10. 20ms 20ms 21: for j := 0; j < col; j++ {
  11. 170ms 690ms 22: m[i][j] = s.Intn(100000)
  12. . . 23: }
  13. . . 24: }
  14. . . 25:}
  15. . . 26:
  16. . . 27:func calculate(m *[row][col]int) {

svg 生成svg图片

profile001.svg

go tool pprof prof .\mem.prof top list main.main

  1. Total: 4.22MB
  2. ROUTINE ======================== main.main in D:\JetbrainsCode\Learn_Go\go-learning\src\ch40\tools\file\prof.go
  3. 0 1.72MB (flat, cum) 40.76% of Total
  4. . . 39: if err != nil {
  5. . . 40: log.Fatal("could not create CPU profile: ", err)
  6. . . 41: }
  7. . . 42:
  8. . . 43: // 获取系统信息
  9. . 1.72MB 44: if err := pprof.StartCPUProfile(f); err != nil { //监控cpu
  10. . . 45: log.Fatal("could not start CPU profile: ", err)
  11. . . 46: }
  12. . . 47: defer pprof.StopCPUProfile()
  13. . . 48:
  14. . . 49: // 主逻辑区,进行一些简单的代码运算

profile002.svg

go-torch 试了一个多小时,还是不行,略过…

image.png
image.png

2. 性能调优示例

go test -bench=”.” -cpuprofile=”cpu.prof”

  1. goos: windows
  2. goarch: amd64
  3. pkg: go-learning/src/ch41
  4. cpu: Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
  5. BenchmarkProcessRequest-8 183476 6594 ns/op
  6. BenchmarkProcessRequestOld-8 59528 19808 ns/op
  7. PASS
  8. ok go-learning/src/ch41 2.912s

go tool pprof .\cpu.prof

  1. Type: cpu
  2. Time: Aug 11, 2022 at 11:13am (CST)
  3. Duration: 2.85s, Total samples = 2.15s (75.50%)
  4. Entering interactive mode (type "help" for commands, "o" for options)

top

  1. Showing nodes accounting for 900ms, 41.86% of 2150ms total
  2. Dropped 67 nodes (cum <= 10.75ms)
  3. Showing top 10 nodes out of 131
  4. flat flat% sum% cum cum%
  5. 130ms 6.05% 6.05% 240ms 11.16% github.com/mailru/easyjson/jlexer.(*Lexer).FetchToken
  6. 110ms 5.12% 11.16% 350ms 16.28% runtime.concatstrings
  7. 110ms 5.12% 16.28% 190ms 8.84% strconv.ParseInt
  8. 100ms 4.65% 20.93% 100ms 4.65% runtime.memmove
  9. 90ms 4.19% 25.12% 280ms 13.02% runtime.mallocgc
  10. 80ms 3.72% 28.84% 640ms 29.77% go-learning/src/ch41.easyjson6a975c40DecodeGoLearningSrcCh411
  11. 80ms 3.72% 32.56% 80ms 3.72% strconv.ParseUint
  12. 70ms 3.26% 35.81% 70ms 3.26% runtime.stdcall2
  13. 70ms 3.26% 39.07% 70ms 3.26% runtime.stdcall3
  14. 60ms 2.79% 41.86% 120ms 5.58% encoding/json.checkValid

top -cum

  1. Showing nodes accounting for 0.09s, 4.19% of 2.15s total
  2. Dropped 67 nodes (cum <= 0.01s)
  3. Showing top 10 nodes out of 131
  4. flat flat% sum% cum cum%
  5. 0 0% 0% 1.59s 73.95% testing.(*B).launch
  6. 0 0% 0% 1.59s 73.95% testing.(*B).runN
  7. 0 0% 0% 0.88s 40.93% go-learning/src/ch41.BenchmarkProcessRequestOld
  8. 0 0% 0% 0.88s 40.93% go-learning/src/ch41.processRequestOld
  9. 0 0% 0% 0.71s 33.02% go-learning/src/ch41.BenchmarkProcessRequest
  10. 0.01s 0.47% 0.47% 0.71s 33.02% go-learning/src/ch41.processRequest
  11. 0 0% 0.47% 0.64s 29.77% go-learning/src/ch41.(*Request).UnmarshalJSON (partial-inline)
  12. 0.08s 3.72% 4.19% 0.64s 29.77% go-learning/src/ch41.easyjson6a975c40DecodeGoLearningSrcCh411
  13. 0 0% 4.19% 0.51s 23.72% runtime.systemstack
  14. 0 0% 4.19% 0.38s 17.67% encoding/json.Unmarshal

list processRequest

  1. Total: 2.15s
  2. ROUTINE ======================== go-learning/src/ch41.processRequest in D:\JetbrainsCode\Learn_Go\go-learning\src\ch41\optmization.go
  3. 10ms 710ms (flat, cum) 33.02% of Total
  4. . . 21:
  5. . . 22:func processRequest(reqs []string) []string {
  6. . . 23: reps := []string{}
  7. . . 24: for _, req := range reqs {
  8. . . 25: reqObj := &Request{}
  9. . 470ms 26: reqObj.UnmarshalJSON([]byte(req))
  10. . . 27: // json.Unmarshal([]byte(req), reqObj)
  11. . . 28:
  12. . . 29: var buf strings.Builder
  13. 10ms 10ms 30: for _, e := range reqObj.PayLoad {
  14. . 100ms 31: buf.WriteString(strconv.Itoa(e))
  15. . 10ms 32: buf.WriteString(",")
  16. . . 33: }
  17. . . 34: repObj := &Response{reqObj.TransactionID, buf.String()}
  18. . 100ms 35: repJson, err := repObj.MarshalJSON()
  19. . . 36: //repJson, err := json.Marshal(&repObj)
  20. . . 37: if err != nil {
  21. . . 38: panic(err)
  22. . . 39: }
  23. . 20ms 40: reps = append(reps, string(repJson))
  24. . . 41: }
  25. . . 42: return reps
  26. . . 43:}
  27. . . 44:
  28. . . 45:func processRequestOld(reqs []string) []string {
  29. . 380ms 49: json.Unmarshal([]byte(req), reqObj)
  30. . . 50: ret := ""
  31. . . 51: for _, e := range reqObj.PayLoad {
  32. . 370ms 52: ret += strconv.Itoa(e) + ","
  33. . . 53: }
  34. . . 54: repObj := &Response{reqObj.TransactionID, ret}
  35. . 130ms 55: repJson, err := json.Marshal(&repObj)
  36. . . 56: if err != nil {
  37. . . 57: panic(err)
  38. . . 58: }
  39. . . 59: reps = append(reps, string(repJson))
  40. . . 60: }

easyjson -all structs.go go test -bench=”.” -memprofile=”mem.prof”

image.png
image.png

Wall Time 挂钟时间, 你的程序运行的绝对时间和某个函数运行的绝对时间,其中有可能有阻塞。调用外部系统,等待外部响应,这些会影响你的绝对时间 CPU Time CPU消耗时间,包括内存分配,GC次数,GC的耗时
Block Time Memory allocation GC times/time spent

image.png
image.png

3. 别让性能被锁住

image.png
image.png
image.png

Disruptor

4. GC友好的代码

image.png
image.png

GODEBUG=gctrace=1 go test -bench=”BenchmarkPassingArrayWithValue” 无法使用 go test -bench=”BenchmarkPassingArrayWithValue” “-trace=trace_val.out”

  1. goos: windows
  2. goarch: amd64
  3. pkg: go-learning/src/ch43/gc_friendly/passing_ref
  4. cpu: Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
  5. BenchmarkPassingArrayWithValue-8 85 12794193 ns/op
  6. PASS
  7. ok go-learning/src/ch43/gc_friendly/passing_ref 1.216s

go test -bench=”BenchmarkPassingArrayWithValue” “-trace=trace_ref.out”

  1. goos: windows
  2. goarch: amd64
  3. pkg: go-learning/src/ch43/gc_friendly/passing_ref
  4. cpu: Intel(R) Core(TM) i5-8265U CPU @ 1.60GHz
  5. BenchmarkPassingArrayWithValue-8 85 13425085 ns/op
  6. PASS
  7. ok go-learning/src/ch43/gc_friendly/passing_ref 1.311s

go tool trace .\trace_ref.out

  1. 2022/08/11 12:24:34 Parsing trace...
  2. 2022/08/11 12:24:34 Splitting trace...
  3. 2022/08/11 12:24:34 Opening browser. Trace viewer is listening on http://127.0.0.1:7042
  4. 此时会跳出一个网页页面

image.png
image.png
image.png
image.png