缘起

最近阅读<<我的第一本算法书>>(【日】石田保辉;宫崎修一)
本系列笔记拟采用golang练习之

快速排序(Quick Sort)

  1. 快速排序算法首先会在序列中随机选择一个基准值(pivot),
  2. 然后将除了基准值以外的数,
  3. 分为“比基准值小的数”和“比基准值大的数”这两个类别,
  4. 再将其排列成以下形式:
  5. [ 比基准值小的数 ] 基准值 [ 比基准值大的数 ]
  6. 接着,对两个“[]”中的数据进行排序之后,
  7. 整体的排序便完成了。
  8. 对“[]”里面的数据进行排序时同样也会使用快速排序。
  9. 快速排序是一种“分治法”。
  10. 它将原本的问题分成两个子问题(比基准值小的数和比基准值大的数),
  11. 然后再分别解决这两个问题(递归地)。
  12. 平均运行时间为O(nlogn)
  13. 摘自 <<我的第一本算法书>> 【日】石田保辉;宫崎修一

流程(非递归, 原地快速排序)

  1. 给定待排序数组data[N]
  2. 定义待排序栈stack, 其中元素是一个(left, right)整型坐标, 表示待排序子序列的范围
  3. 初始时, 将(0, N-1)压入stack, 表示需要将整个序列进行排序
  4. 当stack不为空时, 循环执行:
    1. 待排序子序列出栈: left, right = stack.pop()
    2. 取基准值v = data[left], 然后data[left]置为nil, 腾出一个格子备用
    3. 取左指针l = left, 右指针r = right, 当前指针标识(左/右)rside = true
    4. 如果rside == true, 将右指针r向左移动, 直到: data[r] < v, 或r=l
    5. 如果找到data[r] < v, 则把data[r]置入data[l]指向的空位, data[r]设nil, 腾出一个格子
    6. 如果rside == false, 将左指针l向右移动, 直到: data[l] > v, 或l=r
    7. 如果找到data[l] > v, 则把data[l]置入data[r]指向的空位, data[l]设nil, 腾出一个格子
    8. 如果l == r, 左右序列切分完成, 将基准值v置入data[l], 返回
    9. 循环执行步骤4-8
  5. stack为空, 排序完成

为什么要非递归

  • 极端情况下(比如特别大的数组, 刚好已经是倒序排列, 而每次取基准值是取left位置), 递归算法可能导致栈嵌套过深, 一个是占用大量内存, 二个是可能导致栈溢出错误.
  • 快速排序需要左右子序列的中间结果, 再进行合并, 因此无法通过尾递归优化消除栈嵌套

    目标

  • 实现并验证快速排序

  • 使用辅助的子序列坐标栈, 实现非递归执行
  • 原地排序, 不占用额外空间

设计

  • ISorter: 定义排序器接口. 定义值比较函数以兼容任意数值类型, 通过调整比较函数实现倒序排序
  • tQStack: 实现一个堆栈, 辅助快速排序时, 记录待排序的子序列坐标
  • tQuickSort: 非递归的原地快速排序器, 实现ISorter接口, 使用辅助栈消除递归

单元测试

quick_sort_test.go, 测试过程与堆排序, 归并排序类似, 样本规模为10万元素

  1. package sorting
  2. import (
  3. "fmt"
  4. "learning/gooop/sorting"
  5. "learning/gooop/sorting/quick_sort"
  6. "math/rand"
  7. "testing"
  8. "time"
  9. )
  10. func Test_QuickSort(t *testing.T) {
  11. fnAssertTrue := func(b bool, msg string) {
  12. if !b {
  13. t.Fatal(msg)
  14. }
  15. }
  16. reversed := false
  17. fnCompare := func(a interface{}, b interface{}) sorting.CompareResult {
  18. i1 := a.(int)
  19. i2 := b.(int)
  20. if i1 < i2 {
  21. if reversed {
  22. return sorting.GREATER
  23. } else {
  24. return sorting.LESS
  25. }
  26. } else if i1 == i2 {
  27. return sorting.EQUAL
  28. } else {
  29. if reversed {
  30. return sorting.LESS
  31. } else {
  32. return sorting.GREATER
  33. }
  34. }
  35. }
  36. fnTestSorter := func(sorter sorting.ISorter) {
  37. reversed = false
  38. // test simple array
  39. samples := []interface{} { 2,3,1,5,4,7,6 }
  40. samples = sorter.Sort(samples, fnCompare)
  41. fnAssertTrue(fmt.Sprintf("%v", samples) == "[1 2 3 4 5 6 7]", "expecting 1,2,3,4,5,6,7")
  42. t.Log("pass sorting [2 3 1 5 4 7 6] >> [1 2 3 4 5 6 7]")
  43. // test 10000 items sorting
  44. rnd := rand.New(rand.NewSource(time.Now().UnixNano()))
  45. for plus := 0;plus < 5;plus++ {
  46. sampleCount := 100 * 1000 + plus
  47. t.Logf("prepare large array with %v items", sampleCount)
  48. samples = make([]interface{}, sampleCount)
  49. for i := 0; i < sampleCount; i++ {
  50. samples[i] = rnd.Intn(sampleCount * 10)
  51. }
  52. t.Logf("sorting large array with %v items", sampleCount)
  53. t0 := time.Now().UnixNano()
  54. samples = sorter.Sort(samples, fnCompare)
  55. cost := time.Now().UnixNano() - t0
  56. for i := 1; i < sampleCount; i++ {
  57. fnAssertTrue(fnCompare(samples[i-1], samples[i]) != sorting.GREATER, "expecting <=")
  58. }
  59. t.Logf("end sorting large array, cost = %v ms", cost/1000000)
  60. }
  61. // test 0-20
  62. sampleCount := 20
  63. t.Log("sorting 0-20")
  64. samples = make([]interface{}, sampleCount)
  65. for i := 0;i < sampleCount;i++ {
  66. for {
  67. p := rnd.Intn(sampleCount)
  68. if samples[p] == nil {
  69. samples[p] = i
  70. break
  71. }
  72. }
  73. }
  74. t.Logf("unsort = %v", samples)
  75. samples = sorter.Sort(samples, fnCompare)
  76. t.Logf("sorted = %v", samples)
  77. fnAssertTrue(fmt.Sprintf("%v", samples) == "[0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]", "expecting 0-20")
  78. t.Log("pass sorting 0-20")
  79. // test special
  80. samples = []interface{} {}
  81. samples = sorter.Sort(samples, fnCompare)
  82. t.Log("pass sorting []")
  83. samples = []interface{} { 1 }
  84. samples = sorter.Sort(samples, fnCompare)
  85. t.Log("pass sorting [1]")
  86. samples = []interface{} { 3,1 }
  87. samples = sorter.Sort(samples, fnCompare)
  88. fnAssertTrue(fmt.Sprintf("%v", samples) == "[1 3]", "expecting 1,3")
  89. t.Log("pass sorting [1 3]")
  90. reversed = true
  91. samples = []interface{} { 2, 3,1 }
  92. samples = sorter.Sort(samples, fnCompare)
  93. fnAssertTrue(fmt.Sprintf("%v", samples) == "[3 2 1]", "expecting 3,2,1")
  94. t.Log("pass sorting [3 2 1]")
  95. }
  96. t.Log("\ntesting QuickSorter")
  97. fnTestSorter(quick_sort.QuickSorter)
  98. }

测试输出

  • 快速排序真的很快, 与堆排序,归并排序是一个数量级, 10万随机元素排序耗时仅数十毫秒
  • 对随机数据的排序比归并排序还稍快一些, 这可能是因为原地排序不需要预分配缓冲区
    1. $ go test -v quick_sort_test.go
    2. === RUN Test_QuickSort
    3. quick_sort_test.go:111:
    4. testing QuickSorter
    5. quick_sort_test.go:48: pass sorting [2 3 1 5 4 7 6] >> [1 2 3 4 5 6 7]
    6. quick_sort_test.go:54: prepare large array with 100000 items
    7. quick_sort_test.go:60: sorting large array with 100000 items
    8. quick_sort_test.go:67: end sorting large array, cost = 27 ms
    9. quick_sort_test.go:54: prepare large array with 100001 items
    10. quick_sort_test.go:60: sorting large array with 100001 items
    11. quick_sort_test.go:67: end sorting large array, cost = 28 ms
    12. quick_sort_test.go:54: prepare large array with 100002 items
    13. quick_sort_test.go:60: sorting large array with 100002 items
    14. quick_sort_test.go:67: end sorting large array, cost = 33 ms
    15. quick_sort_test.go:54: prepare large array with 100003 items
    16. quick_sort_test.go:60: sorting large array with 100003 items
    17. quick_sort_test.go:67: end sorting large array, cost = 32 ms
    18. quick_sort_test.go:54: prepare large array with 100004 items
    19. quick_sort_test.go:60: sorting large array with 100004 items
    20. quick_sort_test.go:67: end sorting large array, cost = 27 ms
    21. quick_sort_test.go:72: sorting 0-20
    22. quick_sort_test.go:83: unsort = [11 3 4 2 9 19 18 7 12 6 13 5 10 0 15 14 17 1 8 16]
    23. quick_sort_test.go:86: sorted = [0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19]
    24. quick_sort_test.go:88: pass sorting 0-20
    25. quick_sort_test.go:93: pass sorting []
    26. quick_sort_test.go:97: pass sorting [1]
    27. quick_sort_test.go:102: pass sorting [1 3]
    28. quick_sort_test.go:108: pass sorting [3 2 1]
    29. --- PASS: Test_QuickSort (0.18s)
    30. PASS
    31. ok command-line-arguments 0.184s

ISorter.go

定义排序器接口. 定义值比较函数以兼容任意数值类型, 通过调整比较函数实现倒序排序

  1. package sorting
  2. type ISorter interface {
  3. Sort(data []interface{}, comparator CompareFunction) []interface{}
  4. }
  5. type CompareFunction func(a interface{}, b interface{}) CompareResult
  6. type CompareResult int
  7. const LESS CompareResult = -1
  8. const EQUAL CompareResult = 0
  9. const GREATER CompareResult = 1

tQStack.go

实现一个堆栈, 辅助快速排序时, 记录待排序的子序列坐标

  1. package quick_sort
  2. type tQStack struct {
  3. stack []tIntPair
  4. capacity int
  5. size int
  6. }
  7. type tIntPair [2]int
  8. var gEmptyPair = [2]int{ -1, -1 }
  9. func newQStack() *tQStack {
  10. return &tQStack{
  11. stack: make([]tIntPair, 0),
  12. capacity: 0,
  13. size: 0,
  14. }
  15. }
  16. func (me *tQStack) push(left,right int) {
  17. node := tIntPair([2]int{left,right})
  18. if me.size < me.capacity {
  19. me.stack[me.size] = node
  20. } else {
  21. me.stack = append(me.stack, node)
  22. me.capacity++
  23. }
  24. me.size++
  25. }
  26. func (me *tQStack) pop() (left, right int) {
  27. me.size--
  28. it := me.stack[me.size]
  29. me.stack[me.size] = gEmptyPair
  30. return it[0], it[1]
  31. }
  32. func (me *tQStack) isEmpty() bool {
  33. return me.size <= 0
  34. }
  35. func (me *tQStack) isNotEmpty() bool {
  36. return me.size > 0
  37. }

tQuickSort.go

非递归的原地快速排序器, 实现ISorter接口, 使用辅助栈消除递归

  1. package quick_sort
  2. import (
  3. "learning/gooop/sorting"
  4. )
  5. type tQuickSort struct {
  6. }
  7. func newQuickSort() sorting.ISorter {
  8. return &tQuickSort{}
  9. }
  10. func (me *tQuickSort) Sort(data []interface{}, comparator sorting.CompareFunction) []interface{} {
  11. if data == nil {
  12. return nil
  13. }
  14. size := len(data)
  15. if size <= 1 {
  16. return data
  17. }
  18. if size == 2 {
  19. if comparator(data[0], data[1]) == sorting.GREATER {
  20. data[0],data[1] = data[1], data[0]
  21. return data
  22. }
  23. }
  24. stack := newQStack()
  25. stack.push(0, size - 1)
  26. me.qsort(data, comparator, stack)
  27. return data
  28. }
  29. func (me *tQuickSort) qsort(data []interface{}, comparator sorting.CompareFunction, stack *tQStack) {
  30. for ;stack.isNotEmpty(); {
  31. left, right := stack.pop()
  32. lfrom, lto, rfrom, rto := me.split(data, comparator, left, right)
  33. if lfrom < lto {
  34. stack.push(lfrom, lto)
  35. }
  36. if rfrom < rto {
  37. stack.push(rfrom, rto)
  38. }
  39. }
  40. }
  41. func (me *tQuickSort) split(data []interface{}, comparator sorting.CompareFunction, left int, right int) (lfrom, lto, rfrom, rto int) {
  42. if left >= right {
  43. return
  44. }
  45. v := data[left]
  46. l := left
  47. r := right
  48. rside := true
  49. for {
  50. hit := false
  51. if rside {
  52. for ; r > l; r-- {
  53. if comparator(data[r], v) == sorting.LESS {
  54. hit = true
  55. break
  56. }
  57. }
  58. if hit {
  59. data[l], data[r] = data[r], nil
  60. l++
  61. rside = false
  62. }
  63. } else {
  64. for ; l < r;l++ {
  65. if comparator(data[l], v) == sorting.GREATER {
  66. hit = true
  67. break
  68. }
  69. }
  70. if hit {
  71. data[r], data[l] = data[l], nil
  72. r--
  73. rside = true
  74. }
  75. }
  76. if l == r {
  77. data[l] = v
  78. break
  79. }
  80. }
  81. return left, l - 1, r + 1, right
  82. }
  83. var QuickSorter = newQuickSort()

(end)