ADR 030: Consensus Refactor

Context

One of the biggest challenges this project faces is to proof that the implementations of the specifications are correct, much like we strive to formaly verify our alogrithms and protocols we should work towards high confidence about the correctness of our program code. One of those is the core of Tendermint - Consensus - which currently resides in the consensus package. Over time there has been high friction making changes to the package due to the algorithm being scattered in a side-effectful container (the current ConsensusState). In order to test the algorithm a large object-graph needs to be set up and even than the non-deterministic parts of the container makes will prevent high certainty. Where ideally we have a 1-to-1 representation of the spec, ready and easy to test for domain experts.

Addresses:

Decision

To remedy these issues we plan a gradual, non-invasive refactoring of the consensus package. Starting of by isolating the consensus alogrithm into a pure function and a finite state machine to address the most pressuring issue of lack of confidence. Doing so while leaving the rest of the package in tact and have follow-up optional changes to improve the sepration of concerns.

Implementation changes

The core of Consensus can be modelled as a function with clear defined inputs:

  • State - data container for current round, height, etc.
  • Event- significant events in the network

producing clear outputs;

  • State - updated input
  • Message - signal what actions to perform
  1. type Event int
  2. const (
  3. EventUnknown Event = iota
  4. EventProposal
  5. Majority23PrevotesBlock
  6. Majority23PrecommitBlock
  7. Majority23PrevotesAny
  8. Majority23PrecommitAny
  9. TimeoutNewRound
  10. TimeoutPropose
  11. TimeoutPrevotes
  12. TimeoutPrecommit
  13. )
  14. type Message int
  15. const (
  16. MeesageUnknown Message = iota
  17. MessageProposal
  18. MessageVotes
  19. MessageDecision
  20. )
  21. type State struct {
  22. height uint64
  23. round uint64
  24. step uint64
  25. lockedValue interface{} // TODO: Define proper type.
  26. lockedRound interface{} // TODO: Define proper type.
  27. validValue interface{} // TODO: Define proper type.
  28. validRound interface{} // TODO: Define proper type.
  29. // From the original notes: valid(v)
  30. valid interface{} // TODO: Define proper type.
  31. // From the original notes: proposer(h, r)
  32. proposer interface{} // TODO: Define proper type.
  33. }
  34. func Consensus(Event, State) (State, Message) {
  35. // Consolidate implementation.
  36. }

Tracking of relevant information to feed Event into the function and act on the output is left to the ConsensusExecutor (formerly ConsensusState).

Benefits for testing surfacing nicely as testing for a sequence of events against algorithm could be as simple as the following example:

  1. func TestConsensusXXX(t *testing.T) {
  2. type expected struct {
  3. message Message
  4. state State
  5. }
  6. // Setup order of events, initial state and expectation.
  7. var (
  8. events = []struct {
  9. event Event
  10. want expected
  11. }{
  12. // ...
  13. }
  14. state = State{
  15. // ...
  16. }
  17. )
  18. for _, e := range events {
  19. sate, msg = Consensus(e.event, state)
  20. // Test message expectation.
  21. if msg != e.want.message {
  22. t.Fatalf("have %v, want %v", msg, e.want.message)
  23. }
  24. // Test state expectation.
  25. if !reflect.DeepEqual(state, e.want.state) {
  26. t.Fatalf("have %v, want %v", state, e.want.state)
  27. }
  28. }
  29. }

Consensus Executor

Consensus Core

  1. type Event interface{}
  2. type EventNewHeight struct {
  3. Height int64
  4. ValidatorId int
  5. }
  6. type EventNewRound HeightAndRound
  7. type EventProposal struct {
  8. Height int64
  9. Round int
  10. Timestamp Time
  11. BlockID BlockID
  12. POLRound int
  13. Sender int
  14. }
  15. type Majority23PrevotesBlock struct {
  16. Height int64
  17. Round int
  18. BlockID BlockID
  19. }
  20. type Majority23PrecommitBlock struct {
  21. Height int64
  22. Round int
  23. BlockID BlockID
  24. }
  25. type HeightAndRound struct {
  26. Height int64
  27. Round int
  28. }
  29. type Majority23PrevotesAny HeightAndRound
  30. type Majority23PrecommitAny HeightAndRound
  31. type TimeoutPropose HeightAndRound
  32. type TimeoutPrevotes HeightAndRound
  33. type TimeoutPrecommit HeightAndRound
  34. type Message interface{}
  35. type MessageProposal struct {
  36. Height int64
  37. Round int
  38. BlockID BlockID
  39. POLRound int
  40. }
  41. type VoteType int
  42. const (
  43. VoteTypeUnknown VoteType = iota
  44. Prevote
  45. Precommit
  46. )
  47. type MessageVote struct {
  48. Height int64
  49. Round int
  50. BlockID BlockID
  51. Type VoteType
  52. }
  53. type MessageDecision struct {
  54. Height int64
  55. Round int
  56. BlockID BlockID
  57. }
  58. type TriggerTimeout struct {
  59. Height int64
  60. Round int
  61. Duration Duration
  62. }
  63. type RoundStep int
  64. const (
  65. RoundStepUnknown RoundStep = iota
  66. RoundStepPropose
  67. RoundStepPrevote
  68. RoundStepPrecommit
  69. RoundStepCommit
  70. )
  71. type State struct {
  72. Height int64
  73. Round int
  74. Step RoundStep
  75. LockedValue BlockID
  76. LockedRound int
  77. ValidValue BlockID
  78. ValidRound int
  79. ValidatorId int
  80. ValidatorSetSize int
  81. }
  82. func proposer(height int64, round int) int {}
  83. func getValue() BlockID {}
  84. func Consensus(event Event, state State) (State, Message, TriggerTimeout) {
  85. msg = nil
  86. timeout = nil
  87. switch event := event.(type) {
  88. case EventNewHeight:
  89. if event.Height > state.Height {
  90. state.Height = event.Height
  91. state.Round = -1
  92. state.Step = RoundStepPropose
  93. state.LockedValue = nil
  94. state.LockedRound = -1
  95. state.ValidValue = nil
  96. state.ValidRound = -1
  97. state.ValidatorId = event.ValidatorId
  98. }
  99. return state, msg, timeout
  100. case EventNewRound:
  101. if event.Height == state.Height and event.Round > state.Round {
  102. state.Round = eventRound
  103. state.Step = RoundStepPropose
  104. if proposer(state.Height, state.Round) == state.ValidatorId {
  105. proposal = state.ValidValue
  106. if proposal == nil {
  107. proposal = getValue()
  108. }
  109. msg = MessageProposal { state.Height, state.Round, proposal, state.ValidRound }
  110. }
  111. timeout = TriggerTimeout { state.Height, state.Round, timeoutPropose(state.Round) }
  112. }
  113. return state, msg, timeout
  114. case EventProposal:
  115. if event.Height == state.Height and event.Round == state.Round and
  116. event.Sender == proposal(state.Height, state.Round) and state.Step == RoundStepPropose {
  117. if event.POLRound >= state.LockedRound or event.BlockID == state.BlockID or state.LockedRound == -1 {
  118. msg = MessageVote { state.Height, state.Round, event.BlockID, Prevote }
  119. }
  120. state.Step = RoundStepPrevote
  121. }
  122. return state, msg, timeout
  123. case TimeoutPropose:
  124. if event.Height == state.Height and event.Round == state.Round and state.Step == RoundStepPropose {
  125. msg = MessageVote { state.Height, state.Round, nil, Prevote }
  126. state.Step = RoundStepPrevote
  127. }
  128. return state, msg, timeout
  129. case Majority23PrevotesBlock:
  130. if event.Height == state.Height and event.Round == state.Round and state.Step >= RoundStepPrevote and event.Round > state.ValidRound {
  131. state.ValidRound = event.Round
  132. state.ValidValue = event.BlockID
  133. if state.Step == RoundStepPrevote {
  134. state.LockedRound = event.Round
  135. state.LockedValue = event.BlockID
  136. msg = MessageVote { state.Height, state.Round, event.BlockID, Precommit }
  137. state.Step = RoundStepPrecommit
  138. }
  139. }
  140. return state, msg, timeout
  141. case Majority23PrevotesAny:
  142. if event.Height == state.Height and event.Round == state.Round and state.Step == RoundStepPrevote {
  143. timeout = TriggerTimeout { state.Height, state.Round, timeoutPrevote(state.Round) }
  144. }
  145. return state, msg, timeout
  146. case TimeoutPrevote:
  147. if event.Height == state.Height and event.Round == state.Round and state.Step == RoundStepPrevote {
  148. msg = MessageVote { state.Height, state.Round, nil, Precommit }
  149. state.Step = RoundStepPrecommit
  150. }
  151. return state, msg, timeout
  152. case Majority23PrecommitBlock:
  153. if event.Height == state.Height {
  154. state.Step = RoundStepCommit
  155. state.LockedValue = event.BlockID
  156. }
  157. return state, msg, timeout
  158. case Majority23PrecommitAny:
  159. if event.Height == state.Height and event.Round == state.Round {
  160. timeout = TriggerTimeout { state.Height, state.Round, timeoutPrecommit(state.Round) }
  161. }
  162. return state, msg, timeout
  163. case TimeoutPrecommit:
  164. if event.Height == state.Height and event.Round == state.Round {
  165. state.Round = state.Round + 1
  166. }
  167. return state, msg, timeout
  168. }
  169. }
  170. func ConsensusExecutor() {
  171. proposal = nil
  172. votes = HeightVoteSet { Height: 1 }
  173. state = State {
  174. Height: 1
  175. Round: 0
  176. Step: RoundStepPropose
  177. LockedValue: nil
  178. LockedRound: -1
  179. ValidValue: nil
  180. ValidRound: -1
  181. }
  182. event = EventNewHeight {1, id}
  183. state, msg, timeout = Consensus(event, state)
  184. event = EventNewRound {state.Height, 0}
  185. state, msg, timeout = Consensus(event, state)
  186. if msg != nil {
  187. send msg
  188. }
  189. if timeout != nil {
  190. trigger timeout
  191. }
  192. for {
  193. select {
  194. case message := <- msgCh:
  195. switch msg := message.(type) {
  196. case MessageProposal:
  197. case MessageVote:
  198. if msg.Height == state.Height {
  199. newVote = votes.AddVote(msg)
  200. if newVote {
  201. switch msg.Type {
  202. case Prevote:
  203. prevotes = votes.Prevotes(msg.Round)
  204. if prevotes.WeakCertificate() and msg.Round > state.Round {
  205. event = EventNewRound { msg.Height, msg.Round }
  206. state, msg, timeout = Consensus(event, state)
  207. state = handleStateChange(state, msg, timeout)
  208. }
  209. if blockID, ok = prevotes.TwoThirdsMajority(); ok and blockID != nil {
  210. if msg.Round == state.Round and hasBlock(blockID) {
  211. event = Majority23PrevotesBlock { msg.Height, msg.Round, blockID }
  212. state, msg, timeout = Consensus(event, state)
  213. state = handleStateChange(state, msg, timeout)
  214. }
  215. if proposal != nil and proposal.POLRound == msg.Round and hasBlock(blockID) {
  216. event = EventProposal {
  217. Height: state.Height
  218. Round: state.Round
  219. BlockID: blockID
  220. POLRound: proposal.POLRound
  221. Sender: message.Sender
  222. }
  223. state, msg, timeout = Consensus(event, state)
  224. state = handleStateChange(state, msg, timeout)
  225. }
  226. }
  227. if prevotes.HasTwoThirdsAny() and msg.Round == state.Round {
  228. event = Majority23PrevotesAny { msg.Height, msg.Round, blockID }
  229. state, msg, timeout = Consensus(event, state)
  230. state = handleStateChange(state, msg, timeout)
  231. }
  232. case Precommit:
  233. }
  234. }
  235. }
  236. case timeout := <- timeoutCh:
  237. case block := <- blockCh:
  238. }
  239. }
  240. }
  241. func handleStateChange(state, msg, timeout) State {
  242. if state.Step == Commit {
  243. state = ExecuteBlock(state.LockedValue)
  244. }
  245. if msg != nil {
  246. send msg
  247. }
  248. if timeout != nil {
  249. trigger timeout
  250. }
  251. }

Implementation roadmap

  • implement proposed implementation
  • replace currently scattered calls in ConsensusState with calls to the new Consensus function
  • rename ConsensusState to ConsensusExecutor to avoid confusion
  • propose design for improved separation and clear information flow between ConsensusExecutor and ConsensusReactor

Status

Draft.

Consequences

Positive

  • isolated implementation of the algorithm
  • improved testability - simpler to proof correctness
  • clearer separation of concerns - easier to reason

Negative

Neutral