/ doc / src / arch / consensus.md
consensus.md
  1  # Consensus
  2  
  3  This section of the book describes how nodes participating in the DarkFi
  4  blockchain achieve consensus.
  5  
  6  ## Glossary
  7  
  8  | Name                   | Description                                                                            |
  9  |------------------------|----------------------------------------------------------------------------------------|
 10  | Consensus              | Algorithm for reaching blockchain consensus between participating nodes                |
 11  | Node/Validator         | DarkFi daemon participating in the network                                             |
 12  | Miner                  | Block producer                                                                         |
 13  | Unproposed Transaction | Transaction that exists in the memory pool but has not yet been included in a proposal |
 14  | Block proposal         | Block that has not yet been appended onto the canonical blockchain                     |
 15  | P2P network            | Peer-to-peer network on which nodes communicate with each other                        |
 16  | Confirmation           | State achieved when a block and its contents are appended to the canonical blockchain  |
 17  | Fork                   | Chain of block proposals that begins with the last block of the canonical blockchain   |
 18  | MAX_INT                | The maximum 32 bytes (256 bits) integer 2^256 − 1                                      |
 19  
 20  ## Miner main loop
 21  
 22  DarkFi uses RandomX Proof of Work algorithm.
 23  Therefore, block production involves the following steps:
 24  
 25  * First, a miner grabs its current best ranking fork and extends it
 26    with a block composed of unproposed transactions from the miner's
 27    mempool.
 28  
 29  * Then the miner tries to find a nonce such that when the block header
 30    is hashed its bytes produce a number that is less than the current
 31    difficulty target of the network, using the [RandomX mining algorithm](https://github.com/tevador/RandomX).
 32  
 33  * Once the miner finds such a nonce, it broadcasts its block proposal
 34    to the P2P network. Finally the miner triggers a confirmation check
 35    to see if its newly extended fork can be confirmed.
 36  
 37  Pseudocode:
 38  ```
 39  loop {
 40      fork = find_best_fork()
 41  
 42      block = generate_next_block(fork)
 43  
 44      mine_block(block)
 45  
 46      p2p.broadcast_proposal(block)
 47  
 48      fork.append_proposal(block)
 49  
 50      chain_confirmation()
 51  }
 52  ```
 53  
 54  ## Listening for block proposals
 55  
 56  Each node listens for new block proposals on the P2P network. Upon
 57  receiving block proposals, nodes try to extend the proposals onto a
 58  fork held in memory (this process is described in the next section).
 59  Then nodes trigger a confirmation check to see if their newly extended
 60  fork can be confirmed.
 61  
 62  Upon receiving a new block proposal, miners also check if the extended
 63  fork rank is better than the one they are currently trying to extend.
 64  If the fork rank is better, the miner will stop mining its proposal and
 65  start mining the new best fork.
 66  
 67  ## Ranking
 68  
 69  Each block proposal is ranked based on how hard it is to produce. To
 70  measure that, we compute the squared distance of its height target from
 71  `MAX_INT`. For two honest nodes that mine the next block height of the
 72  highest ranking fork, their block will have the same rank. To mitigate
 73  this tie scenario, we also compute the squared distance of the blocks
 74  `RandomX` hash from `MAX_INT`, allowing us to always choose the actual
 75  higher ranking block for that height, in case of ties. The complete
 76  block rank is a tuple containing both squared distances.
 77  
 78  Proof of Work algorithm lowers the difficulty target as hashpower grows.
 79  This means that blocks will have to be mined for a lower target,
 80  therefore rank higher, as they go further away from `MAX_INT`.
 81  
 82  Similar to blocks, blockchain/forks rank is a tuple, with the first
 83  part being the sum of its block's squared target distances, and the
 84  second being the sum of their squared hash distances. Squared distances
 85  are used to disproportionately favors smaller targets, with the idea
 86  being that it will be harder to trigger a longer reorg between forks.
 87  When we compare forks, we first check the first sum, and if it's tied,
 88  we use the second as the tie breaker, since we know it will be
 89  statistically unique for each sequence.
 90  
 91  The ranking of a fork is always increasing as new blocks are appended.
 92  To see this, let $F = (M₁ ⋯  Mₙ)$ be a fork with a finite sequence of
 93  blocks $(Mᵢ)$ of length $n$. The rank of a fork is calculated as
 94  $$ r_F = n ∑ᵢ₌₁ⁿ \t{rank}(Mᵢ) $$
 95  Let $F' = F ⊕  (Mₙ₊₁)$ of length $n + 1$ be the fork created by
 96  appending the block $Mₙ₊₁$ to $F$. Then we see that
 97  $$ r_{F'} > r_F $$
 98  since $\t{rank}(M) > 0$ for all $M$.
 99  
100  ## Fork extension
101  
102  Since there can be more than one block producer, each node holds a set
103  of known forks in memory. Nodes extend the best ranking fork in memory
104  when producing a block.
105  
106  Upon receiving a block, one of the following cases may occur:
107  
108  | Description                                   | Handling                                                            |
109  |-----------------------------------------------|---------------------------------------------------------------------|
110  | Block extends a known fork at its end         | Append block to fork                                                |
111  | Block extends a known fork not at its end     | Create a new fork up to the extended block and append the new block |
112  | Block extends canonical blockchain at its end | Create a new fork containing the new block                          |
113  | Block doesn't extend any known chain          | Check if a reorg should be executed                                 |
114  
115  ### Visual Examples
116  
117  | Symbol        | Description                            |
118  |---------------|----------------------------------------|
119  | [C]           | Canonical (confirmed) blockchain block |
120  | [C]--...--[C] | Sequence of canonical blocks           |
121  | [Mn]          | Proposal produced by Miner n           |
122  | Fn            | Fork name to identify them in examples |
123  | +--           | Appending a block to fork              |
124  | /--           | Dropped fork                           |
125  
126  Starting state:
127  
128                     |--[M0] <-- F0
129      [C]--...--[C]--|
130                     |--[M1] <-- F1
131  
132  Blocks on same Y axis have the same height.
133  
134  #### Case 1
135  
136  Extending F0 fork with a new block proposal:
137  
138                     |--[M0]+--[M2] <-- F0
139      [C]--...--[C]--|
140                     |--[M1]        <-- F1
141  
142  #### Case 2
143  
144  Extending F0 fork at [M0] block with a new block proposal, creating a
145  new fork chain:
146  
147                     |--[M0]--[M2]   <-- F0
148      [C]--...--[C]--|
149                     |--[M1]         <-- F1
150                     |
151                     |+--[M0]+--[M3] <-- F2
152  
153  ##### Case 3
154  
155  Extending the canonical blockchain with a new block proposal:
156  
157                     |--[M0]--[M2] <-- F0
158      [C]--...--[C]--|
159                     |--[M1]       <-- F1
160                     |
161                     |--[M0]--[M3] <-- F2
162                     |
163                     |+--[M4]      <-- F3
164  
165  ##### Case 4
166  
167  Reorg happened and we rebuild the chain:
168  
169                                |--[M0]--[M2] <-- F0
170      [C]--...--[C]/--...--[C]--|
171                 |              |--[M1]       <-- F1
172                 |              |
173                 |              |--[M0]--[M3] <-- F2
174                 |              |
175                 |              |--[M4]       <-- F3
176                 |
177                 ...--...--[C]--|+--[M5]      <-- F4
178  
179  
180  ## Confirmation
181  
182  Based on the rank properties, each node will diverge to the highest
183  ranking fork, and new fork will emerge extending that at its tips.
184  A security threshold is set, which refers to the height where the
185  probability to produce a fork, able to reorg the current best ranking
186  fork reaches zero, similar to the # of block confirmation used by other
187  PoW based protocols.
188  
189  When the confirmation check kicks in, each node will grab its best fork.
190  If the fork's length exceeds the security threshold, the node will push
191  (confirm) its first proposal to the canonical blockchain. The fork acts
192  as a queue (buffer) for the to-be-confirmed proposals.
193  
194  Once a confirmation occurs, all the fork chains not starting with the
195  confirmed block(s) are removed from the node's memory pool.
196  
197  We continue Case 3 from the previous section to visualize this logic.
198  
199  The confirmation threshold used in this example is 3 blocks. A node
200  observes 2 proposals. One extends the F0 fork and the other extends
201  the F2 fork:
202  
203                     |--[M0]--[M2]+--[M5] <-- F0
204      [C]--...--[C]--|
205                     |--[M1]              <-- F1
206                     |
207                     |--[M0]--[M3]+--[M6] <-- F2
208                     |
209                     |--[M4]              <-- F3
210  
211  
212  Later, the node only observes 1 proposal, extending the F2 fork:
213  
214                     |--[M0]--[M2]--[M5]        <-- F0
215      [C]--...--[C]--|
216                     |--[M1]                    <-- F1
217                     |
218                     |--[M0]--[M3]--[M6]+--[M7] <-- F2
219                     |
220                     |--[M4]                    <-- F3
221  
222  When the confirmation sync period starts, the node confirms block M0 and
223  keeps the forks that extend that:
224  
225                     |--[M0]--[M2]--[M5]       <-- F0
226      [C]--...--[C]--|
227                     |/--[M1]                  <-- F1
228                     |
229                     |--[M0]--[M3]--[M6]--[M7] <-- F2
230                     |
231                     |/--[M4]                  <-- F3
232  
233  The canonical blockchain now contains blocks M0 and the current state is:
234  
235                     |--[M2]--[M5]       <-- F0
236      [C]--...--[C]--|
237                     |--[M3]--[M6]--[M7] <-- F2
238  
239  # Appendix: Data Structures
240  
241  This section gives further details about the high level structures that
242  will be used by the protocol.
243  
244  Note that for hashes, we define custom types like `TransactionHash`,
245  but here we will just use the raw byte representation `[u8; 32]`.
246  
247  | Index         | Type  | Description                               |
248  |---------------|-------|-------------------------------------------|
249  | `block_index` | `u32` | Block height                              |
250  | `tx_index`    | `u16` | Index of a tx within a block              |
251  | `call_index`  | `u8`  | Index of contract call within a single tx |
252  
253  `u32` can store 4.29 billion blocks, which with a 90 second blocktime
254  corresponds to 12.2k years.
255  
256  `u16` max value 65535 which is far above the expected limits. By
257  comparison the tx in Bitcoin with the most outputs has 2501.
258  
259  ## Header
260  
261  | Field               | Type       | Description                                              |
262  |---------------------|------------|----------------------------------------------------------|
263  | `version`           | `u8`       | Block version                                            |
264  | `previous`          | `[u8; 32]` | Previous block hash                                      |
265  | `height`            | `u32`      | Block height                                             |
266  | `timestamp`         | `u64`      | Block creation timestamp                                 |
267  | `nonce`             | `u64`      | The block's nonce value                                  |
268  | `transactions_root` | `[u8; 32]` | Merkle tree root of the block's transactions hashes      |
269  | `state_root`        | `[u8; 32]` | Contracts states Monotree(SMT) root the block commits to |
270  | `pow_data`          | `Enum`     | Block Proof of Work type                                 |
271  
272  ## Block
273  
274  | Field       | Type           | Description              |
275  |-------------|----------------|--------------------------|
276  | `header`    | `[u8; 32]`     | Block header hash        |
277  | `txs`       | `Vec<[u8; 32]` | Transaction hashes       |
278  | `signature` | `Signature`    | Block producer signature |
279  
280  ## Blockchain
281  
282  | Field    | Type         | Description                                |
283  |----------|--------------|--------------------------------------------|
284  | `blocks` | `Vec<Block>` | Series of blocks consisting the Blockchain |
285  | `module` | `PoWModule`  | Blocks difficulties state used by RandomX  |
286  
287  ## Fork
288  
289  | Field       | Type           | Description                      |
290  |-------------|----------------|----------------------------------|
291  | `chain`     | `Blockchain`   | Forks current blockchain state   |
292  | `proposals` | `Vec<[u8; 32]` | Fork proposal hashes sequence    |
293  | `mempool`   | `Vec<[u8; 32]` | Valid pending transaction hashes |
294  
295  ## Validator
296  
297  | Field       | Type              | Description                            |
298  |-------------|-------------------|----------------------------------------|
299  | `canonical` | `Blockchain`      | Canonical (confirmed) blockchain       |
300  | `forks`     | `Vec<Blockchain>` | Fork chains containing block proposals |
301  
302  ## Contracts states Monotree(SMT)
303  
304  DarkFi is using an optimized Sparse Merkle Tree implementation,
305  called Monotree.
306  This tree is constructed using all contracts states trees checksums,
307  along with the wasm bincodes tree checksum, excluding native contracts
308  zkas tree and wasm bincodes.
309  The checksum of each tree is computed by hashing all its key and values,
310  using a `blake3` hasher.
311  For each block, we compute the current tree root and keep it in its header,
312  enabling us to both verify the contacts state after the block insertion,
313  and create proofs commiting to that specific state.
314  
315  ### Usage example
316  
317  In this pseudocode we can see how we can use the Monotree to produce
318  a proof for a specific state. First, we will create a random set of
319  key-value `blake3` hash pairs, representing our contract states
320  checksums:
321  
322  ```
323  keys = random_hashes(100);
324  values = random_hashes(100);
325  ```
326  
327  We compute our monotree, by inserting all the pairs:
328  ```
329  root = None;
330  tree = Monotree::new();
331  for (key, value) in keys.zip(values) {
332      root = tree.insert(root, key, value);
333      tree.set_headroot(root);
334  }
335  ```
336  
337  Each node in the network holds a similar tree agains the current
338  database state, as described earlier. On each new block, we commit
339  to the state by appending the monotree root to the block header.
340  Lets assume we now created a block commiting to that state,
341  and want to get a Merkle proof for a specific key-value pair
342  we want to commit to:
343  ```
344  root = tree.get_headroot()?
345  block.header.state_root = root;
346  proof = tree.get_merkle_proof(root, our_key);
347  ```
348  
349  No matter how many blocks have passed, or how the state/tree has
350  mutated, we can always verify that proof against the specific block
351  it corresponds, by using its header root:
352  ```
353  assert!(verify_proof(block.header.state_root, our_key_value, proof));
354  ```
355  
356  # Appendix: Ranking Blocks
357  
358  ## Sequences
359  
360  Denote blocks by the symbols $bᵢ ∈ B$, then a sequence of blocks
361  (alternatively a fork) is an ordered series $𝐛 = (b₁, …, bₘ)$.
362  
363  Use $S$ for all sets of sequences for blocks in $B$.
364  
365  ## Properties for Rank
366  
367  Each block is associated with a target $T : B → 𝕀$ where $𝕀 ⊂ ℕ$.
368  
369  1. Blocks with lower targets are harder to create and ranked higher in
370     a sequence of blocks.
371  2. Given two competing forks $𝐚 = (a₁, …, aₘ)$ and $b = (b₁, …, bₙ)$,
372     we wish to select a winner. Assume $𝐚$ is the winner, then
373     $∑ T(aᵢ) ≤ ∑ T(bᵢ)$.
374  3. There should only ever be a single winner.
375     When $∑ T(aᵢ) = ∑ T(bᵢ)$, then we have logic to break the tie.
376  
377  Property (2) can also be statistically true for $p > 0.5$.
378  
379  This is used to define a *fork-ranking* function $W : S → ℕ$.
380  This function must *always* have unique values for distinct sequences.
381  
382  ### Additivity
383  
384  We also would like the property $W$ is additive on subsequences
385  $$ W((b₁, …, bₘ)) = W((b₁)) + ⋯ + W((bₘ)) $$
386  which allows comparing forks from any point within the blockchain. For
387  example let $𝐬 = (s₁, …, sₖ)$ be the blockchain together with forks
388  $𝐚, 𝐛$ extending $𝐬$ into $𝐬 ⊕  𝐚 = (s₁, …, sₖ, a₁, …, aₘ)$ and
389  $𝐬 ⊕  𝐛 = (s₁, …, sₖ, b₁, …, bₙ)$. Then we have that
390  $$ W(𝐬 ⊕  𝐚) < W(𝐬 ⊕  𝐛) ⟺  W(𝐚) < W(𝐛) $$
391  which means it's sufficient to compare $𝐚$ and $𝐛$ directly.
392  
393  ## Proposed Rank
394  
395  With a PoW mining system, we are guaranteed to always have the block
396  hash $h(b) ≤ T(b)$. Since the block hashes $( h(b₁), …, h(bₘ) )$ for a
397  sequence $( b₁, …, bₘ )$ have the property that $∑ h(bᵢ) ≤ ∑ T(bᵢ)$,
398  as well as being sufficiently random, we can use them to define our
399  work function.
400  
401  Because $W$ is required to be additive, we define a block work function
402  $w : B → ℕ$, and $W(𝐛) = ∑ w(bᵢ)$.
403  
404  The block work function should have a statistically higher score for
405  blocks with a smaller target, and always be distinct for unique blocks.
406  We define $w$ as
407  $$ w(b) = \max(𝕀) - h(b) $$
408  since $h(b) < T(b) < \max(𝕀)$ this function is well defined on the
409  codomain.
410  
411  ## Hash Function
412  
413  Let $𝕀$ be a fixed subset of $ℕ$ representing the output of a hash
414  function $[0, \max(𝕀)]$.
415  
416  **Definition:** a *hash function* is a function $H : ℕ → 𝕀$ having the
417  following properties:
418  
419  1. *Uniformity*, for any $y ∈ 𝕀$ and any $n ∈ ℕ$, there exists an $N > n$
420     such that $H(N) = y$.
421  2. *One-way*, for any $y ∈ 𝕀$, we are unable to construct an $x ∈ ℕ$
422     such that $H(x) = y$.
423  
424  Note: the above notions rely on purely algebraic properties of $H$
425  without requiring the machinery of probability. The second property of
426  being one-way is a stronger notion than $\ran(H)$ being statistically
427  random. Indeed if the probability is non-zero then we could find such
428  an $(x, y)$ which breaks the one-way property.
429  
430  **Theorem:** *given a hash function $H : ℕ → 𝕀$ as defined above, it's
431  impossible to construct two distinct sequences $𝐚 = (a₁, …, aₘ)$ and
432  $𝐛 = (b₁, …, bₙ)$ such that $H(a₁) + ⋯ + H(aₘ) = H(b₁) + ⋯ + H(bₙ)$.*
433  
434  By property (2), we cannot find a $H(x) = 0$.
435  Again by (2), we cannot construct an $x$ such that $H(x) + H(a) = H(b)$
436  for any $a, b ∈ ℕ$. Recursive application of (2) leads us to the stated
437  theorem.
438