consensus.md
1 # Consensus 2 3 This section of the book describes how nodes participating in the DarkFi 4 blockchain achieve consensus. 5 6 ## Glossary 7 8 | Name | Description | 9 |------------------------|----------------------------------------------------------------------------------------| 10 | Consensus | Algorithm for reaching blockchain consensus between participating nodes | 11 | Node/Validator | DarkFi daemon participating in the network | 12 | Miner | Block producer | 13 | Unproposed Transaction | Transaction that exists in the memory pool but has not yet been included in a proposal | 14 | Block proposal | Block that has not yet been appended onto the canonical blockchain | 15 | P2P network | Peer-to-peer network on which nodes communicate with each other | 16 | Confirmation | State achieved when a block and its contents are appended to the canonical blockchain | 17 | Fork | Chain of block proposals that begins with the last block of the canonical blockchain | 18 | MAX_INT | The maximum 32 bytes (256 bits) integer 2^256 − 1 | 19 20 ## Miner main loop 21 22 DarkFi uses RandomX Proof of Work algorithm. 23 Therefore, block production involves the following steps: 24 25 * First, a miner grabs its current best ranking fork and extends it 26 with a block composed of unproposed transactions from the miner's 27 mempool. 28 29 * Then the miner tries to find a nonce such that when the block header 30 is hashed its bytes produce a number that is less than the current 31 difficulty target of the network, using the [RandomX mining algorithm](https://github.com/tevador/RandomX). 32 33 * Once the miner finds such a nonce, it broadcasts its block proposal 34 to the P2P network. Finally the miner triggers a confirmation check 35 to see if its newly extended fork can be confirmed. 36 37 Pseudocode: 38 ``` 39 loop { 40 fork = find_best_fork() 41 42 block = generate_next_block(fork) 43 44 mine_block(block) 45 46 p2p.broadcast_proposal(block) 47 48 fork.append_proposal(block) 49 50 chain_confirmation() 51 } 52 ``` 53 54 ## Listening for block proposals 55 56 Each node listens for new block proposals on the P2P network. Upon 57 receiving block proposals, nodes try to extend the proposals onto a 58 fork held in memory (this process is described in the next section). 59 Then nodes trigger a confirmation check to see if their newly extended 60 fork can be confirmed. 61 62 Upon receiving a new block proposal, miners also check if the extended 63 fork rank is better than the one they are currently trying to extend. 64 If the fork rank is better, the miner will stop mining its proposal and 65 start mining the new best fork. 66 67 ## Ranking 68 69 Each block proposal is ranked based on how hard it is to produce. To 70 measure that, we compute the squared distance of its height target from 71 `MAX_INT`. For two honest nodes that mine the next block height of the 72 highest ranking fork, their block will have the same rank. To mitigate 73 this tie scenario, we also compute the squared distance of the blocks 74 `RandomX` hash from `MAX_INT`, allowing us to always choose the actual 75 higher ranking block for that height, in case of ties. The complete 76 block rank is a tuple containing both squared distances. 77 78 Proof of Work algorithm lowers the difficulty target as hashpower grows. 79 This means that blocks will have to be mined for a lower target, 80 therefore rank higher, as they go further away from `MAX_INT`. 81 82 Similar to blocks, blockchain/forks rank is a tuple, with the first 83 part being the sum of its block's squared target distances, and the 84 second being the sum of their squared hash distances. Squared distances 85 are used to disproportionately favors smaller targets, with the idea 86 being that it will be harder to trigger a longer reorg between forks. 87 When we compare forks, we first check the first sum, and if it's tied, 88 we use the second as the tie breaker, since we know it will be 89 statistically unique for each sequence. 90 91 The ranking of a fork is always increasing as new blocks are appended. 92 To see this, let $F = (M₁ ⋯ Mₙ)$ be a fork with a finite sequence of 93 blocks $(Mᵢ)$ of length $n$. The rank of a fork is calculated as 94 $$ r_F = n ∑ᵢ₌₁ⁿ \t{rank}(Mᵢ) $$ 95 Let $F' = F ⊕ (Mₙ₊₁)$ of length $n + 1$ be the fork created by 96 appending the block $Mₙ₊₁$ to $F$. Then we see that 97 $$ r_{F'} > r_F $$ 98 since $\t{rank}(M) > 0$ for all $M$. 99 100 ## Fork extension 101 102 Since there can be more than one block producer, each node holds a set 103 of known forks in memory. Nodes extend the best ranking fork in memory 104 when producing a block. 105 106 Upon receiving a block, one of the following cases may occur: 107 108 | Description | Handling | 109 |-----------------------------------------------|---------------------------------------------------------------------| 110 | Block extends a known fork at its end | Append block to fork | 111 | Block extends a known fork not at its end | Create a new fork up to the extended block and append the new block | 112 | Block extends canonical blockchain at its end | Create a new fork containing the new block | 113 | Block doesn't extend any known chain | Check if a reorg should be executed | 114 115 ### Visual Examples 116 117 | Symbol | Description | 118 |---------------|----------------------------------------| 119 | [C] | Canonical (confirmed) blockchain block | 120 | [C]--...--[C] | Sequence of canonical blocks | 121 | [Mn] | Proposal produced by Miner n | 122 | Fn | Fork name to identify them in examples | 123 | +-- | Appending a block to fork | 124 | /-- | Dropped fork | 125 126 Starting state: 127 128 |--[M0] <-- F0 129 [C]--...--[C]--| 130 |--[M1] <-- F1 131 132 Blocks on same Y axis have the same height. 133 134 #### Case 1 135 136 Extending F0 fork with a new block proposal: 137 138 |--[M0]+--[M2] <-- F0 139 [C]--...--[C]--| 140 |--[M1] <-- F1 141 142 #### Case 2 143 144 Extending F0 fork at [M0] block with a new block proposal, creating a 145 new fork chain: 146 147 |--[M0]--[M2] <-- F0 148 [C]--...--[C]--| 149 |--[M1] <-- F1 150 | 151 |+--[M0]+--[M3] <-- F2 152 153 ##### Case 3 154 155 Extending the canonical blockchain with a new block proposal: 156 157 |--[M0]--[M2] <-- F0 158 [C]--...--[C]--| 159 |--[M1] <-- F1 160 | 161 |--[M0]--[M3] <-- F2 162 | 163 |+--[M4] <-- F3 164 165 ##### Case 4 166 167 Reorg happened and we rebuild the chain: 168 169 |--[M0]--[M2] <-- F0 170 [C]--...--[C]/--...--[C]--| 171 | |--[M1] <-- F1 172 | | 173 | |--[M0]--[M3] <-- F2 174 | | 175 | |--[M4] <-- F3 176 | 177 ...--...--[C]--|+--[M5] <-- F4 178 179 180 ## Confirmation 181 182 Based on the rank properties, each node will diverge to the highest 183 ranking fork, and new fork will emerge extending that at its tips. 184 A security threshold is set, which refers to the height where the 185 probability to produce a fork, able to reorg the current best ranking 186 fork reaches zero, similar to the # of block confirmation used by other 187 PoW based protocols. 188 189 When the confirmation check kicks in, each node will grab its best fork. 190 If the fork's length exceeds the security threshold, the node will push 191 (confirm) its first proposal to the canonical blockchain. The fork acts 192 as a queue (buffer) for the to-be-confirmed proposals. 193 194 Once a confirmation occurs, all the fork chains not starting with the 195 confirmed block(s) are removed from the node's memory pool. 196 197 We continue Case 3 from the previous section to visualize this logic. 198 199 The confirmation threshold used in this example is 3 blocks. A node 200 observes 2 proposals. One extends the F0 fork and the other extends 201 the F2 fork: 202 203 |--[M0]--[M2]+--[M5] <-- F0 204 [C]--...--[C]--| 205 |--[M1] <-- F1 206 | 207 |--[M0]--[M3]+--[M6] <-- F2 208 | 209 |--[M4] <-- F3 210 211 212 Later, the node only observes 1 proposal, extending the F2 fork: 213 214 |--[M0]--[M2]--[M5] <-- F0 215 [C]--...--[C]--| 216 |--[M1] <-- F1 217 | 218 |--[M0]--[M3]--[M6]+--[M7] <-- F2 219 | 220 |--[M4] <-- F3 221 222 When the confirmation sync period starts, the node confirms block M0 and 223 keeps the forks that extend that: 224 225 |--[M0]--[M2]--[M5] <-- F0 226 [C]--...--[C]--| 227 |/--[M1] <-- F1 228 | 229 |--[M0]--[M3]--[M6]--[M7] <-- F2 230 | 231 |/--[M4] <-- F3 232 233 The canonical blockchain now contains blocks M0 and the current state is: 234 235 |--[M2]--[M5] <-- F0 236 [C]--...--[C]--| 237 |--[M3]--[M6]--[M7] <-- F2 238 239 # Appendix: Data Structures 240 241 This section gives further details about the high level structures that 242 will be used by the protocol. 243 244 Note that for hashes, we define custom types like `TransactionHash`, 245 but here we will just use the raw byte representation `[u8; 32]`. 246 247 | Index | Type | Description | 248 |---------------|-------|-------------------------------------------| 249 | `block_index` | `u32` | Block height | 250 | `tx_index` | `u16` | Index of a tx within a block | 251 | `call_index` | `u8` | Index of contract call within a single tx | 252 253 `u32` can store 4.29 billion blocks, which with a 90 second blocktime 254 corresponds to 12.2k years. 255 256 `u16` max value 65535 which is far above the expected limits. By 257 comparison the tx in Bitcoin with the most outputs has 2501. 258 259 ## Header 260 261 | Field | Type | Description | 262 |---------------------|------------|----------------------------------------------------------| 263 | `version` | `u8` | Block version | 264 | `previous` | `[u8; 32]` | Previous block hash | 265 | `height` | `u32` | Block height | 266 | `timestamp` | `u64` | Block creation timestamp | 267 | `nonce` | `u64` | The block's nonce value | 268 | `transactions_root` | `[u8; 32]` | Merkle tree root of the block's transactions hashes | 269 | `state_root` | `[u8; 32]` | Contracts states Monotree(SMT) root the block commits to | 270 | `pow_data` | `Enum` | Block Proof of Work type | 271 272 ## Block 273 274 | Field | Type | Description | 275 |-------------|----------------|--------------------------| 276 | `header` | `[u8; 32]` | Block header hash | 277 | `txs` | `Vec<[u8; 32]` | Transaction hashes | 278 | `signature` | `Signature` | Block producer signature | 279 280 ## Blockchain 281 282 | Field | Type | Description | 283 |----------|--------------|--------------------------------------------| 284 | `blocks` | `Vec<Block>` | Series of blocks consisting the Blockchain | 285 | `module` | `PoWModule` | Blocks difficulties state used by RandomX | 286 287 ## Fork 288 289 | Field | Type | Description | 290 |-------------|----------------|----------------------------------| 291 | `chain` | `Blockchain` | Forks current blockchain state | 292 | `proposals` | `Vec<[u8; 32]` | Fork proposal hashes sequence | 293 | `mempool` | `Vec<[u8; 32]` | Valid pending transaction hashes | 294 295 ## Validator 296 297 | Field | Type | Description | 298 |-------------|-------------------|----------------------------------------| 299 | `canonical` | `Blockchain` | Canonical (confirmed) blockchain | 300 | `forks` | `Vec<Blockchain>` | Fork chains containing block proposals | 301 302 ## Contracts states Monotree(SMT) 303 304 DarkFi is using an optimized Sparse Merkle Tree implementation, 305 called Monotree. 306 This tree is constructed using all contracts states trees checksums, 307 along with the wasm bincodes tree checksum, excluding native contracts 308 zkas tree and wasm bincodes. 309 The checksum of each tree is computed by hashing all its key and values, 310 using a `blake3` hasher. 311 For each block, we compute the current tree root and keep it in its header, 312 enabling us to both verify the contacts state after the block insertion, 313 and create proofs commiting to that specific state. 314 315 ### Usage example 316 317 In this pseudocode we can see how we can use the Monotree to produce 318 a proof for a specific state. First, we will create a random set of 319 key-value `blake3` hash pairs, representing our contract states 320 checksums: 321 322 ``` 323 keys = random_hashes(100); 324 values = random_hashes(100); 325 ``` 326 327 We compute our monotree, by inserting all the pairs: 328 ``` 329 root = None; 330 tree = Monotree::new(); 331 for (key, value) in keys.zip(values) { 332 root = tree.insert(root, key, value); 333 tree.set_headroot(root); 334 } 335 ``` 336 337 Each node in the network holds a similar tree agains the current 338 database state, as described earlier. On each new block, we commit 339 to the state by appending the monotree root to the block header. 340 Lets assume we now created a block commiting to that state, 341 and want to get a Merkle proof for a specific key-value pair 342 we want to commit to: 343 ``` 344 root = tree.get_headroot()? 345 block.header.state_root = root; 346 proof = tree.get_merkle_proof(root, our_key); 347 ``` 348 349 No matter how many blocks have passed, or how the state/tree has 350 mutated, we can always verify that proof against the specific block 351 it corresponds, by using its header root: 352 ``` 353 assert!(verify_proof(block.header.state_root, our_key_value, proof)); 354 ``` 355 356 # Appendix: Ranking Blocks 357 358 ## Sequences 359 360 Denote blocks by the symbols $bᵢ ∈ B$, then a sequence of blocks 361 (alternatively a fork) is an ordered series $𝐛 = (b₁, …, bₘ)$. 362 363 Use $S$ for all sets of sequences for blocks in $B$. 364 365 ## Properties for Rank 366 367 Each block is associated with a target $T : B → 𝕀$ where $𝕀 ⊂ ℕ$. 368 369 1. Blocks with lower targets are harder to create and ranked higher in 370 a sequence of blocks. 371 2. Given two competing forks $𝐚 = (a₁, …, aₘ)$ and $b = (b₁, …, bₙ)$, 372 we wish to select a winner. Assume $𝐚$ is the winner, then 373 $∑ T(aᵢ) ≤ ∑ T(bᵢ)$. 374 3. There should only ever be a single winner. 375 When $∑ T(aᵢ) = ∑ T(bᵢ)$, then we have logic to break the tie. 376 377 Property (2) can also be statistically true for $p > 0.5$. 378 379 This is used to define a *fork-ranking* function $W : S → ℕ$. 380 This function must *always* have unique values for distinct sequences. 381 382 ### Additivity 383 384 We also would like the property $W$ is additive on subsequences 385 $$ W((b₁, …, bₘ)) = W((b₁)) + ⋯ + W((bₘ)) $$ 386 which allows comparing forks from any point within the blockchain. For 387 example let $𝐬 = (s₁, …, sₖ)$ be the blockchain together with forks 388 $𝐚, 𝐛$ extending $𝐬$ into $𝐬 ⊕ 𝐚 = (s₁, …, sₖ, a₁, …, aₘ)$ and 389 $𝐬 ⊕ 𝐛 = (s₁, …, sₖ, b₁, …, bₙ)$. Then we have that 390 $$ W(𝐬 ⊕ 𝐚) < W(𝐬 ⊕ 𝐛) ⟺ W(𝐚) < W(𝐛) $$ 391 which means it's sufficient to compare $𝐚$ and $𝐛$ directly. 392 393 ## Proposed Rank 394 395 With a PoW mining system, we are guaranteed to always have the block 396 hash $h(b) ≤ T(b)$. Since the block hashes $( h(b₁), …, h(bₘ) )$ for a 397 sequence $( b₁, …, bₘ )$ have the property that $∑ h(bᵢ) ≤ ∑ T(bᵢ)$, 398 as well as being sufficiently random, we can use them to define our 399 work function. 400 401 Because $W$ is required to be additive, we define a block work function 402 $w : B → ℕ$, and $W(𝐛) = ∑ w(bᵢ)$. 403 404 The block work function should have a statistically higher score for 405 blocks with a smaller target, and always be distinct for unique blocks. 406 We define $w$ as 407 $$ w(b) = \max(𝕀) - h(b) $$ 408 since $h(b) < T(b) < \max(𝕀)$ this function is well defined on the 409 codomain. 410 411 ## Hash Function 412 413 Let $𝕀$ be a fixed subset of $ℕ$ representing the output of a hash 414 function $[0, \max(𝕀)]$. 415 416 **Definition:** a *hash function* is a function $H : ℕ → 𝕀$ having the 417 following properties: 418 419 1. *Uniformity*, for any $y ∈ 𝕀$ and any $n ∈ ℕ$, there exists an $N > n$ 420 such that $H(N) = y$. 421 2. *One-way*, for any $y ∈ 𝕀$, we are unable to construct an $x ∈ ℕ$ 422 such that $H(x) = y$. 423 424 Note: the above notions rely on purely algebraic properties of $H$ 425 without requiring the machinery of probability. The second property of 426 being one-way is a stronger notion than $\ran(H)$ being statistically 427 random. Indeed if the probability is non-zero then we could find such 428 an $(x, y)$ which breaks the one-way property. 429 430 **Theorem:** *given a hash function $H : ℕ → 𝕀$ as defined above, it's 431 impossible to construct two distinct sequences $𝐚 = (a₁, …, aₘ)$ and 432 $𝐛 = (b₁, …, bₙ)$ such that $H(a₁) + ⋯ + H(aₘ) = H(b₁) + ⋯ + H(bₙ)$.* 433 434 By property (2), we cannot find a $H(x) = 0$. 435 Again by (2), we cannot construct an $x$ such that $H(x) + H(a) = H(b)$ 436 for any $a, b ∈ ℕ$. Recursive application of (2) leads us to the stated 437 theorem. 438