state-machine-dsl.md
1 # Cornell State Machine DSL 2 3 Cornell provides a verified state machine DSL for modeling protocol handshakes with machine-checked properties. The key insight: **pattern matching on (state, event) pairs makes invalid transitions compile errors**. 4 5 ## Design Philosophy 6 7 Traditional state machines are error-prone: 8 - Forgotten transitions lead to runtime crashes 9 - Invalid state/event combinations silently do nothing 10 - No guarantee transitions are deterministic 11 - Testing can't cover all paths 12 13 Cornell fixes this by making state machines **total functions** that the type checker verifies. 14 15 ## Core Types 16 17 ```lean 18 /-- A transition produces a new state and a list of actions -/ 19 structure Transition (S A : Type) where 20 next : S 21 actions : List A 22 23 /-- A state machine definition -/ 24 structure Machine (S E A : Type) where 25 initial : S 26 transition : S → E → Transition S A 27 isTerminal : S → Bool 28 ``` 29 30 The `transition` function must handle **every** (state, event) pair. Lean enforces this at compile time. 31 32 ## Spelling a State Machine 33 34 ### 1. Define States (with context) 35 36 States carry the data they need. No separate "context" object - the state IS the context: 37 38 ```lean 39 inductive ServerState where 40 | init (config : HandshakeConfig) 41 | versioned (config : HandshakeConfig) (negotiated : ProtocolVersion) 42 | features (config : HandshakeConfig) (negotiated : ProtocolVersion) (active : List Feature) 43 | upgrading (config : HandshakeConfig) (negotiated : ProtocolVersion) (active : List Feature) 44 | nixReady (version : ProtocolVersion) 45 | reapiReady (config : ReapiConfig) 46 | failed (reason : String) 47 ``` 48 49 Notice: 50 - `init` only has config (nothing negotiated yet) 51 - `versioned` adds the negotiated version 52 - `upgrading` adds active features 53 - Terminal states (`nixReady`, `reapiReady`, `failed`) only carry final result 54 55 ### 2. Define Events (inputs) 56 57 Events are what the external world sends: 58 59 ```lean 60 inductive ServerEvent where 61 | clientHello (clientVersion : ProtocolVersion) 62 | clientLegacy 63 | clientFeatures (features : List Feature) 64 | clientUpgradeResponse (accept : Bool) 65 ``` 66 67 Events carry their payload. `clientHello` has a version, `clientUpgradeResponse` has a boolean. 68 69 ### 3. Define Actions (outputs) 70 71 Actions are what we emit in response: 72 73 ```lean 74 inductive ServerAction where 75 | sendServerHello (version : ProtocolVersion) 76 | sendDaemonVersion (version : String) 77 | sendTrustLevel (level : TrustLevel) 78 | sendFeatures (features : List Feature) 79 | sendUpgradeOffer 80 | sendReapiConfig (config : ReapiConfig) 81 | ready 82 | fail (reason : String) 83 ``` 84 85 ### 4. Define Transitions (the logic) 86 87 Pattern match on (state, event) pairs: 88 89 ```lean 90 def serverTransition : ServerState → ServerEvent → Transition ServerState ServerAction 91 92 -- INIT + clientHello → VERSIONED 93 | .init config, .clientHello clientVer => 94 let negotiated := min clientVer config.serverVersion 95 { next := .versioned config negotiated 96 , actions := [.sendServerHello config.serverVersion] } 97 98 -- VERSIONED + clientFeatures → UPGRADING or NIX_READY 99 | .versioned config negotiated, .clientFeatures clientFeatures => 100 let active := featureIntersection config.serverFeatures clientFeatures 101 if shouldOfferUpgrade config active then 102 { next := .upgrading config negotiated active 103 , actions := [.sendFeatures config.serverFeatures, .sendUpgradeOffer] } 104 else 105 { next := .nixReady negotiated 106 , actions := [.sendFeatures config.serverFeatures, .ready] } 107 108 -- Terminal states: absorb all events 109 | .nixReady _, _ => { next := .failed "Already terminal", actions := [] } 110 | .reapiReady _, _ => { next := .failed "Already terminal", actions := [] } 111 | .failed r, _ => { next := .failed r, actions := [] } 112 113 -- Catch-all: invalid transition 114 | _, _ => { next := .failed "Invalid transition", actions := [.fail "Invalid"] } 115 ``` 116 117 ## Properties We Get For Free 118 119 ### Determinism 120 121 The transition function IS deterministic by construction - it's a function, not a relation: 122 123 ```lean 124 theorem server_deterministic (s : ServerState) (e : ServerEvent) : 125 ∃! t, serverTransition s e = t := by 126 exact ⟨serverTransition s e, rfl, fun _ h => h.symm⟩ 127 ``` 128 129 ### Totality 130 131 Every (state, event) pair has a handler. Lean's exhaustiveness checker enforces this. If you remove the catch-all `| _, _ =>` case, Lean shows you exactly which cases are missing. 132 133 ### Terminal State Absorption 134 135 Terminal states don't transition to non-terminal states: 136 137 ```lean 138 theorem server_terminal_absorbing (v : ProtocolVersion) (e : ServerEvent) : 139 (serverTransition (.nixReady v) e).next.isTerminal = true 140 ``` 141 142 ## Visual Representation 143 144 ``` 145 clientHello 146 ┌─────────────────────────────────┐ 147 │ ▼ 148 ┌──────┐ ┌────────────┐ 149 │ INIT │ │ VERSIONED │ 150 └──────┘ └─────┬──────┘ 151 │ 152 ┌────────────────┴────────────────┐ 153 │ clientLegacy clientFeatures 154 │ (v < 1.38) (v >= 1.38) 155 ▼ ▼ 156 ┌───────────┐ ┌─────────────┐ 157 │ NIX_READY │◄───────────────────│ UPGRADING │ 158 └───────────┘ reject └──────┬──────┘ 159 ▲ │ accept 160 │ ▼ 161 │ ┌─────────────┐ 162 └──────────────────────────│ REAPI_READY │ 163 (no REAPI config) └─────────────┘ 164 ``` 165 166 ## C++ Code Generation 167 168 The DSL generates `std::variant`-based state machines: 169 170 ```cpp 171 // Generated states 172 using ServerHandshake = std::variant< 173 SrvInit, 174 SrvVersioned, 175 SrvFeatures, 176 SrvUpgrading, 177 SrvNixReady, 178 SrvReapiReady, 179 SrvFailed 180 >; 181 182 // Generated transition function 183 std::pair<ServerHandshake, std::vector<ServerAction>> 184 server_step(const ServerHandshake& state, const ServerEvent& event) { 185 186 if (auto* s = std::get_if<SrvInit>(&state)) { 187 if (auto* e = std::get_if<SrvEvClientHello>(&event)) { 188 ProtocolVersion negotiated = std::min(client_ver, s->config.server_version); 189 return { 190 SrvVersioned{s->config, negotiated}, 191 {SrvActSendServerHello{s->config.server_version}} 192 }; 193 } 194 } 195 // ... other cases ... 196 } 197 ``` 198 199 The generated C++ mirrors the Lean structure exactly, providing the same safety guarantees at the type level. 200 201 ## Running the Machine 202 203 ```lean 204 /-- Step a machine: apply event to current state -/ 205 def Machine.step (m : Machine S E A) (s : S) (e : E) : S × List A := 206 let t := m.transition s e 207 (t.next, t.actions) 208 209 /-- Run a sequence of events -/ 210 def Machine.run (m : Machine S E A) (events : List E) : S × List A := 211 events.foldl (fun (s, actions) e => 212 let (s', newActions) := m.step s e 213 (s', actions ++ newActions) 214 ) (m.initial, []) 215 ``` 216 217 Example trace: 218 219 ```lean 220 def exampleReapiHandshake : List ServerEvent := 221 [ .clientHello ProtocolVersion.current 222 , .clientFeatures [.reapiV2, .casSha256] 223 , .clientUpgradeResponse true 224 ] 225 226 #eval 227 let m := serverHandshake HandshakeConfig.default 228 m.run exampleReapiHandshake 229 -- (.reapiReady {instanceName := "main", digestFunction := 0}, 230 -- [.sendServerHello ..., .sendFeatures ..., .sendUpgradeOffer, .sendReapiConfig ..., .ready]) 231 ``` 232 233 ## Combinators 234 235 Cornell provides combinators for composing state machines: 236 237 ### Product 238 239 Run two machines in parallel on their respective events: 240 241 ```lean 242 def Machine.product (m1 : Machine S1 E1 A1) (m2 : Machine S2 E2 A2) 243 : Machine (S1 × S2) (Either E1 E2) (Either A1 A2) 244 ``` 245 246 ### Sum 247 248 Choose between two machines based on initial event: 249 250 ```lean 251 def Machine.sum (m1 : Machine S1 E1 A1) (m2 : Machine S2 E2 A2) 252 : Machine (SumState S1 S2) (Either E1 E2) (Either A1 A2) 253 ``` 254 255 ### Sequential 256 257 Run m1 until terminal, then switch to m2 via handoff event: 258 259 ```lean 260 def Machine.sequential (m1 : Machine S1 E1 A1) (m2 : Machine S2 E2 A2) 261 : Machine (SeqState S1 S2) (SeqEvent E1 E2) (Either A1 A2) 262 ``` 263 264 Example: Compose handshake and operations: 265 266 ```lean 267 def daemonMachine (config : HandshakeConfig) 268 : Machine (SeqState ServerState DaemonOpState) DaemonEvent DaemonAction := 269 (serverHandshake config).sequential (daemonOps config.serverVersion) 270 ``` 271 272 ### Other Combinators 273 274 - `mapActions` - Transform actions without changing state 275 - `filterActions` - Only emit actions satisfying a predicate 276 - `extendState` - Add metadata that doesn't affect transitions 277 - `withState` - Modify extended state based on transitions 278 - `lift` - Transform state/events/actions through functions 279 280 ## Daemon Operation State Machine 281 282 After handshake completes, the daemon processes operations: 283 284 ``` 285 AWAITING_OP → PROCESSING → SENDING_STDERR → SENDING_RESULT → OP_COMPLETE 286 ↑ │ 287 └────────────────────────────────────────────────────────────┘ 288 ``` 289 290 States carry the protocol version and current operation: 291 292 ```lean 293 inductive DaemonOpState where 294 | awaitingOp (version : ProtocolVersion) 295 | processing (version : ProtocolVersion) (op : WorkerOp) 296 | sendingStderr (version : ProtocolVersion) (op : WorkerOp) 297 | sendingResult (version : ProtocolVersion) (op : WorkerOp) 298 | opComplete (version : ProtocolVersion) 299 | opFailed (version : ProtocolVersion) (reason : String) 300 ``` 301 302 ## Stderr Framing State Machine 303 304 Within an operation, stderr messages follow this protocol: 305 306 ```lean 307 inductive StderrState where 308 | idle 309 | streaming (pending : Nat) 310 | waitingInput 311 | complete 312 | errored (reason : String) 313 ``` 314 315 Message types: `STDERR_NEXT`, `STDERR_READ`, `STDERR_WRITE`, `STDERR_LAST`, `STDERR_ERROR`, plus activity messages (≥1.20). 316 317 ## Why This Design? 318 319 1. **States are tagged unions** - Each state variant carries exactly the data it needs 320 2. **Transitions are total functions** - Compiler catches missing cases 321 3. **Actions are first-class** - Easy to serialize/test/mock 322 4. **Proofs are cheap** - Determinism is trivial, other properties follow 323 5. **C++ extraction preserves structure** - Same `std::variant` pattern 324 6. **Composable via combinators** - Build complex protocols from simple pieces 325 326 This is how you build protocols that don't have weird edge cases or forgotten transitions. If it compiles, the state machine is well-formed.