# Abzu Architecture Specification

> **Version**: 0.6.1  
> **Status**: Audit Remediation Complete — January 28, 2026  
> **License**: Abzu Community License 1.3

> [!NOTE]
> For the comprehensive technical specification, see [TECHNICAL_SPECIFICATION.md](TECHNICAL_SPECIFICATION.md).  
> For documentation navigation, see [INDEX.md](INDEX.md).

---

## Overview

Abzu is a **sovereign mesh network protocol** built entirely in Rust. It provides censorship-resistant, privacy-preserving peer-to-peer communication layered over conventional internet infrastructure: it uses TCP and WebSocket as transport substrates and is designed to evade surveillance and filtering.

```text
┌─────────────────────────────────────────────────────────────────────┐
│                          Control Plane                              │
│                       JSON-RPC 2.0 Interface                        │
└───────────────────────────────┬─────────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────────┐
│                          abzu-daemon                                 │
│                   CLI • Configuration • RPC Server                   │
└───────────────────────────────┬─────────────────────────────────────┘
                                │
┌───────────────────────────────▼─────────────────────────────────────┐
│                          abzu-sdk                                    │
│              AbzuClient • AccountManager • Identity                  │
└───────────────────────────────┬─────────────────────────────────────┘
                                │
         ┌──────────────────────┼──────────────────────┐
         │                      │                      │
   ┌─────▼─────┐         ┌──────▼─────┐         ┌──────▼──────┐
   │ abzu-core │         │abzu-router │         │abzu-transport│
   │───────────│         │────────────│         │─────────────│
   │ Node      │         │ RoutingTable│        │ AbzuFrame   │
   │ Switchboard│        │ TreeCoords │         │ FakeTLS     │
   │ Store     │         │ Path Build │         │ Cover Traffic│
   └───────────┘         └────────────┘         └─────────────┘
         │                                             │
   ┌─────▼─────┐                                ┌──────▼──────┐
   │ abzu-dht  │                                │abzu-account │
   │───────────│                                │─────────────│
   │ Kademlia  │                                │ Account     │
   │ K-Buckets │                                │ Manager     │
   │ ValueStore│                                │ Permissions │
   └───────────┘                                └─────────────┘
```

---

## Crate Responsibilities

| Crate | Role | Key Modules |
|-------|------|-------------|
| **abzu-sdk** | High-level client API, identity management | `lib.rs`, `identity.rs` |
| **abzu-account** | Multi-tenant account management, permissions | `account.rs`, `manager.rs` |
| **abzu-core** | Node lifecycle, peer management, message routing | `node.rs`, `switchboard.rs`, `store.rs` |
| **abzu-dht** | Kademlia-style DHT, peer discovery | `routing/`, `store/`, `lookup/` |
| **abzu-router** | Pure-logic routing (no I/O), tree coordinates | `table.rs`, `coords.rs`, `path.rs` |
| **abzu-transport** | Wire protocol, encryption, DPI evasion | `wire.rs`, `cover.rs`, `transports/fake_tls.rs` |
| **abzu-daemon** | CLI binary, configuration parsing, RPC server | `main.rs` |
| **abzu-ffi** | Foreign Function Interface for mobile | `lib.rs` |

> [!NOTE]
> The `ghost` feature flag enables cover traffic (enabled by default on desktop, excluded on mobile builds).

---

## Platform Roles

Abzu nodes are differentiated by role, enabling optimized builds for distinct deployment contexts:

| Role | Description | Feature Flag |
|------|-------------|--------------|
| **Edge** | Lightweight mobile/WASM client | `edge` |
| **Desktop** | Full mesh participant (default) | — |
| **Infrastructure** | Always-on home node / relay | `infrastructure` |

### Role Capabilities

```rust
pub enum NodeRole { Edge, Desktop, Infrastructure }
```

| Capability | Edge | Desktop | Infrastructure |
|------------|:----:|:-------:|:--------------:|
| Routing for others | ❌ | ✅ | ✅ |
| DHT participation | ❌ | ✅ | ✅ |
| Cover traffic | ❌ | ✅ | ✅ |
| Background daemon | ❌ | ✅ | ✅ |
| Store-and-forward | ❌ | ❌ | ✅ |
| Bootstrap endpoint | ❌ | ❌ | ✅ |
| Max peers | 8 | 64 | 256 |

**SDK Access**: `abzu_sdk::NodeRole`

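The capability table can be read as three predicates on the role. The sketch below encodes it that way; the method names (`max_peers`, `is_mesh_participant`, `is_always_on`) are hypothetical and chosen for illustration, not taken from `abzu_sdk`.

```rust
/// Node roles, as declared in the specification above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NodeRole { Edge, Desktop, Infrastructure }

impl NodeRole {
    /// Maximum peer count from the "Max peers" row.
    pub fn max_peers(self) -> usize {
        match self {
            NodeRole::Edge => 8,
            NodeRole::Desktop => 64,
            NodeRole::Infrastructure => 256,
        }
    }

    /// Routing for others, DHT participation, cover traffic, and the
    /// background daemon share one pattern: every role except Edge.
    pub fn is_mesh_participant(self) -> bool {
        !matches!(self, NodeRole::Edge)
    }

    /// Store-and-forward and bootstrap duty are Infrastructure-only.
    pub fn is_always_on(self) -> bool {
        matches!(self, NodeRole::Infrastructure)
    }
}
```
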
---

## Core Components

### 1. Node (`abzu-core/src/node.rs`)

The `Node` is the central state container for a mesh participant.

```rust
pub struct Node {
    identity: SigningKey,           // Ed25519 private key
    address: Address,               // Derived IPv6 identity
    router: Arc<RwLock<RoutingTable>>,
    peers: Arc<Mutex<HashMap<PeerKey, PeerConnection>>>,
    store: Db,                      // Sled database
    // ...
}
```

**Key APIs**:

- `Node::new(config)` — Generate fresh identity
- `Node::with_identity(key, config)` — Use existing keypair
- `add_peer()` / `remove_peer()` — Manage connections
- `store_content()` / `fetch_content()` — Content-addressed storage
- Message handling: `store_message()`, `get_messages()`, `update_message_status()`

### 2. Switchboard (`abzu-core/src/switchboard.rs`)

Central frame dispatcher. Routes all incoming `AbzuFrame`s.

```rust
pub enum RouteResult {
    Local,           // Processed locally
    Forwarded(PeerKey),
    NoRoute,
    Error(NodeError),
}
```

**Frame Handlers**:

| Frame Type | Handler | Behavior |
|------------|---------|----------|
| `KeepAlive` | `touch_peer()` | Update activity timestamp |
| `Chunk` | `handle_chunk()` | Verify CID, store, notify waiters |
| `Request` | `handle_request()` | Lookup content, reply or forward |
| `Route` | `handle_route()` | Multi-hop unwrap/forward (onion) |
| `Chat` | `handle_chat()` | Decrypt, store, ACK |
| `Cover` | *silent drop* | Discarded post-decryption (Ghost mode) |

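The handler table maps naturally onto an exhaustive `match`, which makes the compiler flag any frame type added later without a dispatch decision. The following is a minimal sketch of that shape only: `FrameKind` and `handler_for` are hypothetical names, and the real Switchboard calls the handlers rather than returning their names.

```rust
/// Simplified stand-in for `AbzuFrame` variants; payload fields are omitted.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FrameKind { KeepAlive, Chunk, Request, Route, Chat, Cover }

/// Name of the handler from the table above, or `None` for a silent drop.
pub fn handler_for(kind: FrameKind) -> Option<&'static str> {
    match kind {
        FrameKind::KeepAlive => Some("touch_peer"),
        FrameKind::Chunk => Some("handle_chunk"),
        FrameKind::Request => Some("handle_request"),
        FrameKind::Route => Some("handle_route"),
        FrameKind::Chat => Some("handle_chat"),
        // Cover frames are discarded after decryption; no handler runs.
        FrameKind::Cover => None,
    }
}
```
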
### 3. Content Store (`abzu-core/src/store.rs`)

BLAKE3-based content-addressed storage using the Sled embedded database.

```rust
pub fn store(&self, data: &[u8]) -> Result<[u8; 32], StoreError>
pub fn fetch(&self, cid: &[u8; 32]) -> Result<Option<Vec<u8>>, StoreError>
```

**Invariant**: `cid == blake3::hash(data)` — verified on retrieval.

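To make the invariant concrete, here is an in-memory sketch that re-checks the digest on every fetch. It is not the Sled-backed implementation: `MemStore`, `Digest`, `matches_cid`, and `StoreError::CidMismatch` are hypothetical names, and `toy_digest` is a deliberately non-cryptographic stand-in for BLAKE3 so the example needs only the standard library.

```rust
use std::collections::HashMap;

/// Digest function type. The real store uses BLAKE3; any 32-byte digest fits.
pub type Digest = fn(&[u8]) -> [u8; 32];

#[derive(Debug, PartialEq, Eq)]
pub enum StoreError { CidMismatch }

/// The invariant itself: the CID must equal the digest of the data.
pub fn matches_cid(digest: Digest, cid: &[u8; 32], data: &[u8]) -> bool {
    digest(data) == *cid
}

pub struct MemStore { digest: Digest, blobs: HashMap<[u8; 32], Vec<u8>> }

impl MemStore {
    pub fn new(digest: Digest) -> Self {
        Self { digest, blobs: HashMap::new() }
    }

    /// Store data under its own digest and return that digest as the CID.
    pub fn store(&mut self, data: &[u8]) -> [u8; 32] {
        let cid = (self.digest)(data);
        self.blobs.insert(cid, data.to_vec());
        cid
    }

    /// Fetch by CID, re-verifying the digest before returning the data.
    pub fn fetch(&self, cid: &[u8; 32]) -> Result<Option<Vec<u8>>, StoreError> {
        match self.blobs.get(cid) {
            None => Ok(None),
            Some(data) if matches_cid(self.digest, cid, data) => Ok(Some(data.clone())),
            Some(_) => Err(StoreError::CidMismatch),
        }
    }
}

/// Toy digest for demonstration only: NOT cryptographic and NOT BLAKE3.
pub fn toy_digest(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] = out[i % 32].wrapping_mul(31).wrapping_add(*b);
    }
    out
}
```
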
---

## Routing System

### Tree Coordinates (`abzu-router/src/coords.rs`)

Nodes self-organize into a **spanning tree** (inspired by Yggdrasil). Each node has:

- **TreeCoords**: Path from root as sequence of port numbers `[1, 5, 3]`
- **Depth**: Distance from root
- **Lowest Common Ancestor (LCA)**: Used for path computation

### Path Building (`abzu-router/src/path.rs`)

Multi-hop routes are computed via LCA algorithm:

```text
Source: [1,2,3]  →  LCA: [1]  →  Target: [1,5,6]
Path: UP(2) → DOWN(5) → DOWN(6)
```
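
The example above can be reproduced with a longest-common-prefix computation. This sketch reads `UP(n)` as "climb `n` levels to the LCA" and `DOWN(p)` as "descend through port `p`", which is the reading that matches the example; the `Hop` type and both function names are hypothetical, and `u16` ports are an assumption.

```rust
/// One step of a tree route. Names are illustrative.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Hop {
    /// Climb this many levels toward the root.
    Up(usize),
    /// Descend through the given port.
    Down(u16),
}

/// Length of the longest common prefix, which is the depth of the LCA.
pub fn lca_depth(a: &[u16], b: &[u16]) -> usize {
    a.iter().zip(b).take_while(|(x, y)| x == y).count()
}

/// Build the route: climb from `source` to the LCA, then descend to `target`.
pub fn build_path(source: &[u16], target: &[u16]) -> Vec<Hop> {
    let lca = lca_depth(source, target);
    let mut path = Vec::new();
    let ups = source.len() - lca;
    if ups > 0 {
        path.push(Hop::Up(ups));
    }
    path.extend(target[lca..].iter().map(|&port| Hop::Down(port)));
    path
}
```

Calling `build_path(&[1, 2, 3], &[1, 5, 6])` yields `[Up(2), Down(5), Down(6)]`, and a node routing to itself gets an empty path.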

### Routing Table (`abzu-router/src/table.rs`)

```rust
pub struct RoutingTable {
    self_key: PeerKey,
    peers: HashMap<PeerKey, PeerInfo>,
    // ...
}
```

**Routing Strategies**:

1. **Tree Routing**: Follow tree coordinates toward destination
2. **DHT Fallback**: XOR distance metric when tree path unavailable

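For the DHT fallback, the XOR metric treats the bytewise XOR of two keys as a big-endian integer and forwards toward the peer with the smallest result. A minimal sketch, assuming `PeerKey` is a 32-byte array (the source does not show its definition) and using hypothetical function names:

```rust
/// Assumed key representation; the real `PeerKey` type may differ.
pub type PeerKey = [u8; 32];

/// Kademlia-style distance: bytewise XOR, ordered as a big-endian integer.
pub fn xor_distance(a: &PeerKey, b: &PeerKey) -> [u8; 32] {
    let mut d = [0u8; 32];
    for i in 0..32 {
        d[i] = a[i] ^ b[i];
    }
    d
}

/// Pick the known peer closest to `target` under the XOR metric.
/// Arrays compare lexicographically, which matches big-endian ordering.
pub fn closest_peer<'a>(peers: &'a [PeerKey], target: &PeerKey) -> Option<&'a PeerKey> {
    peers.iter().min_by_key(|p| xor_distance(p, target))
}
```
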
---

## Wire Protocol (`abzu-transport/src/wire.rs`)

```rust
#[derive(Serialize, Deserialize)]
pub enum AbzuFrame {
    KeepAlive,
    Chunk { cid: [u8; 32], data: Vec<u8> },
    Route { target: [u8; 32], next_hop: [u8; 32], payload: Vec<u8> },
    Request { cid: [u8; 32], requester: [u8; 32] },
    Chat { id: u64, to: [u8; 32], msg: Vec<u8>, timestamp: u64 },
    ChatAck { id: u64 },
    Cover { noise: Vec<u8> },
    Hello { version_major: u16, version_minor: u16, ephemeral_pub: [u8; 32], timestamp: u64 },
    HelloAck { version_major: u16, version_minor: u16, ephemeral_pub: [u8; 32], confirmation: Vec<u8> },
    // Circle frames: CircleCreate, CircleInvite, CircleMessage, CircleAck, CirclePrune, GossipHave
}
```

**Protocol Version**: 1.0 (negotiated via Hello/HelloAck)

**Encoding**: Postcard (varint-compressed, `no_std` compatible)

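"Varint-compressed" means that multi-byte integers such as `id` and `timestamp` are written in a LEB128-style variable-length form: seven payload bits per byte, with the high bit set on every byte except the last. The sketch below hand-rolls that scheme for `u64` to show why small values cost one byte; it illustrates the encoding idea and is not Postcard's own code.

```rust
/// Append `value` in LEB128 form: low 7 bits first, high bit marks "more".
pub fn encode_varint(mut value: u64, out: &mut Vec<u8>) {
    loop {
        let byte = (value & 0x7F) as u8;
        value >>= 7;
        if value == 0 {
            out.push(byte);
            return;
        }
        out.push(byte | 0x80);
    }
}

/// Decode one varint; returns the value and the number of bytes consumed.
/// Gives up after 10 bytes, the most a `u64` can need.
pub fn decode_varint(input: &[u8]) -> Option<(u64, usize)> {
    let mut value = 0u64;
    for (i, &byte) in input.iter().enumerate().take(10) {
        value |= u64::from(byte & 0x7F) << (7 * i);
        if byte & 0x80 == 0 {
            return Some((value, i + 1));
        }
    }
    None
}
```
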
### Multi-hop Onion Wrapping

```rust
// Wrap payload in nested Route layers
AbzuFrame::wrap_onion(payload, target, &[hop1, hop2, hop3])
```

Each intermediate node sees only `next_hop` and opaque `payload`.

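The nesting can be sketched with a two-variant frame and a toy byte encoding. Everything here is an assumption made for illustration: the `Frame` type, the one-byte tags, the `peel` helper, and the choice to build layers from the last hop outward. The sketch also omits any per-layer encryption, so its payloads are nested but not actually opaque.

```rust
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Frame {
    Data(Vec<u8>),
    Route { target: [u8; 32], next_hop: [u8; 32], payload: Vec<u8> },
}

impl Frame {
    /// Toy encoding: tag byte, then fields; payload runs to the end.
    pub fn encode(&self) -> Vec<u8> {
        match self {
            Frame::Data(bytes) => [&[0u8][..], &bytes[..]].concat(),
            Frame::Route { target, next_hop, payload } => {
                [&[1u8][..], &target[..], &next_hop[..], &payload[..]].concat()
            }
        }
    }

    pub fn decode(bytes: &[u8]) -> Option<Frame> {
        let (&tag, rest) = bytes.split_first()?;
        match tag {
            0 => Some(Frame::Data(rest.to_vec())),
            1 if rest.len() >= 64 => {
                let mut target = [0u8; 32];
                let mut next_hop = [0u8; 32];
                target.copy_from_slice(&rest[..32]);
                next_hop.copy_from_slice(&rest[32..64]);
                Some(Frame::Route { target, next_hop, payload: rest[64..].to_vec() })
            }
            _ => None,
        }
    }

    /// Wrap innermost-first so the outermost layer names the first hop.
    pub fn wrap_onion(payload: Vec<u8>, target: [u8; 32], hops: &[[u8; 32]]) -> Frame {
        let mut frame = Frame::Data(payload);
        for hop in hops.iter().rev() {
            frame = Frame::Route { target, next_hop: *hop, payload: frame.encode() };
        }
        frame
    }
}

/// What a relay does with one layer: read `next_hop`, decode the inner frame.
pub fn peel(frame: &Frame) -> Option<([u8; 32], Frame)> {
    match frame {
        Frame::Route { next_hop, payload, .. } => Some((*next_hop, Frame::decode(payload)?)),
        Frame::Data(_) => None,
    }
}
```
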
---

## Transport Layer

### FakeTLS (`abzu-transport/src/transports/fake_tls.rs`)

Masquerades as TLS 1.3 to evade Deep Packet Inspection:

1. **ClientHello**: Valid TLS 1.3 handshake with randomized session ID, SNI rotation
2. **Encrypted Framing**: All data wrapped in TLS Application Data records (`0x17 0x03 0x03`)
3. **Payload Encryption**: ChaCha20-Poly1305 AEAD

```rust
pub struct FakeTlsStream {
    stream: TcpStream,
    cipher: ChaCha20Poly1305,
    shaping: ShapingConfig,
    // ...
}
```
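
The record framing in step 2 is a five-byte header: content type `0x17` (Application Data), legacy version `0x03 0x03`, and a two-byte big-endian length, followed by the ciphertext. The sketch below shows that framing only; `frame_record` and `parse_record` are hypothetical helpers, not functions from `fake_tls.rs`, and the AEAD step is assumed to have already produced `ciphertext`.

```rust
/// TLS record header for Application Data: type 0x17, legacy version 0x0303.
pub const APP_DATA_HEADER: [u8; 3] = [0x17, 0x03, 0x03];
/// TLS 1.3 caps record ciphertext at 2^14 + 256 bytes (RFC 8446, section 5.2).
pub const MAX_RECORD_LEN: usize = (1 << 14) + 256;

/// Prefix already-encrypted bytes with a TLS Application Data header.
pub fn frame_record(ciphertext: &[u8]) -> Option<Vec<u8>> {
    if ciphertext.len() > MAX_RECORD_LEN {
        return None;
    }
    let mut record = Vec::with_capacity(5 + ciphertext.len());
    record.extend_from_slice(&APP_DATA_HEADER);
    record.extend_from_slice(&(ciphertext.len() as u16).to_be_bytes());
    record.extend_from_slice(ciphertext);
    Some(record)
}

/// Split one record off the front of `bytes`: (record body, remaining bytes).
pub fn parse_record(bytes: &[u8]) -> Option<(&[u8], &[u8])> {
    if bytes.len() < 5 || bytes[..3] != APP_DATA_HEADER {
        return None;
    }
    let len = u16::from_be_bytes([bytes[3], bytes[4]]) as usize;
    if len > MAX_RECORD_LEN || bytes.len() < 5 + len {
        return None;
    }
    Some((&bytes[5..5 + len], &bytes[5 + len..]))
}
```

A receiver would call `parse_record` in a loop over its read buffer, handing each body to the cipher and keeping the remainder for the next read.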

### WebSocket Transport (`abzu-transport/src/transports/websocket.rs`)

Alternative transport for environments where WebSocket is more accessible.

### WebRTC Transport (`abzu-transport/src/transports/webrtc.rs`)

Peer-to-peer transport using WebRTC data channels:

- Browser compatibility via WASM
- ICE gathering with configurable STUN/TURN servers
- Connection timeouts and retry logic
- SDP size limits (8 KB) for DoS protection

---

## Tiered Security Model

| Tier | Name | Features |
|------|------|----------|
| **0** | Off | Raw ChaCha20 encryption only |
| **1** | Blend | FakeTLS ClientHello + TLS record framing |
| **2** | Shadow | + MTU padding (1400 bytes) + timing jitter (50-500 ms) |
| **3** | Ghost | + Adaptive cover traffic with pattern mimicry |

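The two Shadow-tier additions can be sketched as follows. Only the 1400-byte target and the 50-500 ms range come from the table; the two-byte length prefix, the zero fill, and the function names are assumptions made so the padding can be reversed. The caller supplies the random value, since the choice of random source is outside this sketch.

```rust
pub const MTU_PAD: usize = 1400;

/// Pad to exactly 1400 bytes: 2-byte big-endian length, payload, zero fill.
/// Padding is applied before encryption, so the fill is not visible on the wire.
pub fn pad_to_mtu(payload: &[u8]) -> Option<Vec<u8>> {
    if payload.len() > MTU_PAD - 2 {
        return None;
    }
    let mut out = Vec::with_capacity(MTU_PAD);
    out.extend_from_slice(&(payload.len() as u16).to_be_bytes());
    out.extend_from_slice(payload);
    out.resize(MTU_PAD, 0);
    Some(out)
}

/// Recover the original payload from a padded buffer.
pub fn unpad(padded: &[u8]) -> Option<&[u8]> {
    if padded.len() != MTU_PAD {
        return None;
    }
    let len = u16::from_be_bytes([padded[0], padded[1]]) as usize;
    padded.get(2..2 + len)
}

/// Map a raw random value to a send delay in 50..=500 ms.
/// The modulo introduces a negligible bias for 64-bit inputs.
pub fn jitter_ms(random: u64) -> u64 {
    50 + random % 451
}
```
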
### Ghost Mode (`abzu-transport/src/cover.rs`)

**Architecture**:

```text
PatternObserver → PatternModel → CoverGenerator
       ↓               ↓              ↓
  Record sizes    Store stats    Emit Cover frames
```

**Strategy Traits**:

- `TimingStrategy` (PoissonTiming): Exponential inter-arrival times
- `SizeStrategy` (HistogramSize): Sample from observed size distribution

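Both strategies reduce to standard sampling techniques. A Poisson process has exponentially distributed gaps, which inverse-transform sampling produces as `-ln(1 - u) / λ` for a uniform `u`; histogram sampling walks the cumulative counts. The sketch below shows the two computations as free functions with hypothetical names, taking the uniform draw as a parameter so it stays independent of any random-number crate.

```rust
/// Inverse-transform sample of an exponential inter-arrival time in seconds.
/// `u` is a uniform draw in [0, 1); `rate_per_sec` is the Poisson rate.
pub fn exponential_delay_secs(u: f64, rate_per_sec: f64) -> f64 {
    -(1.0 - u).ln() / rate_per_sec
}

/// Sample a bucket index in proportion to its observed count.
/// `u` is a uniform draw in [0, 1). Returns `None` for an empty histogram.
pub fn sample_bucket(counts: &[u64], u: f64) -> Option<usize> {
    let total: u64 = counts.iter().sum();
    if total == 0 {
        return None;
    }
    let threshold = (u * total as f64) as u64;
    let mut cumulative = 0u64;
    for (i, &count) in counts.iter().enumerate() {
        cumulative += count;
        if threshold < cumulative {
            return Some(i);
        }
    }
    // Only reached if `u` was 1.0 or larger.
    None
}
```
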
#### Security Hardening (January 2026 Audit)

| Vulnerability | Attack | Resolution |
|---------------|--------|------------|
| Serialization overhead | Size-based DPI filtering | Overhead compensation in `AbzuFrame::cover()` |
| Echo attack | Unique probe sizes | 16-byte bucket fuzzing in `PatternModel::record_size()` |

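Both resolutions are small arithmetic steps. Bucketing collapses a distinctive probe size into a 16-byte range so the cover generator cannot echo it back exactly, and overhead compensation shrinks the noise so the serialized frame, not the noise alone, hits the intended size. The sketch below is one plausible reading; the function names, the round-down rule, and the in-bucket re-randomization are assumptions rather than the audited code.

```rust
pub const BUCKET: usize = 16;

/// Round an observed size down to its 16-byte bucket, so a unique probe
/// size such as 1337 is recorded as 1328 along with all of 1328..=1343.
pub fn bucket_floor(size: usize) -> usize {
    size / BUCKET * BUCKET
}

/// Re-randomize within the bucket when emitting, using a caller-supplied
/// random value, so emitted sizes do not cluster on bucket boundaries.
pub fn fuzz_within_bucket(bucket_start: usize, random: u64) -> usize {
    bucket_start + (random % BUCKET as u64) as usize
}

/// Noise length that makes the serialized cover frame reach `target_wire_len`.
/// `overhead` is the number of bytes the encoding adds around the noise.
pub fn noise_len(target_wire_len: usize, overhead: usize) -> usize {
    target_wire_len.saturating_sub(overhead)
}
```
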
**Critical Design**: Cover frames are **cryptographically indistinguishable** on the wire. Discrimination happens **post-decryption** only.

### ML Traffic Generation (`abzu-chameleon`)

For adversarial environments with ML-based DPI, Ghost Mode supports neural network-generated traffic:

| Component | Purpose |
|-----------|---------|
| **GRU Model** | 2-layer, 64-unit recurrent network (~54 KB weights) |
| **Histogram Adapter** | Runtime-observed local network statistics |
| **Hybrid Blender** | Combines ML output with histogram, weighted by activity |

**Runtime Fine-Tuning Per Node**:

Each node maintains its own `AdaptiveHistogram` that observes local traffic patterns:

```rust
impl AdaptiveHistogram {
    pub fn observe(&mut self, delay_ms: u64, size: usize) {
        // Update local statistics from peer traffic
        self.timing_bins[delay_bucket(delay_ms)] += 1;
        self.size_bins[size_bucket(size)] += 1;
    }
}
```

The hybrid generator blends pre-trained GRU output with local observations:

- **Active traffic**: Higher GRU weight (learned patterns)
- **Idle periods**: Higher histogram weight (local ISP characteristics)

This allows each node to adapt to its specific network environment while maintaining the temporal coherence that defeats classifier-based detection.

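One simple way to realize activity-weighted blending is a convex combination whose GRU weight grows with recent activity. The source does not say whether the blender interpolates values or chooses between generators, so the sketch below is an assumption throughout: the function names, the linear weight curve, and the 0.2 and 0.8 endpoints are placeholders.

```rust
/// GRU weight as a function of recent activity in [0, 1].
/// The 0.2 and 0.8 endpoints are placeholder values, not from the source.
pub fn gru_weight(activity: f64) -> f64 {
    let a = activity.clamp(0.0, 1.0);
    0.2 + 0.6 * a
}

/// Blend two candidate delays in milliseconds: the GRU output and a
/// sample from the local histogram. Busy links lean on the GRU, idle
/// links lean on the histogram.
pub fn blend_delay_ms(gru_ms: f64, hist_ms: f64, activity: f64) -> f64 {
    let w = gru_weight(activity);
    w * gru_ms + (1.0 - w) * hist_ms
}
```
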
---

## Cryptographic Primitives

| Primitive | Use |
|-----------|-----|
| **Ed25519** | Identity keypairs, signing, address derivation |
| **ChaCha20-Poly1305** | AEAD encryption for all traffic |
| **BLAKE3** | Content addressing, hashing |

### Address Derivation

```rust
// Ed25519 public key → IPv6 in 0200::/7 range
Address::from_public_key(&verifying_key)
```
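
An address is in `0200::/7` when the top seven bits of its first octet are `0000001`, so the first octet is `0x02` or `0x03`. The sketch below checks that property and shows one hypothetical way to force it onto digest bytes. How `Address::from_public_key` actually maps a key to an address is not specified in this document, so `address_from_digest` is purely illustrative.

```rust
use std::net::Ipv6Addr;

/// True if `addr` lies in 0200::/7 (top seven bits are 0000001).
pub fn in_abzu_range(addr: &Ipv6Addr) -> bool {
    (addr.octets()[0] & 0xFE) == 0x02
}

/// Hypothetical derivation: take 16 bytes of a key digest and pin the prefix.
/// This is NOT the documented mapping, only a way to land in the range.
pub fn address_from_digest(digest: &[u8; 32]) -> Ipv6Addr {
    let mut octets = [0u8; 16];
    octets.copy_from_slice(&digest[..16]);
    // Keep the lowest bit of the first octet and fix the other seven.
    octets[0] = 0x02 | (octets[0] & 0x01);
    Ipv6Addr::from(octets)
}
```

The range check is the part that holds regardless of derivation; it can serve as a sanity assertion on any address the node derives or receives.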

---

## Data Flow

### Outbound Message

```text
Application
    │
    ▼
Node.send_chat(to, msg)
    │
    ▼
AbzuFrame::Chat { ... }
    │
    ▼
wrap_onion() (if multi-hop)
    │
    ▼
FakeTlsStream.send()
    │
    ▼
TLS record framing + ChaCha20 encrypt
    │
    ▼
TCP → Network
```

### Inbound Message

```text
TCP ← Network
    │
    ▼
FakeTlsStream.recv()
    │
    ▼
ChaCha20 decrypt + unwrap TLS record
    │
    ▼
AbzuFrame::decode()
    │
    ▼
Switchboard.handle_frame()
    │
    ├─→ Cover → silent drop
    ├─→ Route → unwrap/forward
    ├─→ Chat → decrypt, store, ACK
    └─→ Chunk → verify CID, store
```

---

## Current Capabilities

- ✓ Node lifecycle (create, run, graceful shutdown)
- ✓ Peer connections (FakeTLS, WebSocket)
- ✓ Full wire protocol (all frame types)
- ✓ Content-addressed storage with network discovery
- ✓ Encrypted chat with delivery acknowledgments
- ✓ Tiered security (Blend, Shadow, Ghost modes)
- ✓ Multi-hop routing with onion-wrapped frames
- ✓ Ghost mode security hardening (overhead compensation, echo prevention)
- ✓ **Multi-Tenant Architecture** (account management, permission model)
- ✓ **Kademlia DHT** (peer discovery, value storage)
- ✓ **High-Level SDK** (AbzuClient with async API)
- ✓ **FFI Layer** (mobile bindings ready)

---

## Honest Limitations

| Threat | Reality |
|--------|---------|
| **ISP disconnection** | No overlay helps if the wire is cut |
| **Global traffic analysis** | Timing correlation remains possible |
| **Endpoint compromise** | Malware on device defeats encryption |
| **BGP manipulation** | Abzu runs *over* the internet |

---

## Security Hardening (January 2026)

| Phase | Feature | Status |
|-------|---------|--------|
| P1 | Link-Layer PFS + X25519 Peer Identity | ✅ Complete |
| P2 | Token-based RPC Authentication | ✅ Complete |
| P3 | Argon2id + AES-GCM Identity Encryption | ✅ Complete |
| P4 | Protocol Version Negotiation | ✅ Complete |
| P5 | Content Store GC | ✅ Complete |
| P6 | Bootstrap Node Signatures | ✅ Complete |
| P7 | Connection Rate Limiting | ✅ Complete |
| P8 | DHT Sybil Resistance | ✅ Complete |

---

## Roadmap

**Completed**:

- ✅ Perfect forward secrecy per session (X25519-based)
- ✅ NAT traversal (STUN/TURN with PFS)
- ✅ Group messaging (Circles with epoch-based encryption)
- ✅ Protocol versioning for safe upgrades
- ✅ **Multi-Tenant Architecture** (Phases 1-4 complete)
- ✅ **Account Management API** (create, list, delete, switch accounts)
- ✅ **SDK Integration** (AbzuClient exposes AccountManager)
- ✅ **FFI Error Mapping** (SdkError → AbzuError complete)
- ✅ **Identity Encryption at Rest** (Argon2id + AES-256-GCM)
- ✅ **Compile-time Feature Gating** (Ghost Mode excludable for mobile)
- ✅ **Bootstrap Node Signatures** (Ed25519 challenge-response, TOFU/Pinned modes)
- ✅ **Content Store GC** (TTL + size-based eviction, background task)
- ✅ **Connection Rate Limiting** (Token bucket, per-IP tracking)
- ✅ **DHT Sybil Resistance** (max_peers + per-prefix limits)
- ✅ **Home Priority Model** (Edge nodes delegate storage/mailbox to Infrastructure nodes)
- ✅ **MailboxRecord DHT Validation** (expiry → sequence → signature, anti-replay)
- ✅ **Replication Service** (Sled-backed journal with exponential backoff)

**Near-term**:

- [ ] Audit log for RPC operations
- [ ] Account export/import (portability)

**Medium-term**:

- [ ] Mobile FFI (iOS/Android apps)
- [ ] UDP/QUIC transport

**Horizon**:

- [ ] LoRa transport (off-grid)
- [ ] Mix-network integration
- [ ] Threshold cryptography

---

## Design Principles

1. **Kerckhoffs' Principle** — Security from keys, not secrecy
2. **Pure Logic Routing** — No I/O in routing decisions
3. **Transport Agility** — Same logic over TCP, WebSocket, future transports
4. **Stealth First** — No protocol magic bytes or distinguishing headers
5. **Local First** — Network for discovery/sync, not primary storage

---

> *"The system should not depend on secrecy, and it should be possible for it to fall into enemy hands without inconvenience."*  
> — Auguste Kerckhoffs, 1883