# Abzu DHT Design Specification

> **Version**: January 2026  
> **Status**: Approved  
> **Scope**: New `abzu-dht` crate — Kademlia-based distributed hash table

---

## Executive Summary

This document specifies `abzu-dht`, a Kademlia-inspired distributed hash table for the Abzu mesh network. The DHT provides:

1. **Peer Discovery** — Bootstrap into the network without hardcoded peers
2. **Content Routing** — Find which peers have specific content (provider records)
3. **Root Discovery** — Multi-root spanning tree coordination
4. **Transport Signaling** — Rendezvous for WebRTC and future transports

The design follows Abzu's philosophy: **pure logic, no IO**. The DHT layer returns decisions; the caller handles networking.

---

## Table of Contents

1. [Design Principles](#1-design-principles)
2. [Kademlia Fundamentals](#2-kademlia-fundamentals)
3. [Data Structures](#3-data-structures)
4. [Protocol Messages](#4-protocol-messages)
5. [Lookup Algorithm](#5-lookup-algorithm)
6. [Value Types and TTL](#6-value-types-and-ttl)
7. [Integration Points](#7-integration-points)
8. [Security Considerations](#8-security-considerations)
9. [API Surface](#9-api-surface)
10. [Crate Structure](#10-crate-structure)

---

## 1. Design Principles

### 1.1 Pure Logic

Like `abzu-router`, the DHT crate performs **no IO**:

```rust
// ✅ Pure logic - returns what to do
fn find_closest_nodes(&self, target: &NodeId) -> Vec<NodeInfo>

// ❌ Not in this crate - caller handles networking
async fn send_find_node(peer: &Addr, target: &NodeId) -> Response
```

The `Switchboard` or a dedicated DHT coordinator handles actual network operations.

### 1.2 Integration with Tree Routing

The DHT complements—not replaces—tree routing:

| Scenario | Primary Route | DHT Role |
|----------|---------------|----------|
| Known peer in tree | Tree coordinates (fast) | Not used |
| Unknown peer | N/A | Find peer's address/coords |
| Tree failure | `RouteDecision::DhtFallback` | Provides alternative path |
| Content discovery | N/A | "Who has CID X?" |

### 1.3 Unified Key Space

All DHT operations use the same 256-bit key space:

- **Node IDs**: BLAKE3 hash of Ed25519 public key
- **Content IDs**: BLAKE3 hash of content (already used in `ContentStore`)
- **Mailbox IDs**: BLAKE3 hash of purpose prefix + pubkey

```rust
pub type DhtKey = [u8; 32];

fn node_id(pubkey: &[u8; 32]) -> DhtKey {
    blake3::hash(pubkey).into()
}

fn content_id(data: &[u8]) -> DhtKey {
    blake3::hash(data).into()
}

fn mailbox_id(prefix: &[u8], pubkey: &[u8; 32]) -> DhtKey {
    let mut hasher = blake3::Hasher::new();
    hasher.update(prefix);
    hasher.update(pubkey);
    hasher.finalize().into()
}
```

---

## 2. Kademlia Fundamentals

### 2.1 XOR Distance

Distance between two keys is their bitwise XOR, interpreted as an unsigned integer:

```rust
/// XOR distance between two DHT keys
pub fn distance(a: &DhtKey, b: &DhtKey) -> [u8; 32] {
    let mut result = [0u8; 32];
    for i in 0..32 {
        result[i] = a[i] ^ b[i];
    }
    result
}

/// Compare distances (for sorting)
pub fn closer(target: &DhtKey, a: &DhtKey, b: &DhtKey) -> std::cmp::Ordering {
    let da = distance(target, a);
    let db = distance(target, b);
    da.cmp(&db)
}
```

**Properties:**

- `distance(a, a) = 0`
- `distance(a, b) = distance(b, a)` (symmetric)
- `distance(a, b) <= distance(a, c) + distance(c, b)` (triangle inequality)

### 2.2 Leading Zero Bits (Bucket Index)

The "bucket" for a peer is determined by the number of leading zero bits in the XOR distance:

```rust
/// Get bucket index for a key relative to our node ID
/// Higher index = closer to us (more leading zeros in XOR)
pub fn bucket_index(our_id: &DhtKey, their_id: &DhtKey) -> usize {
    let d = distance(our_id, their_id);
    // leading_zeros returns 256 for identical keys; clamp so the index
    // always stays within buckets[0..256] (we never bucket our own ID)
    (leading_zeros(&d) as usize).min(255)
}

fn leading_zeros(key: &[u8; 32]) -> u32 {
    for (i, &byte) in key.iter().enumerate() {
        if byte != 0 {
            return (i as u32) * 8 + byte.leading_zeros();
        }
    }
    256 // All zeros (same key)
}
```

For 256-bit keys: buckets 0-255, where bucket 255 contains our closest peers.

### 2.3 Why XOR Works

XOR distance creates a binary tree structure:

```text
                    Root (all keys)
                   /              \
            0xxxxxxx          1xxxxxxx
             /    \              /    \
        00xxxxxx  01xxxxxx  10xxxxxx  11xxxxxx
           ...       ...       ...       ...
```

Keys in bucket `i` share an `i`-bit prefix with us. This enables:

- **O(log n) lookups**: Each hop halves the remaining distance
- **Natural load balancing**: Each bucket covers an exponentially smaller key space

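As a quick sanity check, a hypothetical unit test (built on the `bucket_index` helper above; not part of the spec) confirming the prefix/bucket correspondence: flipping bit `i` of our own ID yields a key sharing exactly an `i`-bit prefix with us, which lands in bucket `i`.

```rust
#[test]
fn flipped_bit_lands_in_matching_bucket() {
    let our_id: DhtKey = [0u8; 32];
    for i in 0..256usize {
        // Flip exactly bit `i` (MSB-first) of our own ID.
        let mut other = our_id;
        other[i / 8] ^= 0x80 >> (i % 8);
        // The XOR distance then has exactly `i` leading zeros.
        assert_eq!(bucket_index(&our_id, &other), i);
    }
}
```
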
---

## 3. Data Structures

### 3.1 Node Information

```rust
/// Information about a DHT node
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct NodeInfo {
    /// Node's DHT ID (BLAKE3 of pubkey)
    pub id: DhtKey,
    /// Node's Ed25519 public key
    pub pubkey: [u8; 32],
    /// Last known address (optional - may be behind NAT)
    pub address: Option<String>,
    /// Tree coordinates (if known)
    pub coords: Option<TreeCoords>,
    /// Last seen timestamp (milliseconds)
    pub last_seen: u64,
}
```

### 3.2 K-Bucket

Each bucket stores up to `K` nodes (default K=20):

```rust
/// Maximum nodes per bucket
pub const K: usize = 20;

/// A single k-bucket
#[derive(Debug, Clone, Default)]
pub struct KBucket {
    /// Nodes in this bucket, ordered by last-seen (most recent last)
    nodes: Vec<NodeInfo>,
    /// Replacement cache for when bucket is full
    replacements: Vec<NodeInfo>,
}

impl KBucket {
    /// Try to add a node to this bucket (caller supplies the clock)
    pub fn add(&mut self, mut node: NodeInfo, now_ms: u64) -> AddResult {
        node.last_seen = now_ms;

        // If node exists, move to end (most recently seen)
        if let Some(pos) = self.nodes.iter().position(|n| n.id == node.id) {
            self.nodes.remove(pos);
            self.nodes.push(node);
            return AddResult::Updated;
        }

        // If bucket has room, add directly
        if self.nodes.len() < K {
            self.nodes.push(node);
            return AddResult::Added;
        }

        // Bucket full - add to replacements, signal to ping oldest
        self.replacements.push(node);
        if self.replacements.len() > K {
            self.replacements.remove(0);
        }
        AddResult::PingOldest(self.nodes[0].clone())
    }

    /// Replace oldest node if it failed to respond to ping
    /// (promotes the most recently seen replacement)
    pub fn evict_oldest(&mut self) -> bool {
        if let Some(replacement) = self.replacements.pop() {
            self.nodes.remove(0);
            self.nodes.push(replacement);
            true
        } else {
            false
        }
    }
}

pub enum AddResult {
    Added,
    Updated,
    PingOldest(NodeInfo),
}
```

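A sketch of how the IO layer (hypothetically the `Switchboard`) would drive the ping-before-evict cycle; `ping_with_timeout` is an assumed helper, not part of this crate:

```rust
/// Hypothetical coordinator-side handling of AddResult::PingOldest.
async fn admit_peer(bucket: &mut KBucket, candidate: NodeInfo, now_ms: u64) {
    match bucket.add(candidate, now_ms) {
        AddResult::Added | AddResult::Updated => {}
        AddResult::PingOldest(oldest) => {
            // Kademlia prefers long-lived nodes: evict the oldest entry
            // only if it fails a liveness check.
            if !ping_with_timeout(&oldest).await {
                bucket.evict_oldest(); // promotes a cached replacement
            }
            // On a successful ping the oldest node keeps its slot and the
            // candidate remains in the replacement cache.
        }
    }
}
```
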
### 3.3 Routing Table

```rust
/// DHT routing table containing all k-buckets
#[derive(Debug)]
pub struct RoutingTable {
    /// Our own node ID
    self_id: DhtKey,
    /// Our public key
    self_pubkey: [u8; 32],
    /// K-buckets indexed by distance (0 = furthest, 255 = closest)
    buckets: [KBucket; 256],
}

impl RoutingTable {
    /// Find the K closest nodes to a target
    pub fn find_closest(&self, target: &DhtKey, count: usize) -> Vec<NodeInfo> {
        // Collect all nodes, sort by distance to target
        let mut nodes: Vec<_> = self.buckets
            .iter()
            .flat_map(|b| b.nodes.iter().cloned())
            .collect();

        nodes.sort_by(|a, b| closer(target, &a.id, &b.id));
        nodes.truncate(count);
        nodes
    }

    /// Get a specific node by ID
    pub fn get_node(&self, id: &DhtKey) -> Option<&NodeInfo> {
        let bucket_idx = bucket_index(&self.self_id, id);
        self.buckets[bucket_idx].nodes.iter().find(|n| &n.id == id)
    }
}
```

### 3.4 Value Store

```rust
/// Stored DHT values with TTL
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct StoredValue {
    /// The value data
    pub data: Vec<u8>,
    /// Value type (for routing/validation)
    pub value_type: ValueType,
    /// Publisher's public key (for verification)
    pub publisher: [u8; 32],
    /// Signature over (key || data || expires_at)
    pub signature: [u8; 64],
    /// Expiration timestamp (milliseconds)
    pub expires_at: u64,
    /// When we stored this locally
    pub stored_at: u64,
}

/// Local DHT value store
#[derive(Debug, Default)]
pub struct ValueStore {
    /// Values keyed by DHT key
    values: HashMap<DhtKey, Vec<StoredValue>>,
    /// Maximum values per key (prevent spam)
    max_per_key: usize,
}
```

---

## 4. Protocol Messages

### 4.1 Wire Frames

New `AbzuFrame` variants for DHT operations:

```rust
// Add to AbzuFrame enum in wire.rs
pub enum AbzuFrame {
    // ... existing variants ...

    // DHT Protocol
    DhtPing {
        sender_id: [u8; 32],
        sender_pubkey: [u8; 32],
    },
    DhtPong {
        sender_id: [u8; 32],
        sender_pubkey: [u8; 32],
    },
    DhtFindNode {
        sender_id: [u8; 32],
        target: [u8; 32],
        request_id: u64,
    },
    DhtFindNodeReply {
        request_id: u64,
        nodes: Vec<DhtNodeInfo>,  // Serialized NodeInfo list
    },
    DhtFindValue {
        sender_id: [u8; 32],
        key: [u8; 32],
        request_id: u64,
    },
    DhtFindValueReply {
        request_id: u64,
        // Either nodes (if we don't have value) or value
        result: DhtFindValueResult,
    },
    DhtStore {
        key: [u8; 32],
        value: Vec<u8>,
        value_type: u8,
        publisher: [u8; 32],
        signature: [u8; 64],
        ttl_ms: u64,
    },
    DhtStoreAck {
        key: [u8; 32],
        stored: bool,
    },
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum DhtFindValueResult {
    Nodes(Vec<DhtNodeInfo>),
    Value(StoredValue),
}
```

### 4.2 Request/Response Semantics

| Request | Response | Purpose |
|---------|----------|---------|
| `DhtPing` | `DhtPong` | Liveness check |
| `DhtFindNode` | `DhtFindNodeReply` | Get K closest nodes to target |
| `DhtFindValue` | `DhtFindValueReply` | Get value or K closest nodes |
| `DhtStore` | `DhtStoreAck` | Store value at key |

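The find frames carry a `request_id` so the IO layer can match replies to outstanding queries. A minimal sketch of that bookkeeping (coordinator-side, outside this crate; all names here are hypothetical — pings and store acks carry no `request_id` and would be correlated by peer and key instead):

```rust
use std::collections::HashMap;

/// What an in-flight request was for, so a reply can be routed to the
/// right lookup state machine.
enum PendingKind {
    FindNode { lookup_target: DhtKey },
    FindValue { key: DhtKey },
}

#[derive(Default)]
struct PendingRequests {
    next_id: u64,
    inflight: HashMap<u64, PendingKind>,
}

impl PendingRequests {
    /// Allocate a request_id and remember what it was for.
    fn issue(&mut self, kind: PendingKind) -> u64 {
        self.next_id += 1;
        self.inflight.insert(self.next_id, kind);
        self.next_id
    }

    /// Returns the matching request, or None for unsolicited replies
    /// (which should be dropped).
    fn complete(&mut self, request_id: u64) -> Option<PendingKind> {
        self.inflight.remove(&request_id)
    }
}
```
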
### 4.3 Message Flow Example (FindNode)

```text
Node A wants to find nodes close to Target T

1. A.routing_table.find_closest(T, α) → [B, C, D]  (α = 3 parallel queries)

2. A → B: DhtFindNode { target: T }
   A → C: DhtFindNode { target: T }
   A → D: DhtFindNode { target: T }

3. B → A: DhtFindNodeReply { nodes: [E, F] }
   C → A: DhtFindNodeReply { nodes: [E, G] }
   D → A: DhtFindNodeReply { nodes: [H] }

4. A merges results, queries new closest nodes not yet queried

5. Repeat until no closer nodes found or K nodes responded
```

---

## 5. Lookup Algorithm

### 5.1 Iterative Node Lookup

```rust
use std::collections::{BinaryHeap, HashSet};

/// Lookup state machine (pure logic)
pub struct NodeLookup {
    /// Target we're looking for
    target: DhtKey,
    /// Nodes we've contacted
    queried: HashSet<DhtKey>,
    /// Nodes we've heard about but not contacted
    pending: BinaryHeap<NodeByDistance>,
    /// Responses received
    responses: Vec<NodeInfo>,
    /// Parallel query count (α)
    alpha: usize,
}

/// Heap entry ordered by distance to the lookup target. BinaryHeap is a
/// max-heap, so the ordering is reversed to pop the closest node first.
struct NodeByDistance {
    node: NodeInfo,
    distance: [u8; 32],
}

impl PartialEq for NodeByDistance {
    fn eq(&self, other: &Self) -> bool {
        self.distance == other.distance
    }
}
impl Eq for NodeByDistance {}
impl PartialOrd for NodeByDistance {
    fn partial_cmp(&self, other: &Self) -> Option<std::cmp::Ordering> {
        Some(self.cmp(other))
    }
}
impl Ord for NodeByDistance {
    fn cmp(&self, other: &Self) -> std::cmp::Ordering {
        other.distance.cmp(&self.distance) // reversed: smallest distance on top
    }
}

impl NodeLookup {
    /// Initialize lookup with seed nodes from routing table
    pub fn new(target: DhtKey, seed_nodes: Vec<NodeInfo>, alpha: usize) -> Self {
        let mut pending = BinaryHeap::new();
        for node in seed_nodes {
            let d = distance(&target, &node.id);
            pending.push(NodeByDistance { node, distance: d });
        }
        Self { target, queried: HashSet::new(), pending, responses: Vec::new(), alpha }
    }

    /// Get next nodes to query (up to α)
    pub fn next_queries(&mut self) -> Vec<NodeInfo> {
        let mut queries = Vec::new();
        while queries.len() < self.alpha {
            match self.pending.pop() {
                Some(entry) if !self.queried.contains(&entry.node.id) => {
                    self.queried.insert(entry.node.id);
                    queries.push(entry.node);
                }
                None => break,
                _ => continue,
            }
        }
        queries
    }

    /// Process a response
    pub fn handle_response(&mut self, _from: &DhtKey, nodes: Vec<NodeInfo>) {
        for node in nodes {
            if !self.queried.contains(&node.id) {
                let d = distance(&self.target, &node.id);
                self.pending.push(NodeByDistance { node: node.clone(), distance: d });
            }
            self.responses.push(node);
        }
    }

    /// Check if lookup is complete
    pub fn is_complete(&self) -> bool {
        self.pending.is_empty() || self.responses.len() >= K
    }

    /// Get final K closest nodes
    pub fn result(&self) -> Vec<NodeInfo> {
        let mut result = self.responses.clone();
        result.sort_by(|a, b| closer(&self.target, &a.id, &b.id));
        result.truncate(K);
        result
    }
}
```

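The split between this state machine and the IO layer looks roughly like the following sketch; `send_find_node` (send a `DhtFindNode`, await the reply) is an assumed coordinator helper, and a real coordinator would issue the α queries of each round concurrently rather than sequentially:

```rust
/// Hypothetical coordinator loop driving a NodeLookup to completion.
async fn run_lookup(target: DhtKey, mut lookup: NodeLookup) -> Vec<NodeInfo> {
    while !lookup.is_complete() {
        let batch = lookup.next_queries(); // up to α peers per round
        if batch.is_empty() {
            break; // nothing left to ask
        }
        for peer in batch {
            // All IO happens out here, not in abzu-dht.
            match send_find_node(&peer, &target).await {
                Ok(nodes) => lookup.handle_response(&peer.id, nodes),
                Err(_timeout) => { /* unresponsive peers are simply skipped */ }
            }
        }
    }
    lookup.result()
}
```
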
### 5.2 Value Lookup

Same algorithm, but:

1. If any node returns the value → done
2. Otherwise, continue until the K closest nodes have been queried
3. Return the value if found, else the closest nodes (candidates for a store)

A minimal sketch of this wrapper follows.

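A `ValueLookup` sketch layered over `NodeLookup` (assuming the coordinator converts wire-level `DhtNodeInfo` back into `NodeInfo` before feeding replies in):

```rust
/// Value lookup: a thin wrapper over NodeLookup that stops early
/// as soon as any peer returns the value.
pub struct ValueLookup {
    inner: NodeLookup,
    found: Option<StoredValue>,
}

impl ValueLookup {
    pub fn new(key: DhtKey, seeds: Vec<NodeInfo>, alpha: usize) -> Self {
        Self { inner: NodeLookup::new(key, seeds, alpha), found: None }
    }

    pub fn next_queries(&mut self) -> Vec<NodeInfo> {
        if self.found.is_some() {
            return Vec::new(); // done: stop issuing queries
        }
        self.inner.next_queries()
    }

    /// Feed in a DhtFindValueReply that carried the value.
    pub fn handle_value(&mut self, value: StoredValue) {
        self.found = Some(value);
    }

    /// Feed in a DhtFindValueReply that carried closer nodes.
    pub fn handle_nodes(&mut self, from: &DhtKey, nodes: Vec<NodeInfo>) {
        self.inner.handle_response(from, nodes);
    }

    pub fn is_complete(&self) -> bool {
        self.found.is_some() || self.inner.is_complete()
    }

    /// The value if found, else the closest nodes (where a store would go).
    pub fn result(self) -> Result<StoredValue, Vec<NodeInfo>> {
        match self.found {
            Some(v) => Ok(v),
            None => Err(self.inner.result()),
        }
    }
}
```
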
---

## 6. Value Types and TTL

### 6.1 Value Types

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[repr(u8)]
pub enum ValueType {
    /// Peer presence announcement
    PeerAnnounce = 0,
    /// Content provider record ("I have CID X")
    ContentProvider = 1,
    /// Root announcement (spanning tree root)
    RootAnnounce = 2,
    /// Transport signaling (WebRTC offer, etc.)
    SignalOffer = 3,
    /// Transport signaling response
    SignalAnswer = 4,
    /// Mailbox delegation to Home Node
    MailboxDelegate = 6,
    /// Token balance record
    TokenBalance = 10,
    /// Reputation bond
    TokenBond = 11,
    /// Service marketplace offer
    ServiceOffer = 12,
    /// Generic application data
    AppData = 255,
}
```

### 6.2 TTL Semantics

| Value Type | Default TTL | Max TTL | Republish Interval |
|------------|-------------|---------|--------------------|
| `PeerAnnounce` | 1 hour | 24 hours | 30 minutes |
| `ContentProvider` | 24 hours | 7 days | 12 hours |
| `RootAnnounce` | 10 minutes | 1 hour | 5 minutes |
| `SignalOffer` | 2 minutes | 5 minutes | N/A (one-shot) |
| `SignalAnswer` | 2 minutes | 5 minutes | N/A (one-shot) |
| `MailboxDelegate` | 1 hour | 24 hours | 30 minutes |
| `AppData` | 1 hour | 24 hours | Caller decides |

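A sketch encoding the table's default and maximum TTLs (the token and marketplace types are not listed above, so this falls back to the `AppData` bounds for them; republishing stays with the caller):

```rust
/// (default_ttl_ms, max_ttl_ms) per value type, per the table above.
pub fn ttl_bounds(vt: ValueType) -> (u64, u64) {
    const MIN: u64 = 60 * 1000;
    const HOUR: u64 = 60 * MIN;
    match vt {
        ValueType::PeerAnnounce => (HOUR, 24 * HOUR),
        ValueType::ContentProvider => (24 * HOUR, 7 * 24 * HOUR),
        ValueType::RootAnnounce => (10 * MIN, HOUR),
        ValueType::SignalOffer | ValueType::SignalAnswer => (2 * MIN, 5 * MIN),
        ValueType::MailboxDelegate => (HOUR, 24 * HOUR),
        // Token/marketplace types and AppData: use the generic bound
        _ => (HOUR, 24 * HOUR),
    }
}
```
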
### 6.3 Signature Verification

All stored values must be signed by the publisher:

```rust
fn verify_stored_value(key: &DhtKey, value: &StoredValue) -> bool {
    // Reconstruct signed message
    let mut msg = Vec::new();
    msg.extend_from_slice(key);
    msg.extend_from_slice(&value.data);
    msg.extend_from_slice(&value.expires_at.to_be_bytes());

    // Verify Ed25519 signature; never panic on untrusted input
    let Ok(pubkey) = ed25519_dalek::VerifyingKey::from_bytes(&value.publisher) else {
        return false;
    };
    let signature = ed25519_dalek::Signature::from_bytes(&value.signature);

    pubkey.verify_strict(&msg, &signature).is_ok()
}
```

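For reference, the publisher-side counterpart under the same `(key || data || expires_at)` message layout (a sketch; where the signing key lives is up to the caller):

```rust
use ed25519_dalek::{Signer, SigningKey};

/// Sign a value for DhtStore with the publisher's Ed25519 key.
fn sign_stored_value(
    key: &DhtKey,
    data: &[u8],
    expires_at: u64,
    signing_key: &SigningKey,
) -> [u8; 64] {
    let mut msg = Vec::new();
    msg.extend_from_slice(key);
    msg.extend_from_slice(data);
    msg.extend_from_slice(&expires_at.to_be_bytes());
    signing_key.sign(&msg).to_bytes()
}
```
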
---

## 7. Integration Points

### 7.1 Peer Discovery (Bootstrap)

**Problem**: New node needs to find peers without hardcoded bootstrap servers.

**Solution**: Query DHT for nodes close to our own ID.

```rust
// Bootstrap sequence (the pure part; returns a lookup for the caller to drive)
fn bootstrap(dht: &mut DhtState, seed_peer: &NodeInfo, now_ms: u64) -> NodeLookup {
    // 1. Add seed peer to routing table
    dht.routing_table.add(seed_peer.clone(), now_ms);

    // 2. Find nodes close to ourselves
    // 3. The Switchboard executes the queries; responses populate the table
    NodeLookup::new(dht.self_id, vec![seed_peer.clone()], 3)
}
```

### 7.2 Content Routing (Provider Records)

**Problem**: Node A wants content with CID X. Who has it?

**Solution**: Store provider records at `content_id(X)`.

```rust
// Announce we have content (signed per the §6.3 scheme; the signing
// key is assumed to live on the caller's DhtState)
fn announce_content(dht: &DhtState, cid: &[u8; 32], now_ms: u64) -> DhtStore {
    let ttl_ms = 24 * 60 * 60 * 1000; // 24 hours
    DhtStore {
        key: *cid,
        value: dht.self_pubkey.to_vec(),  // Provider record: our pubkey
        value_type: ValueType::ContentProvider as u8,
        publisher: dht.self_pubkey,
        signature: sign_stored_value(cid, &dht.self_pubkey, now_ms + ttl_ms, &dht.signing_key),
        ttl_ms,
    }
}

// Find content providers: start a value lookup for the CID and let the
// coordinator drive it (see §5.2); replies carry the provider records
fn find_providers(dht: &DhtState, cid: &[u8; 32]) -> ValueLookup {
    let seeds = dht.routing_table.find_closest(cid, K);
    ValueLookup::new(*cid, seeds, 3)
}
```

### 7.3 Root Discovery (Multi-Root Coordination)

**Problem**: Nodes need to discover spanning tree roots.

**Solution**: Roots periodically announce at a well-known key.

```rust
const ROOT_ANNOUNCE_KEY: &[u8] = b"abzu-roots-v1";

fn root_announce_key() -> DhtKey {
    blake3::hash(ROOT_ANNOUNCE_KEY).into()
}

fn announce_as_root(dht: &DhtState, root_info: &RootInfo) -> DhtStore {
    DhtStore {
        key: root_announce_key(),
        value: postcard::to_stdvec(root_info).unwrap(),
        value_type: ValueType::RootAnnounce as u8,
        publisher: dht.self_pubkey,
        signature: sign(...),
        ttl_ms: 10 * 60 * 1000,  // 10 minutes
    }
}
```

### 7.4 Transport Signaling (WebRTC Validation Case)

**Problem**: Node A wants to establish a WebRTC connection to B. They have no existing connection.

**Solution**: A stores an offer at B's mailbox key; B polls the mailbox.

```rust
/// Generate mailbox key for WebRTC offers to a peer
fn webrtc_inbox_key(target_pubkey: &[u8; 32]) -> DhtKey {
    mailbox_id(b"abzu-webrtc-inbox", target_pubkey)
}

/// Store WebRTC offer for target
fn store_webrtc_offer(
    dht: &DhtState,
    target_pubkey: &[u8; 32],
    sdp_offer: &[u8],
) -> DhtStore {
    let key = webrtc_inbox_key(target_pubkey);

    DhtStore {
        key,
        value: sdp_offer.to_vec(),
        value_type: ValueType::SignalOffer as u8,
        publisher: dht.self_pubkey,
        signature: sign(&key, sdp_offer, ttl),
        ttl_ms: 2 * 60 * 1000,  // 2 minutes
    }
}

/// Check for pending WebRTC offers
fn check_webrtc_inbox(dht: &DhtState) -> DhtFindValue {
    DhtFindValue {
        sender_id: dht.self_id,
        key: webrtc_inbox_key(&dht.self_pubkey),
        request_id: generate_request_id(),
    }
}
```

**Flow:**

```text
┌─────────┐                    ┌─────────┐                    ┌─────────┐
│ Node A  │                    │   DHT   │                    │ Node B  │
└────┬────┘                    └────┬────┘                    └────┬────┘
     │                              │                              │
     │ 1. Store offer at B's inbox  │                              │
     │ ────────────────────────────▶│                              │
     │                              │                              │
     │                              │  2. B polls inbox (periodic)  │
     │                              │◀─────────────────────────────│
     │                              │                              │
     │                              │  3. Return A's offer         │
     │                              │─────────────────────────────▶│
     │                              │                              │
     │ 4. B stores answer at A's inbox                             │
     │◀─────────────────────────────────────────────────────────────
     │                              │                              │
     │ 5. A polls, retrieves answer │                              │
     │                              │                              │
     ▼                              ▼                              ▼
   ┌────────────────────────────────────────────────────────────────┐
   │           6. ICE completes, WebRTC DataChannel ready           │
   └────────────────────────────────────────────────────────────────┘
```

**Why This Validates the DHT API:**

1. **Key derivation**: `mailbox_id()` works for any prefix + pubkey
2. **TTL**: Short-lived values (2 min) clean up automatically
3. **Signature verification**: Offers are authenticated
4. **Async pattern**: Sender stores, receiver polls — no direct connection needed

If the DHT can handle WebRTC signaling cleanly, it can handle all other use cases.

---

## 8. Security Considerations

### 8.1 Sybil Attack

**Threat**: Attacker creates many node IDs to dominate k-buckets.

**Multi-Layer Defense (Implemented):**

| Layer | Component | Protection |
|-------|-----------|------------|
| **Age Threshold** | `is_routing_eligible()` | Nodes < 10 min old excluded from routing |
| **Trust Scoring** | `routing_score()` | Prioritizes reliable nodes by age/RTT/failures |
| **Subnet Diversity** | `subnet_key()` | Max 3 nodes per /24 subnet |
| **PoW Threshold** | `SybilDefense` | Optional leading-zero-bit difficulty on IDs |

#### Configuration

```rust
pub struct SybilConfig {
    pub min_routing_age_ms: u64,   // Default: 600_000 (10 min)
    pub max_per_subnet: usize,     // Default: 3
    pub min_id_difficulty: u8,     // Default: 0 (disabled)
}

// Presets
let config = SybilConfig::permissive();  // 5 min, 5/subnet, no PoW
let config = SybilConfig::strict();      // 30 min, 2/subnet, 4-bit PoW
```

#### Filtering Method

```rust
impl RoutingTable {
    /// Sybil-aware closest nodes query (outline; the full logic lives
    /// in the routing module)
    pub fn closest_eligible(
        &self,
        target: &DhtKey,
        count: usize,
        now: u64,
        min_age_ms: u64,
        max_per_subnet: usize,
    ) -> ClosestNodes {
        // 1. Get expanded candidate set
        // 2. Filter by age threshold
        // 3. Sort by routing_score (trust-based)
        // 4. Apply subnet diversity limit
        // 5. Return top N
        todo!()
    }
}
```

#### Trust Score Formula

| Factor | Score |
|--------|-------|
| State: Good | +1000 |
| State: Questionable | +500 |
| State: Bad | +100 |
| Age bonus | +1 per min (max 1440) |
| RTT < 100ms | +50 |
| RTT < 300ms | +25 |
| Per failure | -50 |

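A sketch of `routing_score()` implementing the table; the `NodeState` enum and the per-node age/RTT/failure inputs are assumptions about what the tracking layer records:

```rust
/// Node liveness classification, assumed to be tracked elsewhere.
pub enum NodeState {
    Good,
    Questionable,
    Bad,
}

/// Trust score per the table above; higher is preferred for routing.
pub fn routing_score(state: NodeState, age_ms: u64, rtt_ms: Option<u64>, failures: u32) -> i64 {
    let mut score: i64 = match state {
        NodeState::Good => 1000,
        NodeState::Questionable => 500,
        NodeState::Bad => 100,
    };
    score += (age_ms / 60_000).min(1440) as i64; // +1 per minute, capped at one day
    score += match rtt_ms {
        Some(rtt) if rtt < 100 => 50,
        Some(rtt) if rtt < 300 => 25,
        _ => 0,
    };
    score -= 50 * failures as i64;
    score
}
```
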
### 8.2 Eclipse Attack

**Threat**: Attacker surrounds target node with malicious peers.

**Mitigations:**

1. **Bucket diversity** — ensure nodes come from different network prefixes
2. **Seed diversity** — bootstrap from multiple independent sources
3. **Refresh buckets** — periodically look up random IDs in sparse buckets (see the sketch below)

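A sketch of the refresh helper behind mitigation 3, matching the `random_id_in_bucket(index)` listed in §9.2 (the `rand` crate usage here is an assumption):

```rust
use rand::Rng;

/// Generate a random key that falls into bucket `index` relative to
/// `self_id`: share exactly `index` prefix bits, then randomize the
/// rest. Looking this key up repopulates that bucket.
pub fn random_id_in_bucket(self_id: &DhtKey, index: usize, rng: &mut impl Rng) -> DhtKey {
    let mut id = *self_id;
    // Flip bit `index` so the shared prefix is exactly `index` bits long...
    id[index / 8] ^= 0x80 >> (index % 8);
    // ...then randomize every bit after it.
    for bit in (index + 1)..256 {
        if rng.gen::<bool>() {
            id[bit / 8] ^= 0x80 >> (bit % 8);
        }
    }
    id
}
```
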
### 8.3 Value Spam

**Threat**: Attacker stores garbage values to consume storage.

**Mitigations:**

1. **Signature verification** — all values must be signed
2. **Per-key limits** — max N values per DHT key
3. **Storage quotas** — max total storage per node
4. **Value type validation** — reject malformed values

```rust
impl ValueStore {
    /// Caller passes the clock (no IO in this crate)
    fn store(&mut self, key: &DhtKey, value: StoredValue, now_ms: u64) -> Result<(), StoreError> {
        // Verify signature
        if !verify_stored_value(key, &value) {
            return Err(StoreError::InvalidSignature);
        }

        // Check TTL bounds
        if value.expires_at > now_ms + MAX_TTL_MS {
            return Err(StoreError::TtlTooLong);
        }

        // Check per-key limit
        let values = self.values.entry(*key).or_default();
        if values.len() >= self.max_per_key {
            // Evict oldest
            values.sort_by_key(|v| v.stored_at);
            values.remove(0);
        }

        values.push(value);
        Ok(())
    }
}
```

### 8.4 MailboxRecord Validation (Implemented)

`MailboxDelegate` values require additional validation to prevent mailbox hijacking:

**Validation Order (cheap-to-expensive):**

```rust
fn validate_mailbox_record(&self, value: &StoredValue, existing: Option<&StoredValue>, now: u64)
    -> Result<(), StoreError>
{
    // Deserialize inner record from the stored data
    let record: MailboxRecord = bincode::deserialize(&value.data)?;

    // 1. Expiry check (cheapest)
    if record.is_expired(now) {
        return Err(StoreError::Expired);
    }

    // 2. Sequence check (anti-replay)
    if let Some(existing) = existing {
        let existing_record: MailboxRecord = bincode::deserialize(&existing.data)?;
        if record.sequence <= existing_record.sequence {
            return Err(StoreError::StaleSequence {
                new: record.sequence,
                existing: existing_record.sequence
            });
        }
    }

    // 3. Signature verification (expensive - last)
    let verifying_key = VerifyingKey::from_bytes(&value.publisher)?;
    let signature = Signature::from_bytes(&record.signature);
    verifying_key.verify(&record.signable_bytes(), &signature)?;

    Ok(())
}
```

**MailboxRecord Structure:**

```rust
pub struct MailboxRecord {
    pub delegate_peer_id: [u8; 32],   // Home Node identity
    pub delegate_addrs: Vec<String>,  // Reachable addresses
    pub not_valid_after: u64,         // Internal expiry
    pub sequence: u64,                // Monotonic counter
    pub signature: [u8; 64],          // Ed25519 over above fields
}
```

**Security Guarantees:**

| Attack | Mitigation |
|--------|------------|
| Stale replay | `sequence` must be > existing |
| Mailbox hijack | Signature must match publisher (owner) |
| Expired records | `not_valid_after` checked first |

### 8.5 Routing Table Poisoning

**Threat**: Malicious nodes return fake node lists.

**Mitigations:**

1. **Verify node IDs** — pubkey must hash to claimed ID (see the check below)
2. **Ping before trust** — don't route through unverified nodes
3. **Prefer responsive nodes** — evict nodes that don't respond

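Mitigation 1 is a one-line check against the `node_id()` derivation from §1.3; something like:

```rust
/// A node's claimed ID must be the BLAKE3 hash of its public key.
/// Run this on every NodeInfo from a reply before it can enter the
/// routing table or a lookup's pending set.
fn verify_node_id(info: &NodeInfo) -> bool {
    node_id(&info.pubkey) == info.id
}
```
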
---

## 9. API Surface

### 9.1 Core Types

```rust
// Re-exports from abzu-dht
pub use routing::{RoutingTable, KBucket, NodeInfo, DhtKey};
pub use store::{ValueStore, StoredValue, ValueType};
pub use lookup::{NodeLookup, ValueLookup};
pub use protocol::{DhtRequest, DhtResponse};
```

### 9.2 RoutingTable API

| Method | Description |
|--------|-------------|
| `new(self_id, self_pubkey)` | Create routing table |
| `add(node, now_ms)` | Add node, returns `AddResult` |
| `remove(id)` | Remove node |
| `get(id)` | Get node by ID |
| `find_closest(target, k)` | Find K closest to target |
| `random_id_in_bucket(index)` | For bucket refresh |
| `stats()` | Bucket fill statistics |

### 9.3 ValueStore API

| Method | Description |
|--------|-------------|
| `new(config)` | Create store with limits |
| `store(key, value, now_ms)` | Store value (validates) |
| `get(key)` | Get values for key |
| `remove_expired(now)` | Prune expired values |
| `stats()` | Storage statistics |

### 9.4 Lookup State Machines

| Type | Purpose |
|------|---------|
| `NodeLookup` | Find K closest nodes |
| `ValueLookup` | Find value or K closest |
| `StoreLookup` | Store value at K closest |

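`StoreLookup` is not sketched elsewhere in this document; a minimal shape for it, assuming it reuses `NodeLookup` for the find phase, might be:

```rust
/// Two-phase store: find the K closest nodes to the key, then the
/// coordinator sends DhtStore to each and collects DhtStoreAck.
pub struct StoreLookup {
    find: NodeLookup,
    value: StoredValue,
}

impl StoreLookup {
    /// Once the find phase completes, the nodes that should receive
    /// DhtStore frames; None while the find phase is still running.
    pub fn store_targets(&self) -> Option<Vec<NodeInfo>> {
        self.find.is_complete().then(|| self.find.result())
    }

    /// The value to be written at each target.
    pub fn value(&self) -> &StoredValue {
        &self.value
    }
}
```
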
---

## 10. Crate Structure

```text
abzu-dht/
├── Cargo.toml
└── src/
    ├── lib.rs              # Public API, re-exports
    ├── key.rs              # DhtKey, distance, bucket_index
    ├── routing/
    │   ├── mod.rs
    │   ├── bucket.rs       # KBucket implementation
    │   └── table.rs        # RoutingTable
    ├── store/
    │   ├── mod.rs
    │   ├── value.rs        # StoredValue, ValueType
    │   └── store.rs        # ValueStore
    ├── lookup/
    │   ├── mod.rs
    │   ├── node.rs         # NodeLookup state machine
    │   └── value.rs        # ValueLookup state machine
    ├── protocol.rs         # Message types (for wire.rs integration)
    └── security.rs         # Signature verification, spam protection
```

---

## Appendix A: Integration Checklist

- [ ] Add `abzu-dht` crate to workspace
- [ ] Add DHT frames to `AbzuFrame` in `wire.rs`
- [ ] Add DHT handlers to `Switchboard`
- [ ] Implement `DhtCoordinator` for async operations
- [ ] Connect to `RoutingTable.route_by_key_distance()` fallback
- [ ] Add DHT bootstrap to `abzu-core/src/bootstrap.rs`
- [ ] Add content provider records to `ContentStore` operations
- [ ] Integrate with `MultiRootCoordinator` for root discovery

---

## Appendix B: Open Questions

1. **Proof-of-Work**: Should node IDs require PoW? Trade-off: Sybil resistance vs. onboarding friction.

2. **Bucket Refresh Rate**: How often should sparse buckets be refreshed? More frequent means more traffic; less frequent means staler routing.

3. **Replication Factor**: Store values at K nodes, or at a configurable R < K?

4. **NAT and DHT**: If a node is behind NAT, how do we store its address? Rely on tree coords + relay?

---

> *"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point."* — Claude Shannon