# Abzu DHT Design Specification

> **Component**: `abzu-dht`
> **Role**: Distributed Hash Table & Service Discovery
> **Status**: Design Approved

---

## 1. Overview

The Abzu DHT provides a **sovereign, serverless mechanism** for agents and nodes to:

1. **Publish** their presence and capabilities.
2. **Discover** other agents and services.
3. **Store** small, mutable pointers (Topic -> Peer Data).
4. **Signal** intent (e.g., "I offer LLM inference").

It removes the need for centralized trackers or signaling servers.

---

## 2. Kademlia Foundation

Abzu uses a **Kademlia-based** structure with specific tuning for high-churn P2P environments.

### 2.1 Key Parameters

- **K (Bucket Size)**: 20
- **Alpha (Concurrency)**: 3
- **ID Space**: 256-bit (BLAKE3 hash space)
- **Distance Metric**: XOR

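To make the XOR metric concrete, here is a minimal sketch of the distance computation and the resulting k-bucket index. The names `NodeId`, `xor_distance`, and `bucket_index` are illustrative and not taken from the `abzu-dht` crate; the bucket numbering (0 for the nearest bucket, 255 for the farthest) is likewise an assumption.

```rust
/// 256-bit identifier in the BLAKE3 hash space (illustrative alias).
type NodeId = [u8; 32];

/// Bucket size and lookup concurrency from the parameter list above.
const K: usize = 20;
const ALPHA: usize = 3;

/// XOR distance between two IDs. Comparing the resulting arrays
/// lexicographically is the same as comparing big-endian 256-bit integers.
fn xor_distance(a: &NodeId, b: &NodeId) -> NodeId {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = a[i] ^ b[i];
    }
    out
}

/// Index of the k-bucket that `other` falls into relative to `local`:
/// the position of the highest differing bit, counted from the low end.
/// Returns `None` when both IDs are identical.
fn bucket_index(local: &NodeId, other: &NodeId) -> Option<usize> {
    let d = xor_distance(local, other);
    for (i, byte) in d.iter().enumerate() {
        if *byte != 0 {
            let leading_zero_bits = i * 8 + byte.leading_zeros() as usize;
            return Some(255 - leading_zero_bits);
        }
    }
    None
}
```

With this numbering, a routing table holds up to 256 buckets of at most `K` entries each, and a lookup queries `ALPHA` peers at a time.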
### 2.2 Node ID

A node's DHT ID is the **BLAKE3 hash** of its Ed25519 public key. This binds the routing-layer identity to the storage-layer identity (the key that signs records).

```rust
NodeID = BLAKE3(Ed25519_PublicKey)
```

---

## 3. Value Types

The DHT stores records for three distinct use-cases:

### 3.1 Use-Cases

1. **Peer Discovery**: "Where is Node X?" -> Returns IP/Port.
2. **Topic Subscription**: "Who is interested in 'local-llm'?" -> Returns list of PeerIDs.
3. **Service Advertisement**: "Who provides service 'com.abzu.inference'?" -> Returns Service Descriptors.

### 3.2 Data Structure

Values are signed envelopes:

```rust
struct DhtRecord {
    key: [u8; 32],          // Topic or InfoHash
    value: Vec<u8>,         // Serialized payload
    publisher: [u8; 32],    // Signer's Public Key
    signature: [u8; 64],    // Ed25519 Signature
    nonce: u64,             // Version/Timestamp (monotonic)
    ttl: u32,               // Time-to-live in seconds
}
```

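The monotonic `nonce` implies some rule for when a newer record supersedes an older one. The sketch below shows one possible rule, under assumptions the spec does not state: records are indexed by `(key, publisher)` so that several publishers can coexist under one topic, and a record is accepted only if its nonce is strictly greater than the one already held. `RecordStore` and `upsert` are hypothetical names, and signature checking is deliberately left out.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
struct DhtRecord {
    key: [u8; 32],
    value: Vec<u8>,
    publisher: [u8; 32],
    signature: [u8; 64],
    nonce: u64,
    ttl: u32,
}

/// Hypothetical in-memory store, indexed by (key, publisher) so that
/// several publishers can hold records under the same topic.
#[derive(Default)]
struct RecordStore {
    records: HashMap<([u8; 32], [u8; 32]), DhtRecord>,
}

impl RecordStore {
    /// Store `incoming` unless a record from the same publisher under the
    /// same key already has an equal or higher nonce (assumed rule).
    /// Returns true when the record was stored.
    fn upsert(&mut self, incoming: DhtRecord) -> bool {
        let slot = (incoming.key, incoming.publisher);
        match self.records.get(&slot) {
            // Replays and out-of-date versions are rejected.
            Some(old) if old.nonce >= incoming.nonce => false,
            _ => {
                self.records.insert(slot, incoming);
                true
            }
        }
    }
}
```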
---

## 4. Anti-Sybil & Security

### 4.1 ID Generation Hardening

To prevent Sybil attacks (where a single attacker spins up, say, 1,000,000 nodes), a future version will require **Proof of Work** or **Token Bonding** for writable IDs. For v1, we rely on the cost of Ed25519 key generation and on **IP limiting**.

**Rule**: A single IP address may announce at most 10 distinct NodeIDs per 10 minutes.

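The rule above can be sketched as a small per-IP limiter. This version assumes a sliding 10-minute window (the spec does not say whether the window is sliding or fixed) and takes the current time as a plain seconds counter so it is easy to test. `AnnounceLimiter` and `allow` are hypothetical names.

```rust
use std::collections::HashMap;
use std::net::IpAddr;

type NodeId = [u8; 32];

const MAX_IDS_PER_WINDOW: usize = 10;
const WINDOW_SECS: u64 = 10 * 60;

/// Hypothetical limiter: remembers, per IP, which NodeIDs announced and when.
#[derive(Default)]
struct AnnounceLimiter {
    seen: HashMap<IpAddr, Vec<(u64, NodeId)>>,
}

impl AnnounceLimiter {
    /// Returns true if `ip` may announce `id` at time `now` (in seconds).
    /// An ID already counted inside the window does not count twice.
    fn allow(&mut self, ip: IpAddr, id: NodeId, now: u64) -> bool {
        let entries = self.seen.entry(ip).or_default();
        // Forget announcements that have aged out of the window.
        entries.retain(|(t, _)| now.saturating_sub(*t) < WINDOW_SECS);
        if entries.iter().any(|(_, seen_id)| *seen_id == id) {
            return true;
        }
        if entries.len() >= MAX_IDS_PER_WINDOW {
            return false;
        }
        entries.push((now, id));
        true
    }
}
```

An eleventh distinct NodeID from the same address is refused until the oldest announcements leave the window.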
### 4.2 Signature Verification

Every `put` request carries a signature. Nodes **MUST** verify it with `Ed25519(value + nonce, signature, publisher_key)`, where `value + nonce` is the record's value concatenated with its nonce, before storing or relaying the record.

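As a rough illustration of this check, the sketch below builds the signed message and gates a `put` on the verification result. Two assumptions are made: the nonce is appended as 8 big-endian bytes (the spec fixes no encoding), and the Ed25519 check itself is passed in as a closure so the example does not depend on a particular crypto crate. `signed_message` and `accept_put` are hypothetical helpers.

```rust
/// Bytes covered by the signature: `value` followed by the nonce.
/// Assumption: the nonce is encoded as 8 big-endian bytes.
fn signed_message(value: &[u8], nonce: u64) -> Vec<u8> {
    let mut msg = Vec::with_capacity(value.len() + 8);
    msg.extend_from_slice(value);
    msg.extend_from_slice(&nonce.to_be_bytes());
    msg
}

/// Gate for incoming `put` requests. `verify` stands in for an Ed25519
/// verification routine taking (public key, message, signature).
fn accept_put(
    verify: impl Fn(&[u8; 32], &[u8], &[u8; 64]) -> bool,
    value: &[u8],
    nonce: u64,
    publisher: &[u8; 32],
    signature: &[u8; 64],
) -> Result<(), &'static str> {
    let msg = signed_message(value, nonce);
    if verify(publisher, &msg, signature) {
        Ok(())
    } else {
        // Reject before the record is stored or relayed.
        Err("invalid signature")
    }
}
```

In a real node the closure would wrap the Ed25519 library in use, and a rejected request would go no further.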
### 4.3 Stale Data Pruning

- **TTL**: Records expire after 24 hours.
- **Refresh**: Publishers must republish at least every 20 hours to keep a record alive.

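The two intervals leave a 4-hour margin between the republish deadline and expiry. A minimal sketch of both checks, using plain second counters for timestamps; the function names and the `(stored_at, payload)` representation are illustrative only.

```rust
const RECORD_TTL_SECS: u64 = 24 * 60 * 60;
const REPUBLISH_INTERVAL_SECS: u64 = 20 * 60 * 60;

/// Storage side: a record stored at `stored_at` is stale once the full
/// TTL has elapsed.
fn is_expired(stored_at: u64, now: u64) -> bool {
    now.saturating_sub(stored_at) >= RECORD_TTL_SECS
}

/// Publisher side: is it time to republish?
fn needs_republish(last_published: u64, now: u64) -> bool {
    now.saturating_sub(last_published) >= REPUBLISH_INTERVAL_SECS
}

/// Drop expired entries from a list of (stored_at, payload) pairs.
fn prune(records: &mut Vec<(u64, Vec<u8>)>, now: u64) {
    records.retain(|(stored_at, _)| !is_expired(*stored_at, now));
}
```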
---

## 5. Routing Logic

### 5.1 Lookup (Iterative)

1. Node A calculates the distance `XOR(Target, Self)`.
2. It selects the `alpha` closest nodes from its local routing table.
3. It sends `FIND_NODE` to them.
4. It updates its list of closest nodes based on the responses.
5. It repeats from step 2, drawing on the updated list, until no closer nodes are found.

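The steps above can be sketched against a simulated network, where a `HashMap` from each node to the peers it knows stands in for the `FIND_NODE` RPC. This is a simplified, sequential version: it queries each batch of `ALPHA` peers one after another rather than concurrently, and it stops once every node in the shortlist has been queried, which is one common way to realise "no closer nodes found". All names here are illustrative.

```rust
use std::collections::{HashMap, HashSet};

type NodeId = [u8; 32];
const K: usize = 20;
const ALPHA: usize = 3;

fn distance(a: &NodeId, b: &NodeId) -> NodeId {
    let mut d = [0u8; 32];
    for i in 0..32 {
        d[i] = a[i] ^ b[i];
    }
    d
}

/// Stand-in for the FIND_NODE RPC: `peer` answers with the (up to) K nodes
/// it knows that are closest to `target`.
fn find_node(net: &HashMap<NodeId, Vec<NodeId>>, peer: &NodeId, target: &NodeId) -> Vec<NodeId> {
    let mut known = net.get(peer).cloned().unwrap_or_default();
    known.sort_by_key(|n| distance(n, target));
    known.truncate(K);
    known
}

/// Iterative lookup: returns up to K closest nodes found, nearest first.
fn lookup(net: &HashMap<NodeId, Vec<NodeId>>, start: &NodeId, target: &NodeId) -> Vec<NodeId> {
    // Seed the shortlist from the starting node's own routing table.
    let mut shortlist = find_node(net, start, target);
    let mut queried: HashSet<NodeId> = HashSet::from([*start]);
    loop {
        // Pick the ALPHA closest candidates that have not been asked yet.
        let batch: Vec<NodeId> = shortlist
            .iter()
            .filter(|n| !queried.contains(*n))
            .take(ALPHA)
            .copied()
            .collect();
        if batch.is_empty() {
            return shortlist; // every shortlisted node has been queried
        }
        for peer in batch {
            queried.insert(peer);
            for found in find_node(net, &peer, target) {
                if found != *start && !shortlist.contains(&found) {
                    shortlist.push(found);
                }
            }
        }
        // Keep only the K closest nodes seen so far.
        shortlist.sort_by_key(|n| distance(n, target));
        shortlist.truncate(K);
    }
}
```

Each round queries at least one new node, so the loop ends after at most as many rounds as there are reachable nodes.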
### 5.2 Storage (Put)

1. Perform a Lookup to find the `K` closest nodes to the `Key`.
2. Send `STORE` to those `K` nodes.
3. Wait for `W` (Write Quorum) acknowledgments, where `W <= K`.

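A minimal sketch of the quorum step, assuming the `K` target nodes have already been found by a lookup. The `send_store` closure stands in for the `STORE` RPC, and the sketch contacts targets one by one, whereas a real node would likely send in parallel and apply timeouts. The spec does not fix a value for `W`, so it is a parameter here; `PutOutcome` and `put_with_quorum` are hypothetical names.

```rust
type NodeId = [u8; 32];

/// Outcome of a quorum write, with the number of acknowledgments received.
#[derive(Debug, PartialEq)]
enum PutOutcome {
    Committed { acks: usize },
    QuorumNotReached { acks: usize },
}

/// Send STORE to every target, then report whether at least `w` of them
/// acknowledged. `send_store` returns true when a node acknowledges.
fn put_with_quorum(
    targets: &[NodeId],
    w: usize,
    mut send_store: impl FnMut(&NodeId) -> bool,
) -> PutOutcome {
    let mut acks = 0;
    for node in targets {
        if send_store(node) {
            acks += 1;
        }
    }
    if acks >= w {
        PutOutcome::Committed { acks }
    } else {
        PutOutcome::QuorumNotReached { acks }
    }
}
```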
---

## 6. API Surface

The `abzu-dht` crate exposes:

```rust
pub trait DhtStore {
    /// Announce this node as a peer for `topic`, reachable on `port`.
    async fn announce(&self, topic: [u8; 32], port: u16) -> Result<()>;
    /// Look up the peers that have announced themselves under `topic`.
    async fn find_peers(&self, topic: [u8; 32]) -> Result<Vec<PeerInfo>>;
    /// Store `value` under `key`.
    async fn put(&self, key: [u8; 32], value: Vec<u8>) -> Result<()>;
    /// Fetch the value stored under `key`, if any.
    async fn get(&self, key: [u8; 32]) -> Result<Option<Vec<u8>>>;
}
```

---

## 7. Integration with Marketplace

The DHT is the backbone of the decentralized marketplace (Drops).

1. **Service Listing**: Providers `put` their service info under a known Topic Hash (e.g., `hash("service:llm")`).
2. **Discovery**: Consumers `get` that hash to find providers.
3. **Negotiation**: Consumers connect directly to providers at the IP address and port returned by the DHT.

---

## 8. Future Extensions

- **S/Kademlia**: For stronger Sybil resistance.
- **Signed Routing Table**: To prevent eclipse attacks.
- **Encrypted Values**: For private signaling.