# Oxyjen v0.2 Documentation

Oxyjen v0.2 introduces the **first stable execution layer** of the framework.

The focus of this release is correctness, determinism, and clean boundaries:
- Memory-aware graph nodes
- Deterministic retry & fallback execution
- Clear separation between policy and enforcement
- A minimal but stable public LLM API

This document describes **what v0.2 adds**, how the pieces fit together, and how to use them.

---

## Table of Contents
1. [What's New](#whats-new-in-v02)
2. [What's NOT Included](#explicitly-not-included)
3. [Core Concepts](#memory--nodecontext-core-concept)
4. [Public LLM API](#public-llm-api)
5. [LLMNode](#llmnode-graph-primitive)
6. [LLMChain](#llmchain-retry--fallback-execution)
7. [Exception Model](#exception-model-execution-semantics)
8. [OpenAI Transport](#transportopenai)
9. [Stability Guarantees](#stability-guarantees)
10. [Quick Start Example](#quick-start-example)
11. [What's Next?](#whats-next)
12. [Feedback](#feedback)

---

## What's New in v0.2

### Core additions
- **`Memory`** abstraction with ordered history
- **`NodeContext`** for stateful graph execution
- **`LLMNode`** as the primary graph primitive
- **`LLMChain`** for retry + fallback execution
- **Public `LLM` API** (`of`, `profile`, `chain`)
- **Explicit exception taxonomy** for deterministic retry behavior
- **OpenAI transport layer** (`transport/openai`)

---

## Explicitly NOT Included

The following features are **intentionally deferred** to v0.3+:

### Timeout Enforcement
- Timeout is **policy only** in v0.2
- `LLMChain.builder().timeout(Duration.ofSeconds(5))` sets intent but **does not enforce**
- No timeout exceptions thrown
- No retries triggered by timeout
- **Enforcement planned for v0.3**

### Model Registry
- No centralized model registry in v0.2
- Beyond the minimal internal name check in `LLM.of`, model validation happens at runtime via the OpenAI API
- Models that pass the name check can therefore still fail on the first API call
- **Planned for v0.3**: `Models.register()` and pre-validation

### Conversation Replay
- Memory stores full conversation history
- **BUT**: LLMs are stateless and don't automatically use history
- Each `chat()` call is independent
- Users must manually build conversation context (see [Memory Limitations](#memory-limitations-v02))
- **Planned for v0.3**: Automatic conversation replay via `ChatMemory`

### Additional Features (v0.3+)
- DAG execution
- Concurrency
- Streaming
- Automatic OpenAI smoke tests (manual only)
- Token counting
- Conversation summarization

---

## Memory & NodeContext (Core Concept)

### Memory

`Memory` is a **state container** with two responsibilities:
1. **Key–value storage** for arbitrary data
2. **Ordered history** of events/messages

```java
Memory memory = new InMemoryMemory("chat");

// Key-value storage
memory.put("count", 42);
int count = memory.get("count", Integer.class);

// Ordered history
memory.append("user", "hello");
memory.append("assistant", "hi");

List<MemoryEntry> history = memory.entries();
```

**Key properties:**
- Thread-safe
- Ordered history (insertion order preserved)
- Type-safe retrieval
- History is immutable from the outside
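
Retrieval is keyed by both name and type, so richer values can be stored as well. A minimal sketch (the `UserProfile` record is illustrative only, not part of the Oxyjen API):

```java
// Hypothetical value type, used only to illustrate typed retrieval (Java 16+ record syntax).
record UserProfile(String name, int age) {}

Memory memory = new InMemoryMemory("profiles");
memory.put("current-user", new UserProfile("Alice", 30));

// get(key, type) returns the stored value as the requested type.
UserProfile profile = memory.get("current-user", UserProfile.class);
```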

---

### NodeContext

`NodeContext` owns memory across executions.

```java
NodeContext ctx = new NodeContext();

Memory chat = ctx.memory("chat");
Memory system = ctx.memory("system");
```

**Important guarantees:**
- Same memory name → same instance
- Different names → isolated memory
- Memory survives across multiple node executions
- State lives in the context, **not inside nodes**
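
A short sketch of what these guarantees mean in practice:

```java
NodeContext ctx = new NodeContext();

Memory a = ctx.memory("chat");
Memory b = ctx.memory("chat");   // same name -> the exact same instance as `a`
Memory c = ctx.memory("system"); // different name -> isolated memory

a.append("user", "hello");

System.out.println(a == b);                 // true: nodes sharing "chat" share state
System.out.println(c.entries().isEmpty());  // true: "system" is untouched
```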

---

### Memory Limitations (v0.2)

**Critical Understanding:**

**Memory stores history, but LLMs don't automatically use it.**

```java
NodeContext ctx = new NodeContext();
LLMNode node = LLMNode.builder()
    .model("gpt-4o")
    .memory("chat")
    .build();

// Turn 1
String r1 = node.process("My name is Alice", ctx);
// Memory now has: [user: "My name is Alice", assistant: "..."]

// Turn 2
String r2 = node.process("What's my name?", ctx);
// MODEL WON'T REMEMBER! (in v0.2)
// The LLM receives ONLY "What's my name?" as a fresh prompt
```

**Why?**
- Memory **stores** the conversation
- LLMNode **appends** to memory
- But the underlying `ChatModel` is **stateless**
- Each `chat()` call is independent

**Workaround for v0.2:**
Users must manually build conversation context:

```java
Memory memory = ctx.memory("chat");
ChatModel model = LLM.of("gpt-4o");

// Replay the stored history into a single prompt...
StringBuilder prompt = new StringBuilder();
for (MemoryEntry entry : memory.entries()) {
    prompt.append(entry.type()).append(": ")
          .append(entry.value()).append("\n");
}

// ...then add the new user turn before the stateless chat() call.
prompt.append("user: What's my name?");

String response = model.chat(prompt.toString());
```

**v0.3 Solution:**
`ChatMemory` will automatically replay conversation history to the LLM.

---

## Public LLM API

### ChatModel (root abstraction)

```java
public interface ChatModel {
    String chat(String input);
}
```

Everything in Oxyjen depends **only** on `ChatModel`.

This allows:
- Real models (OpenAI)
- Fake models (tests)
- Chains (retry + fallback)
- Future providers (Anthropic, local models)
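
Because `ChatModel` has a single method, a test double can be a plain lambda; a minimal sketch of the "fake models" case:

```java
// Deterministic fake for tests: no network, no API key.
ChatModel fake = input -> "echo:" + input;

String out = fake.chat("hello");   // "echo:hello"
```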

---

### LLM.of(...)

Create a model by name:

```java
ChatModel model = LLM.of("gpt-4o");
String out = model.chat("hello");
```

**Validation:**
- Unknown model → `IllegalArgumentException`
- Null / empty → `IllegalArgumentException`
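
A small sketch of handling the rejection (the configuration lookup and fallback choice here are illustrative):

```java
// Hypothetical configuration lookup, shown only for illustration.
String configuredModel = System.getProperty("oxyjen.model", "");

ChatModel model;
try {
    model = LLM.of(configuredModel);   // empty or unknown names are rejected
} catch (IllegalArgumentException e) {
    model = LLM.of("gpt-4o");          // fall back to a known-good default
}
```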

---

### LLM.profile(...)

Profiles map use-cases to models.

**Default profiles:**
- `fast` → `gpt-4o-mini`
- `cheap` → `gpt-3.5-turbo`
- `smart` → `gpt-4o`

```java
ChatModel model = LLM.profile("fast");
```

**Note:** Profiles are runtime configuration, not execution logic.
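
Since a profile resolves to an ordinary `ChatModel`, it can be used anywhere a named model can; for example:

```java
ChatModel fast = LLM.profile("fast");   // gpt-4o-mini by default
ChatModel smart = LLM.of("gpt-4o");

// Downstream code only sees ChatModel, so profiles and explicit names mix freely.
String draft = fast.chat("Draft a one-line description of a note-taking app.");
String review = smart.chat("Improve this draft: " + draft);
```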

---

## LLMNode (Graph Primitive)

`LLMNode` is where Oxyjen differs from LangChain.

**It is:**
- Memory-aware
- Context-driven
- Stateless itself (state lives in `NodeContext`)

```java
LLMNode node = LLMNode.builder()
    .model("gpt-4o")
    .memory("chat")
    .build();

NodeContext ctx = new NodeContext();
String out = node.process("hello", ctx);
```

### What happens internally:

1. User input appended to memory
2. Model invoked (stateless!)
3. Assistant response appended to memory
4. Output returned
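
A rough sketch of that flow (the `SketchNode` class below is illustrative only, not the actual `LLMNode` source):

```java
// Illustrative only: the real LLMNode may differ in details.
final class SketchNode {
    private final ChatModel model;     // set from builder().model(...)
    private final String memoryName;   // set from builder().memory(...)

    SketchNode(ChatModel model, String memoryName) {
        this.model = model;
        this.memoryName = memoryName;
    }

    String process(String input, NodeContext ctx) {
        Memory memory = ctx.memory(memoryName); // resolve the named memory from the context
        memory.append("user", input);           // 1. user input appended to memory
        String output = model.chat(input);      // 2. stateless model call: only `input` is sent
        memory.append("assistant", output);     // 3. assistant response appended to memory
        return output;                          // 4. output returned
    }
}
```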

### Memory after two runs:

Running the node twice with a fake echo model (rather than a real OpenAI call) leaves this history:

```
user → hello
assistant → echo:hello
user → world
assistant → echo:world
```

**This proves nodes are stateful through context, not internally.**

---

## LLMChain (Retry + Fallback Execution)

`LLMChain` composes multiple `ChatModel`s for resilience.

```java
ChatModel chain = LLMChain.builder()
    .primary("gpt-4o")
    .fallback("gpt-3.5-turbo")
    .retry(3)
    .build();
```

### Execution guarantees:

1. **Retries happen per model** (3 attempts on `gpt-4o`, then 3 on `gpt-3.5-turbo`)
2. **Fallback occurs after retries are exhausted**
3. **First successful response short-circuits execution**
4. **Final failure throws `LLMException`**
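
From the caller's point of view, the only failure mode to handle is the terminal one. A brief sketch (assuming `LLMException` is unchecked, since `chat` declares no checked exceptions):

```java
try {
    String out = chain.chat("hello");
    // Success on any attempt, primary or fallback, returns here immediately.
} catch (LLMException e) {
    // Reached only after every retry on the primary and the fallback has failed.
}
```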

---

### Retry Semantics

Retries happen **only** for transient errors:

| Error Type | Retryable? |
|------------|-----------|
| `RateLimitException` | Yes |
| `NetworkException` | Yes |
| `TimeoutException` (v0.3) | Yes |
| `InvalidAPIKeyException` | No |
| `ModelNotFoundException` | No |
| `TokenLimitExceededException` | No |

---

### Backoff

Two strategies:

1. **Fixed backoff**: 1s between retries
2. **Exponential backoff**: 1s, 2s, 4s, 8s...

```java
LLMChain.builder()
    .primary("gpt-4o")
    .retry(3)
    .exponentialBackoff()   // or fixedBackoff()
    .build();
```

**Note:** Backoff affects retry **delay** only, not execution order.

---

### Timeout (v0.2 - Policy Only)

```java
LLMChain.builder()
    .primary("gpt-4o")
    .timeout(Duration.ofSeconds(5))   // recorded as policy, not enforced in v0.2
    .build();
```

**In v0.2:**
- Timeout is **policy only**
- ❌ No enforcement
- ❌ No exceptions
- ❌ No retries triggered

**Enforcement is planned for v0.3.**

---

## Exception Model (Execution Semantics)

Oxyjen uses **explicit exception types** to drive execution:

| Exception | Meaning | Retry? |
|-----------|---------|--------|
| `InvalidAPIKeyException` | Auth failure | No |
| `ModelNotFoundException` | Model invalid | No |
| `TokenLimitExceededException` | Prompt too large | No |
| `RateLimitException` | Transient quota | Yes |
| `NetworkException` | Server failure | Yes |
| `TimeoutException` (v0.3) | Budget exceeded | Yes |

This makes retry & fallback **deterministic and testable**.

---

## transport/openai

### Models Registry (limited in v0.2)

```java
Models.isSupported("gpt-4o"); // true
Models.isSupported("foo");    // false
```

- A minimal internal model registry (`Models`) exists
- Used for validation and error classification
- Not user-extensible in v0.2
- No dynamic registration
- **Planned for v0.3**: `Models.register()` and preflight validation

Note: Model metadata is internal in v0.2 and not part of the public API.

---

### OpenAIClient

**Responsible for:**
1. HTTP request construction
2. JSON parsing (minimal, v0.2)
3. Error classification

**Example error mapping:**
- `401` → `InvalidAPIKeyException`
- `429` → `RateLimitException`
- `5xx` → `NetworkException`
- `400` + "context length" → `TokenLimitExceededException`
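
A hedged sketch of that classification step (the method shape and exception constructors are illustrative, not the actual `OpenAIClient` internals):

```java
// Maps an HTTP status plus response body onto the exception taxonomy above.
// Assumes the exception types are unchecked and accept a message string.
static RuntimeException classify(int status, String body) {
    if (status == 401) return new InvalidAPIKeyException("invalid API key");
    if (status == 429) return new RateLimitException("rate limited");
    if (status >= 500) return new NetworkException("server error: " + status);
    if (status == 400 && body.contains("context length")) {
        return new TokenLimitExceededException("prompt exceeds the model's context length");
    }
    return new LLMException("unclassified error: " + status);
}
```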

---

### OpenAIChatModel

Wraps `OpenAIClient` behind `ChatModel`.

```java
String apiKey = System.getenv("OPENAI_API_KEY");

ChatModel model = LLM.openai("gpt-4o", apiKey);
String out = model.chat("hello");
```

**This allows OpenAI to plug cleanly into:**
- `LLMNode`
- `LLMChain`
- Future graph executors
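
For example, the same instance can back a graph node directly (mirroring the Quick Start below, where the node builder accepts a `ChatModel`):

```java
ChatModel openai = LLM.openai("gpt-4o", System.getenv("OPENAI_API_KEY"));

// The node depends only on the ChatModel interface, so swapping providers later
// does not change the graph wiring.
LLMNode node = LLMNode.builder()
    .model(openai)
    .memory("chat")
    .build();
```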

---

## Stability Guarantees

For **v0.2.x**:
- Public APIs are frozen
- Behavior is deterministic
- Exception model is stable
- Breaking changes only in v0.3

---

## Quick Start Example

```java
import io.oxyjen.core.*;
import io.oxyjen.llm.*;

public class QuickStart {
    public static void main(String[] args) {
        // 1. Build a resilient chain
        ChatModel chain = LLMChain.builder()
            .primary("gpt-4o")
            .fallback("gpt-4o-mini")
            .retry(3)
            .build();

        // 2. Create a graph node
        LLMNode node = LLMNode.builder()
            .model(chain)
            .memory("conversation")
            .build();

        // 3. Execute with context
        NodeContext ctx = new NodeContext();
        String response = node.process("Explain quantum computing", ctx);

        System.out.println(response);

        // 4. Memory persists across calls
        Memory memory = ctx.memory("conversation");
        System.out.println("History size: " + memory.entries().size());
    }
}
```

---

## What's Next?

### v0.3 Roadmap
- Timeout enforcement
- Automatic conversation replay (`ChatMemory`)
- Model registry with pre-validation
- Token counting and management
- Conversation summarization
- Streaming support

---

## Feedback

Found a bug? Have a feature request?

- Open an issue on GitHub
- Join the discussion in Discussions
- Star the repo if you find Oxyjen useful!

---

**Oxyjen v0.2** - Simple. Deterministic. Production-ready graphs for AI.