# LLM Session Integration Guide

**LocalCode-Style Conversation Memory for All ECHO Agents**

This guide shows how to integrate session-based LLM queries (like LocalCode) into all 9 ECHO agents.

## What You Get

✅ **Conversation Memory** - Multi-turn conversations with the last 5 turns kept
✅ **Automatic Context Injection** - Agent role, recent decisions/messages, system status, git context
✅ **Context Size Warnings** - Alerts when context grows large (>4000 tokens)
✅ **Session Management** - Auto-cleanup after 1 hour of inactivity
✅ **Project-Aware Responses** - LLM knows about ECHO architecture and current state

## Architecture

```
Agent MCP Tool: session_consult
        ↓
DecisionHelper.consult_session(role, session_id, question)
        ↓
Session.query(session_id, question, opts)
  ├─ ContextBuilder.build_startup_context(role)
  ├─ Maintains conversation history (last 5 turns)
  └─ Client.chat(model, messages, opts)
```

## How to Add to Any Agent

### Step 1: Add `session_consult` Tool Definition

In your agent's main module (e.g., `apps/echo_ceo/lib/ceo.ex`):

```elixir
defmodule CEO do
  use EchoShared.MCP.Server

  @impl true
  def tools do
    [
      # ... existing tools ...

      # NEW: Session-based AI consultation with conversation memory
      %{
        name: "session_consult",
        description: """
        Query the AI assistant with conversation memory (LocalCode-style).

        Maintains multi-turn conversations with automatic context injection:
        - Your role and responsibilities
        - Recent decisions and messages
        - System status and git context
        - Conversation history (last 5 turns)

        Use this for exploratory questions, decision analysis, or iterative thinking.
        """,
        inputSchema: %{
          type: "object",
          properties: %{
            question: %{
              type: "string",
              description: "The question to ask the AI assistant"
            },
            session_id: %{
              type: "string",
              description: "Session ID to continue conversation (optional, omit for new session)"
            },
            context: %{
              type: "string",
              description: "Additional context for this specific query (optional)"
            }
          },
          required: ["question"]
        }
      }
    ]
  end

  @impl true
  def execute_tool(tool_name, args) do
    case tool_name do
      # ... existing tools ...

      "session_consult" ->
        execute_session_consult(args)

      _ ->
        {:error, "Unknown tool: #{tool_name}"}
    end
  end

  # NEW: Execute session-based consultation
  defp execute_session_consult(args) do
    alias EchoShared.LLM.DecisionHelper

    question = Map.fetch!(args, "question")
    session_id = Map.get(args, "session_id")  # nil for new session
    context = Map.get(args, "context")

    # Build opts
    opts = if context, do: [context: context], else: []

    case DecisionHelper.consult_session(agent_role(), session_id, question, opts) do
      {:ok, result} ->
        # Format response with warnings
        response = format_session_response(result)
        {:ok, response}

      {:error, :llm_disabled} ->
        {:error, "LLM is disabled for #{agent_role()}. Enable with LLM_ENABLED=true"}

      {:error, :session_not_found} ->
        {:error, "Session not found: #{session_id}. " <>
                  "It may have expired."}

      {:error, reason} ->
        {:error, "AI consultation failed: #{inspect(reason)}"}
    end
  end

  defp format_session_response(result) do
    base = %{
      response: result.response,
      session_id: result.session_id,
      turn_count: result.turn_count,
      estimated_tokens: result.total_tokens,
      model: EchoShared.LLM.Config.get_model(agent_role())
    }

    # Add warnings if any
    if result.warnings != [] do
      Map.put(base, :warnings, result.warnings)
    else
      base
    end
  end

  # Helper to get agent role
  defp agent_role, do: :ceo  # Change per agent: :ceo, :cto, :chro, etc.
end
```

### Step 2: Update the `agent_role()` Helper

Make sure each agent returns its correct role:

```elixir
# CEO agent
defp agent_role, do: :ceo

# CTO agent
defp agent_role, do: :cto

# CHRO agent
defp agent_role, do: :chro

# Operations Head agent
defp agent_role, do: :operations_head

# Product Manager agent
defp agent_role, do: :product_manager

# Senior Architect agent
defp agent_role, do: :senior_architect

# UI/UX Engineer agent
defp agent_role, do: :uiux_engineer

# Senior Developer agent
defp agent_role, do: :senior_developer

# Test Lead agent
defp agent_role, do: :test_lead
```

### Step 3: Rebuild the Agent

```bash
cd apps/echo_ceo  # Or whichever agent you're updating
mix deps.get
mix compile
mix escript.build
```

## Usage Examples

### Example 1: New Session Query

```bash
# Via MCP client (Claude Desktop)
{
  "tool": "session_consult",
  "arguments": {
    "question": "What are my top priorities as CEO this quarter?"
  }
}

# Response
{
  "response": "As CEO, your top priorities should be:\n1.
    Strategic planning...",
  "session_id": "ceo_1699564234_123456",
  "turn_count": 1,
  "estimated_tokens": 1876,
  "model": "llama3.1:8b"
}
```

### Example 2: Continue Conversation

```bash
# Follow-up query using session_id
{
  "tool": "session_consult",
  "arguments": {
    "session_id": "ceo_1699564234_123456",
    "question": "Tell me more about priority #2"
  }
}

# Response
{
  "response": "Regarding strategic planning, you should focus on...",
  "session_id": "ceo_1699564234_123456",
  "turn_count": 2,
  "estimated_tokens": 2341,
  "model": "llama3.1:8b"
}
```

### Example 3: With Additional Context

```bash
{
  "tool": "session_consult",
  "arguments": {
    "question": "Should we approve this budget request?",
    "context": "Budget request: $2.5M for new datacenter. Current cash reserves: $10M."
  }
}
```

### Example 4: Context Warning

```bash
# After 8-10 turns...
{
  "response": "Based on our previous discussion...",
  "session_id": "ceo_1699564234_123456",
  "turn_count": 9,
  "estimated_tokens": 4523,
  "model": "llama3.1:8b",
  "warnings": [
    "Session has 9 turns. Consider ending session soon.",
    "Context size large (4523 tokens). Session approaching limit."
  ]
}
```

## What Gets Injected Automatically

When you start a session, the LLM automatically receives:

### 1. Project Overview (~400 tokens)
- ECHO architecture explanation
- 9 agent roles
- Technology stack (Elixir, PostgreSQL, Redis, MCP)
- Decision modes (Autonomous, Collaborative, Hierarchical, Human)
### 2. Agent Role Context (~300 tokens)
- Your title (e.g., "Chief Executive Officer")
- Your responsibilities (e.g., "Strategic leadership", "Budget approvals")
- Your authority limits (e.g., "Can approve up to $1M autonomously")
- Your key collaborators (e.g., [:cto, :chro, :operations_head])

### 3. System Status (~200 tokens)
- PostgreSQL status
- Redis status
- Ollama status
- Active agents count

### 4. Recent Activity (~500-800 tokens)
- Last 5 decisions you initiated
- Last 5 messages to/from you

### 5. Git Context (~100 tokens)
- Current branch
- Last commit

### 6. Conversation History (~500-2000 tokens, grows over time)
- Last 5 conversation turns
- Your questions + AI responses

**Total startup context:** ~1,500-2,000 tokens
**After 5 turns:** ~3,000-4,000 tokens
**Warning threshold:** 4,000 tokens
**Limit:** 6,000 tokens (session restart recommended)

## Session Management

### Automatic Cleanup

Sessions are automatically cleaned up after:
- **1 hour of inactivity** (no queries)
- Configurable in `apps/echo_shared/config/dev.exs`

### Manual Session Control

```elixir
# List all active sessions
EchoShared.LLM.Session.list_sessions()
# => [
#      %{session_id: "ceo_...", agent_role: :ceo, turn_count: 5, ...},
#      %{session_id: "cto_...", agent_role: :cto, turn_count: 2, ...}
#    ]

# Get session details
EchoShared.LLM.Session.get_session("ceo_1699564234_123456")
# => %{session_id: ..., conversation_history: [...], ...}

# End session manually
EchoShared.LLM.Session.end_session("ceo_1699564234_123456")
# => {:ok, archived_conversation}
```

## Configuration

All configuration lives in `apps/echo_shared/config/dev.exs`:

```elixir
# LLM Session configuration
config :echo_shared, :llm_session,
  max_turns: 5,                  # Conversation history depth
  timeout_ms: 3_600_000,         # 1 hour inactivity timeout
  cleanup_interval_ms: 900_000,  # Cleanup every 15 minutes
  warning_threshold: 4_000,      # Warn at 4K tokens
  limit_threshold: 6_000         # Critical at 6K tokens

# Agent-specific models
config :echo_shared, :agent_models, %{
  ceo: "llama3.1:8b",
  cto: "deepseek-coder:6.7b",
  chro: "llama3.1:8b",
  operations_head: "mistral:7b",
  product_manager: "llama3.1:8b",
  senior_architect: "deepseek-coder:6.7b",
  uiux_engineer: "llama3.1:8b",
  senior_developer: "deepseek-coder:6.7b",
  test_lead: "deepseek-coder:6.7b"
}
```

### Override via Environment Variables

```bash
# Change CEO's model
export CEO_MODEL=qwen2.5:14b

# Disable LLM for a specific agent
export CEO_LLM_ENABLED=false

# Change Ollama endpoint
export OLLAMA_ENDPOINT=http://192.168.1.100:11434
```

## Comparison: LocalCode vs Agent Session Integration

| Feature | LocalCode (Bash) | Agent LLM (Elixir) |
|---------|------------------|--------------------|
| **Session Management** | ✅ File-based | ✅ ETS-based (in-memory) |
| **Context Injection** | ✅ CLAUDE.md + git + status | ✅ Role + decisions + messages + git |
| **Conversation Memory** | ✅ Last 5 turns | ✅ Last 5 turns |
| **Context Warnings** | ✅ Yes | ✅ Yes |
| **Auto-Cleanup** | ❌ Manual (`lc_end`) | ✅ Automatic (1 hour timeout) |
| **Tool Simulation** | ✅ Yes (bash) | ❌ No (could add) |
| **Model** | deepseek-coder:6.7b | Role-specific (9 models) |
| **Response Time** | 7-30 seconds | 7-30 seconds |
| **Use Case** | CLI development assistant | Agent decision support |

## Testing

### Unit Test Example

```elixir
defmodule CEOTest do
  use ExUnit.Case

  describe "session_consult tool" do
    test "starts new session and returns response" do
      args = %{"question" => "What should I prioritize?"}

      assert {:ok, result} = CEO.execute_tool("session_consult", args)
      assert is_binary(result.response)
      assert is_binary(result.session_id)
      assert result.turn_count == 1
      assert result.model == "llama3.1:8b"
    end

    test "continues existing session" do
      # First query
      {:ok, result1} = CEO.execute_tool("session_consult", %{
        "question" => "What's my role?"
      })

      # Second query with session_id
      {:ok, result2} = CEO.execute_tool("session_consult", %{
        "session_id" => result1.session_id,
        "question" => "Tell me more"
      })

      assert result2.session_id == result1.session_id
      assert result2.turn_count == 2
    end

    test "returns error for invalid session" do
      args = %{
        "session_id" => "invalid_session_123",
        "question" => "Test"
      }

      assert {:error, message} = CEO.execute_tool("session_consult", args)
      assert message =~ "Session not found"
    end
  end
end
```

### Integration Test

```bash
# Start agent in autonomous mode
cd apps/echo_ceo
./ceo --autonomous &
CEO_PID=$!

# Test session_consult via IEx
iex -S mix

iex> alias EchoShared.LLM.DecisionHelper
iex> {:ok, r1} = DecisionHelper.consult_session(:ceo, nil, "What's my role?")
iex> IO.puts(r1.response)
iex> {:ok, r2} = DecisionHelper.consult_session(:ceo, r1.session_id, "What are my priorities?")
iex> IO.puts(r2.response)
iex> EchoShared.LLM.Session.end_session(r1.session_id)

# Cleanup
kill $CEO_PID
```

## Troubleshooting

### "LLM is disabled"

```bash
# Enable globally
export LLM_ENABLED=true

# Or per-agent
export CEO_LLM_ENABLED=true
```

### "Session not found"

Sessions expire after 1 hour of inactivity. Start a new session by omitting `session_id`.
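Since expiry is routine, agent code that reuses a stored `session_id` can treat `{:error, :session_not_found}` as a cue to start fresh. A minimal sketch, built on the `DecisionHelper.consult_session` API above (the `consult_or_restart/3` helper name is hypothetical, not part of `EchoShared`):

```elixir
# Hypothetical helper: retry with a fresh session when the stored
# session_id has expired.
defp consult_or_restart(role, session_id, question) do
  alias EchoShared.LLM.DecisionHelper

  case DecisionHelper.consult_session(role, session_id, question) do
    {:error, :session_not_found} ->
      # Session expired; nil session_id starts a new one
      DecisionHelper.consult_session(role, nil, question)

    other ->
      other
  end
end
```

After a restart, remember to store the new `result.session_id`; the expired one is gone for good.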
### "Failed to get response from Ollama"

```bash
# Check Ollama is running
curl http://localhost:11434/api/tags

# Check the model is installed
ollama list | grep llama3.1

# Pull the model if missing
ollama pull llama3.1:8b
```

### Slow responses (>60 seconds)

```bash
# Use a smaller/faster model
export CEO_MODEL=deepseek-coder:1.3b

# Or increase the timeout in client.ex (default: 180 seconds)
```

## Next Steps

1. **Add to all 9 agents** - Copy the pattern to each agent module
2. **Test each agent** - Verify the LLM responds correctly for each role
3. **Monitor performance** - Track response times and context sizes
4. **Optimize prompts** - Refine system prompts in `Config.get_system_prompt/1`
5. **Add tool simulation** (optional) - Similar to LocalCode's tool detection

## Related Files

- `apps/echo_shared/lib/echo_shared/llm/session.ex` - Session manager
- `apps/echo_shared/lib/echo_shared/llm/context_builder.ex` - Context injection
- `apps/echo_shared/lib/echo_shared/llm/decision_helper.ex` - High-level API
- `apps/echo_shared/lib/echo_shared/llm/config.ex` - Model configuration
- `apps/echo_shared/lib/echo_shared/llm/client.ex` - Ollama HTTP client
- `apps/echo_shared/config/dev.exs` - Configuration

## Summary

You now have **LocalCode-style conversation memory** for all ECHO agents! Each agent can:

✅ Hold multi-turn conversations with project context
✅ Remember the last 5 turns automatically
✅ Get warnings when context grows large
✅ Shed inactive sessions automatically
✅ Use role-specific specialized models

**Integration effort:** ~50 lines of code per agent
**Benefits:** Project-aware AI assistance for autonomous decision-making
**Response quality:** Comparable to LocalCode, specialized per agent role

Happy integrating! 🚀