# LLM Session Integration Guide

**LocalCode-Style Conversation Memory for All ECHO Agents**

This guide shows how to integrate session-based LLM queries (like LocalCode) into all 9 ECHO agents.

## What You Get

✅ **Conversation Memory** - Multi-turn conversations with the last 5 turns kept
✅ **Automatic Context Injection** - Agent role, recent decisions/messages, system status, git context
✅ **Context Size Warnings** - Alerts when context grows large (>4,000 tokens)
✅ **Session Management** - Auto-cleanup after 1 hour of inactivity
✅ **Project-Aware Responses** - The LLM knows about ECHO's architecture and current state

## Architecture

```
Agent MCP Tool: session_consult
        ↓
DecisionHelper.consult_session(role, session_id, question)
        ↓
Session.query(session_id, question, opts)
        ├─ ContextBuilder.build_startup_context(role)
        ├─ Maintains conversation history (last 5 turns)
        └─ Client.chat(model, messages, opts)
```
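
The MCP tool is a thin wrapper around `DecisionHelper.consult_session/4`, so the whole chain can be exercised directly from IEx (assumes `LLM_ENABLED=true` and a reachable Ollama instance):

```elixir
# Walk the chain above directly: DecisionHelper -> Session -> Client.
{:ok, result} =
  EchoShared.LLM.DecisionHelper.consult_session(:ceo, nil, "What's on my plate?")

result.session_id  # pass this back in to continue the conversation
result.turn_count  # 1 for a fresh session
```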

## How to Add to Any Agent

### Step 1: Add `session_consult` Tool Definition

In your agent's main module (e.g., `apps/echo_ceo/lib/ceo.ex`):

```elixir
defmodule CEO do
  use EchoShared.MCP.Server

  @impl true
  def tools do
    [
      # ... existing tools ...

      # NEW: Session-based AI consultation with conversation memory
      %{
        name: "session_consult",
        description: """
        Query the AI assistant with conversation memory (LocalCode-style).

        Maintains multi-turn conversations with automatic context injection:
        - Your role and responsibilities
        - Recent decisions and messages
        - System status and git context
        - Conversation history (last 5 turns)

        Use this for exploratory questions, decision analysis, or iterative thinking.
        """,
        inputSchema: %{
          type: "object",
          properties: %{
            question: %{
              type: "string",
              description: "The question to ask the AI assistant"
            },
            session_id: %{
              type: "string",
              description: "Session ID to continue a conversation (optional; omit for a new session)"
            },
            context: %{
              type: "string",
              description: "Additional context for this specific query (optional)"
            }
          },
          required: ["question"]
        }
      }
    ]
  end

  @impl true
  def execute_tool(tool_name, args) do
    case tool_name do
      # ... existing tools ...

      "session_consult" ->
        execute_session_consult(args)

      _ ->
        {:error, "Unknown tool: #{tool_name}"}
    end
  end

  # NEW: Execute session-based consultation
  defp execute_session_consult(args) do
    alias EchoShared.LLM.DecisionHelper

    question = Map.fetch!(args, "question")
    session_id = Map.get(args, "session_id")  # nil for a new session
    context = Map.get(args, "context")

    # Pass per-query context through to the session, if provided
    opts = if context, do: [context: context], else: []

    case DecisionHelper.consult_session(agent_role(), session_id, question, opts) do
      {:ok, result} ->
        # Format response with warnings
        response = format_session_response(result)
        {:ok, response}

      {:error, :llm_disabled} ->
        {:error, "LLM is disabled for #{agent_role()}. Enable with LLM_ENABLED=true"}

      {:error, :session_not_found} ->
        {:error, "Session not found: #{session_id}. It may have expired."}

      {:error, reason} ->
        {:error, "AI consultation failed: #{inspect(reason)}"}
    end
  end

  defp format_session_response(result) do
    base = %{
      response: result.response,
      session_id: result.session_id,
      turn_count: result.turn_count,
      estimated_tokens: result.total_tokens,
      model: EchoShared.LLM.Config.get_model(agent_role())
    }

    # Attach warnings only when present
    if result.warnings != [] do
      Map.put(base, :warnings, result.warnings)
    else
      base
    end
  end

  # Helper to get the agent's role
  defp agent_role, do: :ceo  # Change per agent: :ceo, :cto, :chro, etc.
end
```

### Step 2: Update the `agent_role()` Helper

Make sure each agent returns its correct role:

```elixir
# CEO agent
defp agent_role, do: :ceo

# CTO agent
defp agent_role, do: :cto

# CHRO agent
defp agent_role, do: :chro

# Operations Head agent
defp agent_role, do: :operations_head

# Product Manager agent
defp agent_role, do: :product_manager

# Senior Architect agent
defp agent_role, do: :senior_architect

# UI/UX Engineer agent
defp agent_role, do: :uiux_engineer

# Senior Developer agent
defp agent_role, do: :senior_developer

# Test Lead agent
defp agent_role, do: :test_lead
```
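
If hand-editing nine modules feels error-prone, one alternative (a sketch, assuming you control `EchoShared.MCP.Server` and can extend its `__using__/1` macro) is to pass the role as a `use` option so each agent declares it exactly once:

```elixir
# Hypothetical sketch: accept the role as a `use` option and generate
# agent_role/0 at compile time, so it cannot drift from the module.
defmodule EchoShared.MCP.Server do
  defmacro __using__(opts) do
    role = Keyword.fetch!(opts, :agent_role)

    quote do
      # ... existing server plumbing ...
      defp agent_role, do: unquote(role)
    end
  end
end

defmodule CTO do
  use EchoShared.MCP.Server, agent_role: :cto
  # agent_role/0 is now generated; no per-module helper needed
end
```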

### Step 3: Rebuild Agent

```bash
cd apps/echo_ceo  # Or whichever agent you're updating
mix deps.get
mix compile
mix escript.build
```

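Before wiring up an MCP client, you can smoke-test the new tool straight from IEx (the response shape follows `format_session_response/1` above):

```elixir
# In `iex -S mix` inside the agent's app:
{:ok, resp} = CEO.execute_tool("session_consult", %{"question" => "Ping"})
IO.puts(resp.response)
IO.puts(resp.session_id)  # reuse this ID to continue the conversation
```
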
## Usage Examples

### Example 1: New Session Query

```bash
# Via MCP client (Claude Desktop)
{
  "tool": "session_consult",
  "arguments": {
    "question": "What are my top priorities as CEO this quarter?"
  }
}

# Response
{
  "response": "As CEO, your top priorities should be:\n1. Strategic planning...",
  "session_id": "ceo_1699564234_123456",
  "turn_count": 1,
  "estimated_tokens": 1876,
  "model": "llama3.1:8b"
}
```

### Example 2: Continue Conversation

```bash
# Follow-up query using session_id
{
  "tool": "session_consult",
  "arguments": {
    "session_id": "ceo_1699564234_123456",
    "question": "Tell me more about priority #2"
  }
}

# Response
{
  "response": "Regarding strategic planning, you should focus on...",
  "session_id": "ceo_1699564234_123456",
  "turn_count": 2,
  "estimated_tokens": 2341,
  "model": "llama3.1:8b"
}
```

### Example 3: With Additional Context

```bash
{
  "tool": "session_consult",
  "arguments": {
    "question": "Should we approve this budget request?",
    "context": "Budget request: $2.5M for new datacenter. Current cash reserves: $10M."
  }
}
```

### Example 4: Context Warning

```bash
# After 8-10 turns...
{
  "response": "Based on our previous discussion...",
  "session_id": "ceo_1699564234_123456",
  "turn_count": 9,
  "estimated_tokens": 4523,
  "model": "llama3.1:8b",
  "warnings": [
    "Session has 9 turns. Consider ending session soon.",
    "Context size large (4523 tokens). Session approaching limit."
  ]
}
```
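
When the `warnings` field appears, a caller can rotate to a fresh session rather than letting context keep growing. A minimal sketch, using the result map returned by `DecisionHelper.consult_session`:

```elixir
# Minimal sketch: end a session that is approaching its context limit,
# and continue with a fresh one on the next query.
{:ok, result} = EchoShared.LLM.DecisionHelper.consult_session(:ceo, session_id, question)

next_session_id =
  if Map.get(result, :warnings, []) == [] do
    result.session_id
  else
    EchoShared.LLM.Session.end_session(result.session_id)
    nil  # omit the session_id next time to start fresh
  end
```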

## What Gets Injected Automatically

When you start a session, the LLM automatically receives:

### 1. Project Overview (~400 tokens)
- ECHO architecture explanation
- The 9 agent roles
- Technology stack (Elixir, PostgreSQL, Redis, MCP)
- Decision modes (Autonomous, Collaborative, Hierarchical, Human)

### 2. Agent Role Context (~300 tokens)
- Your title (e.g., "Chief Executive Officer")
- Your responsibilities (e.g., "Strategic leadership", "Budget approvals")
- Your authority limits (e.g., "Can approve up to $1M autonomously")
- Your key collaborators (e.g., `[:cto, :chro, :operations_head]`)

### 3. System Status (~200 tokens)
- PostgreSQL status
- Redis status
- Ollama status
- Active agent count

### 4. Recent Activity (~500-800 tokens)
- Last 5 decisions you initiated
- Last 5 messages to/from you

### 5. Git Context (~100 tokens)
- Current branch
- Last commit

### 6. Conversation History (~500-2,000 tokens, grows over time)
- Last 5 conversation turns
- Your questions + AI responses

**Total startup context:** ~1,500-2,000 tokens
**After 5 turns:** ~3,000-4,000 tokens
**Warning threshold:** 4,000 tokens
**Limit:** 6,000 tokens (session restart recommended)

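The startup total follows from summing the per-section estimates: 400 + 300 + 200 + 650 (midpoint of recent activity) + 100 ≈ 1,650 tokens, inside the ~1,500-2,000 range above. If you want to sanity-check sizes yourself, a common rough heuristic (an assumption here, not necessarily what `Session` uses internally) is about 4 characters per token:

```elixir
# Rough token estimate for an injected context string.
# The 4-chars-per-token ratio is a common heuristic for English text,
# not the exact method the Session module uses (assumption).
estimate_tokens = fn text -> div(String.length(text), 4) end

estimate_tokens.(String.duplicate("ECHO agent context. ", 330))
# => 1650
```
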
## Session Management

### Automatic Cleanup

Sessions are cleaned up automatically after **1 hour of inactivity** (no queries). The timeout is configurable in `apps/echo_shared/config/dev.exs`.

### Manual Session Control

```elixir
# List all active sessions
EchoShared.LLM.Session.list_sessions()
# => [
#   %{session_id: "ceo_...", agent_role: :ceo, turn_count: 5, ...},
#   %{session_id: "cto_...", agent_role: :cto, turn_count: 2, ...}
# ]

# Get session details
EchoShared.LLM.Session.get_session("ceo_1699564234_123456")
# => %{session_id: ..., conversation_history: [...], ...}

# End a session manually
EchoShared.LLM.Session.end_session("ceo_1699564234_123456")
# => {:ok, archived_conversation}
```

## Configuration

All configuration lives in `apps/echo_shared/config/dev.exs`:

```elixir
# LLM session configuration
config :echo_shared, :llm_session,
  max_turns: 5,                    # Conversation history depth
  timeout_ms: 3_600_000,           # 1 hour inactivity timeout
  cleanup_interval_ms: 900_000,    # Cleanup every 15 minutes
  warning_threshold: 4_000,        # Warn at 4K tokens
  limit_threshold: 6_000           # Critical at 6K tokens

# Agent-specific models
config :echo_shared, :agent_models, %{
  ceo: "llama3.1:8b",
  cto: "deepseek-coder:6.7b",
  chro: "llama3.1:8b",
  operations_head: "mistral:7b",
  product_manager: "llama3.1:8b",
  senior_architect: "deepseek-coder:6.7b",
  uiux_engineer: "llama3.1:8b",
  senior_developer: "deepseek-coder:6.7b",
  test_lead: "deepseek-coder:6.7b"
}
```
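
At runtime these are ordinary application env values, so any module can read them with `Application.get_env/3` (a sketch of the lookup, not necessarily the exact code in `Config`):

```elixir
# Read the session settings with fallbacks (sketch).
session_config = Application.get_env(:echo_shared, :llm_session, [])
max_turns = Keyword.get(session_config, :max_turns, 5)

# Look up an agent's model from the map configured above.
models = Application.get_env(:echo_shared, :agent_models, %{})
Map.get(models, :cto)
# => "deepseek-coder:6.7b"
```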

### Override via Environment Variables

```bash
# Change CEO's model
export CEO_MODEL=qwen2.5:14b

# Disable LLM for a specific agent
export CEO_LLM_ENABLED=false

# Change Ollama endpoint
export OLLAMA_ENDPOINT=http://192.168.1.100:11434
```
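
These overrides imply a role-to-variable naming convention (`CEO_MODEL`, `CTO_MODEL`, ...). A sketch of how `Config.get_model/1` might resolve a model under that convention, with the env var taking precedence — this is an assumption about the lookup order, not the verified implementation:

```elixir
defmodule ModelLookup do
  # Hypothetical resolution order (assumption): env var > config map > default.
  def get_model(role) do
    env_var = String.upcase(Atom.to_string(role)) <> "_MODEL"

    System.get_env(env_var) ||
      Application.get_env(:echo_shared, :agent_models, %{})[role] ||
      "llama3.1:8b"
  end
end

# With CEO_MODEL=qwen2.5:14b exported, the env var wins over the config map:
ModelLookup.get_model(:ceo)
# => "qwen2.5:14b"
```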

## Comparison: LocalCode vs. Agent Session Integration

| Feature | LocalCode (Bash) | Agent LLM (Elixir) |
|---------|------------------|--------------------|
| **Session Management** | ✅ File-based | ✅ ETS-based (in-memory) |
| **Context Injection** | ✅ CLAUDE.md + git + status | ✅ Role + decisions + messages + git |
| **Conversation Memory** | ✅ Last 5 turns | ✅ Last 5 turns |
| **Context Warnings** | ✅ Yes | ✅ Yes |
| **Auto-Cleanup** | ❌ Manual (`lc_end`) | ✅ Automatic (1-hour timeout) |
| **Tool Simulation** | ✅ Yes (bash) | ❌ No (could be added) |
| **Model** | deepseek-coder:6.7b | Role-specific (9 models) |
| **Response Time** | 7-30 seconds | 7-30 seconds |
| **Use Case** | CLI development assistant | Agent decision support |

## Testing

### Unit Test Example

```elixir
defmodule CEOTest do
  use ExUnit.Case

  describe "session_consult tool" do
    test "starts a new session and returns a response" do
      args = %{"question" => "What should I prioritize?"}

      assert {:ok, result} = CEO.execute_tool("session_consult", args)
      assert is_binary(result.response)
      assert is_binary(result.session_id)
      assert result.turn_count == 1
      assert result.model == "llama3.1:8b"
    end

    test "continues an existing session" do
      # First query
      {:ok, result1} = CEO.execute_tool("session_consult", %{
        "question" => "What's my role?"
      })

      # Second query with session_id
      {:ok, result2} = CEO.execute_tool("session_consult", %{
        "session_id" => result1.session_id,
        "question" => "Tell me more"
      })

      assert result2.session_id == result1.session_id
      assert result2.turn_count == 2
    end

    test "returns an error for an invalid session" do
      args = %{
        "session_id" => "invalid_session_123",
        "question" => "Test"
      }

      assert {:error, message} = CEO.execute_tool("session_consult", args)
      assert message =~ "Session not found"
    end
  end
end
```

### Integration Test

```bash
# Start the agent in autonomous mode
cd apps/echo_ceo
./ceo --autonomous &
CEO_PID=$!

# Test session_consult via IEx
iex -S mix
```

```elixir
iex> alias EchoShared.LLM.DecisionHelper
iex> {:ok, r1} = DecisionHelper.consult_session(:ceo, nil, "What's my role?")
iex> IO.puts(r1.response)
iex> {:ok, r2} = DecisionHelper.consult_session(:ceo, r1.session_id, "What are my priorities?")
iex> IO.puts(r2.response)
iex> EchoShared.LLM.Session.end_session(r1.session_id)
```

```bash
# Cleanup
kill $CEO_PID
```

## Troubleshooting

### "LLM is disabled"

```bash
# Enable globally
export LLM_ENABLED=true

# Or per agent
export CEO_LLM_ENABLED=true
```

### "Session not found"

Sessions expire after 1 hour of inactivity. Start a new session by omitting `session_id`.
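
Callers can also recover programmatically. A minimal sketch: on `:session_not_found`, retry once with a fresh session:

```elixir
# Fall back to a new session when the old one has expired (sketch).
alias EchoShared.LLM.DecisionHelper

case DecisionHelper.consult_session(:ceo, session_id, question) do
  {:error, :session_not_found} ->
    # Passing nil starts a fresh session with full context injection
    DecisionHelper.consult_session(:ceo, nil, question)

  other ->
    other
end
```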

### "Failed to get response from Ollama"

```bash
# Check that Ollama is running
curl http://localhost:11434/api/tags

# Check that the model is installed
ollama list | grep llama3.1

# Pull the model if it's missing
ollama pull llama3.1:8b
```

### Slow responses (>60 seconds)

```bash
# Use a smaller/faster model
export CEO_MODEL=deepseek-coder:1.3b

# Or increase the timeout in client.ex (default: 180 seconds)
```

## Next Steps

1. **Add to all 9 agents** - Copy the pattern into each agent module
2. **Test each agent** - Verify the LLM responds correctly for each role
3. **Monitor performance** - Track response times and context sizes
4. **Optimize prompts** - Refine system prompts in `Config.get_system_prompt/1`
5. **Add tool simulation** (optional) - Similar to LocalCode's tool detection

## Related Files

- `apps/echo_shared/lib/echo_shared/llm/session.ex` - Session manager
- `apps/echo_shared/lib/echo_shared/llm/context_builder.ex` - Context injection
- `apps/echo_shared/lib/echo_shared/llm/decision_helper.ex` - High-level API
- `apps/echo_shared/lib/echo_shared/llm/config.ex` - Model configuration
- `apps/echo_shared/lib/echo_shared/llm/client.ex` - Ollama HTTP client
- `apps/echo_shared/config/dev.exs` - Configuration

## Summary

You now have **LocalCode-style conversation memory** for all ECHO agents! Each agent can:

✅ Hold multi-turn conversations with project context
✅ Remember the last 5 turns automatically
✅ Receive warnings when context grows large
✅ Have inactive sessions cleaned up automatically
✅ Use a role-specific, specialized model

**Integration effort:** ~50 lines of code per agent
**Benefits:** Project-aware AI assistance for autonomous decision-making
**Response quality:** Comparable to LocalCode, specialized per agent role

Happy integrating! 🚀