Cradicle Explorer

/ AGENTS.md
AGENTS.md
  1  # Hermes Agent - Development Guide
  2  
  3  Instructions for AI coding assistants and developers working on the hermes-agent codebase.
  4  
  5  ## Development Environment
  6  
  7  ```bash
  8  # Prefer .venv; fall back to venv if that's what your checkout has.
  9  source .venv/bin/activate   # or: source venv/bin/activate
 10  ```
 11  
 12  `scripts/run_tests.sh` probes `.venv` first, then `venv`, then
 13  `$HOME/.hermes/hermes-agent/venv` (for worktrees that share a venv with the
 14  main checkout).
 15  
 16  ## Project Structure
 17  
 18  File counts shift constantly — don't treat the tree below as exhaustive.
 19  The canonical source is the filesystem. The notes call out the load-bearing
 20  entry points you'll actually edit.
 21  
 22  ```
 23  hermes-agent/
 24  ├── run_agent.py          # AIAgent class — core conversation loop (~12k LOC)
 25  ├── model_tools.py        # Tool orchestration, discover_builtin_tools(), handle_function_call()
 26  ├── toolsets.py           # Toolset definitions, _HERMES_CORE_TOOLS list
 27  ├── cli.py                # HermesCLI class — interactive CLI orchestrator (~11k LOC)
 28  ├── hermes_state.py       # SessionDB — SQLite session store (FTS5 search)
 29  ├── hermes_constants.py   # get_hermes_home(), display_hermes_home() — profile-aware paths
 30  ├── hermes_logging.py     # setup_logging() — agent.log / errors.log / gateway.log (profile-aware)
 31  ├── batch_runner.py       # Parallel batch processing
 32  ├── agent/                # Agent internals (provider adapters, memory, caching, compression, etc.)
 33  ├── hermes_cli/           # CLI subcommands, setup wizard, plugins loader, skin engine
 34  ├── tools/                # Tool implementations — auto-discovered via tools/registry.py
 35  │   └── environments/     # Terminal backends (local, docker, ssh, modal, daytona, singularity)
 36  ├── gateway/              # Messaging gateway — run.py + session.py + platforms/
 37  │   ├── platforms/        # Adapter per platform (telegram, discord, slack, whatsapp,
 38  │   │                     #   homeassistant, signal, matrix, mattermost, email, sms,
 39  │   │                     #   dingtalk, wecom, weixin, feishu, qqbot, bluebubbles,
 40  │   │                     #   webhook, api_server, ...). See ADDING_A_PLATFORM.md.
 41  │   └── builtin_hooks/    # Extension point for always-registered gateway hooks (none shipped)
 42  ├── plugins/              # Plugin system (see "Plugins" section below)
 43  │   ├── memory/           # Memory-provider plugins (honcho, mem0, supermemory, ...)
 44  │   ├── context_engine/   # Context-engine plugins
 45  │   └── <others>/         # Dashboard, image-gen, disk-cleanup, examples, ...
 46  ├── optional-skills/      # Heavier/niche skills shipped but NOT active by default
 47  ├── skills/               # Built-in skills bundled with the repo
 48  ├── ui-tui/               # Ink (React) terminal UI — `hermes --tui`
 49  │   └── src/              # entry.tsx, app.tsx, gatewayClient.ts + app/components/hooks/lib
 50  ├── tui_gateway/          # Python JSON-RPC backend for the TUI
 51  ├── acp_adapter/          # ACP server (VS Code / Zed / JetBrains integration)
 52  ├── cron/                 # Scheduler — jobs.py, scheduler.py
 53  ├── environments/         # RL training environments (Atropos)
 54  ├── scripts/              # run_tests.sh, release.py, auxiliary scripts
 55  ├── website/              # Docusaurus docs site
 56  └── tests/                # Pytest suite (~15k tests across ~700 files as of Apr 2026)
 57  ```
 58  
 59  **User config:** `~/.hermes/config.yaml` (settings), `~/.hermes/.env` (API keys only).
 60  **Logs:** `~/.hermes/logs/` — `agent.log` (INFO+), `errors.log` (WARNING+),
 61  `gateway.log` when running the gateway. Profile-aware via `get_hermes_home()`.
 62  Browse with `hermes logs [--follow] [--level ...] [--session ...]`.
 63  
 64  ## File Dependency Chain
 65  
 66  ```
 67  tools/registry.py  (no deps — imported by all tool files)
 68         ↑
 69  tools/*.py  (each calls registry.register() at import time)
 70         ↑
 71  model_tools.py  (imports tools/registry + triggers tool discovery)
 72         ↑
 73  run_agent.py, cli.py, batch_runner.py, environments/
 74  ```
 75  
 76  ---
 77  
 78  ## AIAgent Class (run_agent.py)
 79  
 80  The real `AIAgent.__init__` takes ~60 parameters (credentials, routing, callbacks,
 81  session context, budget, credential pool, etc.). The signature below is the
 82  minimum subset you'll usually touch — read `run_agent.py` for the full list.
 83  
 84  ```python
 85  class AIAgent:
 86      def __init__(self,
 87          base_url: str = None,
 88          api_key: str = None,
 89          provider: str = None,
 90          api_mode: str = None,              # "chat_completions" | "codex_responses" | ...
 91          model: str = "",                   # empty → resolved from config/provider later
 92          max_iterations: int = 90,          # tool-calling iterations (shared with subagents)
 93          enabled_toolsets: list = None,
 94          disabled_toolsets: list = None,
 95          quiet_mode: bool = False,
 96          save_trajectories: bool = False,
 97          platform: str = None,              # "cli", "telegram", etc.
 98          session_id: str = None,
 99          skip_context_files: bool = False,
100          skip_memory: bool = False,
101          credential_pool=None,
102          # ... plus callbacks, thread/user/chat IDs, iteration_budget, fallback_model,
103          # checkpoints config, prefill_messages, service_tier, reasoning_config, etc.
104      ): ...
105  
106      def chat(self, message: str) -> str:
107          """Simple interface — returns final response string."""
108  
109      def run_conversation(self, user_message: str, system_message: str = None,
110                           conversation_history: list = None, task_id: str = None) -> dict:
111          """Full interface — returns dict with final_response + messages."""
112  ```
113  
114  ### Agent Loop
115  
116  The core loop is inside `run_conversation()` — entirely synchronous, with
117  interrupt checks, budget tracking, and a one-turn grace call:
118  
119  ```python
120  while (api_call_count < self.max_iterations and self.iteration_budget.remaining > 0) \
121          or self._budget_grace_call:
122      if self._interrupt_requested: break
123      response = client.chat.completions.create(model=model, messages=messages, tools=tool_schemas)
124      if response.tool_calls:
125          for tool_call in response.tool_calls:
126              result = handle_function_call(tool_call.name, tool_call.args, task_id)
127              messages.append(tool_result_message(result))
128          api_call_count += 1
129      else:
130          return response.content
131  ```
132  
133  Messages follow OpenAI format: `{"role": "system/user/assistant/tool", ...}`.
134  Reasoning content is stored in `assistant_msg["reasoning"]`.
135  
136  ---
137  
138  ## CLI Architecture (cli.py)
139  
140  - **Rich** for banner/panels, **prompt_toolkit** for input with autocomplete
141  - **KawaiiSpinner** (`agent/display.py`) — animated faces during API calls, `┊` activity feed for tool results
142  - `load_cli_config()` in cli.py merges hardcoded defaults + user config YAML
143  - **Skin engine** (`hermes_cli/skin_engine.py`) — data-driven CLI theming; initialized from `display.skin` config key at startup; skins customize banner colors, spinner faces/verbs/wings, tool prefix, response box, branding text
144  - `process_command()` is a method on `HermesCLI` — dispatches on canonical command name resolved via `resolve_command()` from the central registry
145  - Skill slash commands: `agent/skill_commands.py` scans `~/.hermes/skills/`, injects as **user message** (not system prompt) to preserve prompt caching
146  
147  ### Slash Command Registry (`hermes_cli/commands.py`)
148  
149  All slash commands are defined in a central `COMMAND_REGISTRY` list of `CommandDef` objects. Every downstream consumer derives from this registry automatically:
150  
151  - **CLI** — `process_command()` resolves aliases via `resolve_command()`, dispatches on canonical name
152  - **Gateway** — `GATEWAY_KNOWN_COMMANDS` frozenset for hook emission, `resolve_command()` for dispatch
153  - **Gateway help** — `gateway_help_lines()` generates `/help` output
154  - **Telegram** — `telegram_bot_commands()` generates the BotCommand menu
155  - **Slack** — `slack_subcommand_map()` generates `/hermes` subcommand routing
156  - **Autocomplete** — `COMMANDS` flat dict feeds `SlashCommandCompleter`
157  - **CLI help** — `COMMANDS_BY_CATEGORY` dict feeds `show_help()`
158  
159  ### Adding a Slash Command
160  
161  1. Add a `CommandDef` entry to `COMMAND_REGISTRY` in `hermes_cli/commands.py`:
162  ```python
163  CommandDef("mycommand", "Description of what it does", "Session",
164             aliases=("mc",), args_hint="[arg]"),
165  ```
166  2. Add handler in `HermesCLI.process_command()` in `cli.py`:
167  ```python
168  elif canonical == "mycommand":
169      self._handle_mycommand(cmd_original)
170  ```
171  3. If the command is available in the gateway, add a handler in `gateway/run.py`:
172  ```python
173  if canonical == "mycommand":
174      return await self._handle_mycommand(event)
175  ```
176  4. For persistent settings, use `save_config_value()` in `cli.py`
177  
178  **CommandDef fields:**
179  - `name` — canonical name without slash (e.g. `"background"`)
180  - `description` — human-readable description
181  - `category` — one of `"Session"`, `"Configuration"`, `"Tools & Skills"`, `"Info"`, `"Exit"`
182  - `aliases` — tuple of alternative names (e.g. `("bg",)`)
183  - `args_hint` — argument placeholder shown in help (e.g. `"<prompt>"`, `"[name]"`)
184  - `cli_only` — only available in the interactive CLI
185  - `gateway_only` — only available in messaging platforms
186  - `gateway_config_gate` — config dotpath (e.g. `"display.tool_progress_command"`); when set on a `cli_only` command, the command becomes available in the gateway if the config value is truthy. `GATEWAY_KNOWN_COMMANDS` always includes config-gated commands so the gateway can dispatch them; help/menus only show them when the gate is open.
187  
188  **Adding an alias** requires only adding it to the `aliases` tuple on the existing `CommandDef`. No other file changes needed — dispatch, help text, Telegram menu, Slack mapping, and autocomplete all update automatically.
189  
190  ---
191  
192  ## TUI Architecture (ui-tui + tui_gateway)
193  
194  The TUI is a full replacement for the classic (prompt_toolkit) CLI, activated via `hermes --tui` or `HERMES_TUI=1`.
195  
196  ### Process Model
197  
198  ```
199  hermes --tui
200    └─ Node (Ink)  ──stdio JSON-RPC──  Python (tui_gateway)
201         │                                  └─ AIAgent + tools + sessions
202         └─ renders transcript, composer, prompts, activity
203  ```
204  
205  TypeScript owns the screen. Python owns sessions, tools, model calls, and slash command logic.
206  
207  ### Transport
208  
209  Newline-delimited JSON-RPC over stdio. Requests from Ink, events from Python. See `tui_gateway/server.py` for the full method/event catalog.
210  
211  ### Key Surfaces
212  
213  | Surface | Ink component | Gateway method |
214  |---------|---------------|----------------|
215  | Chat streaming | `app.tsx` + `messageLine.tsx` | `prompt.submit` → `message.delta/complete` |
216  | Tool activity | `thinking.tsx` | `tool.start/progress/complete` |
217  | Approvals | `prompts.tsx` | `approval.respond` ← `approval.request` |
218  | Clarify/sudo/secret | `prompts.tsx`, `maskedPrompt.tsx` | `clarify/sudo/secret.respond` |
219  | Session picker | `sessionPicker.tsx` | `session.list/resume` |
220  | Slash commands | Local handler + fallthrough | `slash.exec` → `_SlashWorker`, `command.dispatch` |
221  | Completions | `useCompletion` hook | `complete.slash`, `complete.path` |
222  | Theming | `theme.ts` + `branding.tsx` | `gateway.ready` with skin data |
223  
224  ### Slash Command Flow
225  
226  1. Built-in client commands (`/help`, `/quit`, `/clear`, `/resume`, `/copy`, `/paste`, etc.) handled locally in `app.tsx`
227  2. Everything else → `slash.exec` (runs in persistent `_SlashWorker` subprocess) → `command.dispatch` fallback
228  
229  ### Dev Commands
230  
231  ```bash
232  cd ui-tui
233  npm install       # first time
234  npm run dev       # watch mode (rebuilds hermes-ink + tsx --watch)
235  npm start         # production
236  npm run build     # full build (hermes-ink + tsc)
237  npm run type-check # typecheck only (tsc --noEmit)
238  npm run lint      # eslint
239  npm run fmt       # prettier
240  npm test          # vitest
241  ```
242  
243  ### TUI in the Dashboard (`hermes dashboard` → `/chat`)
244  
245  The dashboard embeds the real `hermes --tui` — **not** a rewrite.  See `hermes_cli/pty_bridge.py` + the `@app.websocket("/api/pty")` endpoint in `hermes_cli/web_server.py`.
246  
247  - Browser loads `web/src/pages/ChatPage.tsx`, which mounts xterm.js's `Terminal` with the WebGL renderer, `@xterm/addon-fit` for container-driven resize, and `@xterm/addon-unicode11` for modern wide-character widths.
248  - `/api/pty?token=…` upgrades to a WebSocket; auth uses the same ephemeral `_SESSION_TOKEN` as REST, via query param (browsers can't set `Authorization` on WS upgrade).
249  - The server spawns whatever `hermes --tui` would spawn, through `ptyprocess` (POSIX PTY — WSL works, native Windows does not).
250  - Frames: raw PTY bytes each direction; resize via `\x1b[RESIZE:<cols>;<rows>]` intercepted on the server and applied with `TIOCSWINSZ`.
251  
252  **Do not re-implement the primary chat experience in React.** The main transcript, composer/input flow (including slash-command behavior), and PTY-backed terminal belong to the embedded `hermes --tui` — anything new you add to Ink shows up in the dashboard automatically. If you find yourself rebuilding the transcript or composer for the dashboard, stop and extend Ink instead.
253  
254  **Structured React UI around the TUI is allowed when it is not a second chat surface.** Sidebar widgets, inspectors, summaries, status panels, and similar supporting views (e.g. `ChatSidebar`, `ModelPickerDialog`, `ToolCall`) are fine when they complement the embedded TUI rather than replacing the transcript / composer / terminal. Keep their state independent of the PTY child's session and surface their failures non-destructively so the terminal pane keeps working unimpaired.
255  
256  ---
257  
258  ## Adding New Tools
259  
260  For most custom or local-only tools, do **not** edit Hermes core. Use the plugin
261  route instead: create `~/.hermes/plugins/<name>/plugin.yaml` and
262  `~/.hermes/plugins/<name>/__init__.py`, then register tools with
263  `ctx.register_tool(...)`. Plugin toolsets are discovered automatically and can be
264  enabled or disabled without touching `tools/` or `toolsets.py`.
265  
266  Use the built-in route below only when the user is explicitly contributing a new
267  core Hermes tool that should ship in the base system.
268  
269  Built-in/core tools require changes in **2 files**:
270  
271  **1. Create `tools/your_tool.py`:**
272  ```python
273  import json, os
274  from tools.registry import registry
275  
276  def check_requirements() -> bool:
277      return bool(os.getenv("EXAMPLE_API_KEY"))
278  
279  def example_tool(param: str, task_id: str = None) -> str:
280      return json.dumps({"success": True, "data": "..."})
281  
282  registry.register(
283      name="example_tool",
284      toolset="example",
285      schema={"name": "example_tool", "description": "...", "parameters": {...}},
286      handler=lambda args, **kw: example_tool(param=args.get("param", ""), task_id=kw.get("task_id")),
287      check_fn=check_requirements,
288      requires_env=["EXAMPLE_API_KEY"],
289  )
290  ```
291  
292  **2. Add to `toolsets.py`** — either `_HERMES_CORE_TOOLS` (all platforms) or a new toolset.
293  
294  Auto-discovery: any `tools/*.py` file with a top-level `registry.register()` call is imported automatically — no manual import list to maintain.
295  
296  The registry handles schema collection, dispatch, availability checking, and error wrapping. All handlers MUST return a JSON string.
297  
298  **Path references in tool schemas**: If the schema description mentions file paths (e.g. default output directories), use `display_hermes_home()` to make them profile-aware. The schema is generated at import time, which is after `_apply_profile_override()` sets `HERMES_HOME`.
299  
300  **State files**: If a tool stores persistent state (caches, logs, checkpoints), use `get_hermes_home()` for the base directory — never `Path.home() / ".hermes"`. This ensures each profile gets its own state.
301  
302  **Agent-level tools** (todo, memory): intercepted by `run_agent.py` before `handle_function_call()`. See `tools/todo_tool.py` for the pattern.
303  
304  ---
305  
306  ## Adding Configuration
307  
308  ### config.yaml options:
309  1. Add to `DEFAULT_CONFIG` in `hermes_cli/config.py`
310  2. Bump `_config_version` (check the current value at the top of `DEFAULT_CONFIG`)
311     ONLY if you need to actively migrate/transform existing user config
312     (renaming keys, changing structure). Adding a new key to an existing
313     section is handled automatically by the deep-merge and does NOT require
314     a version bump.
315  
316  ### .env variables (SECRETS ONLY — API keys, tokens, passwords):
317  1. Add to `OPTIONAL_ENV_VARS` in `hermes_cli/config.py` with metadata:
318  ```python
319  "NEW_API_KEY": {
320      "description": "What it's for",
321      "prompt": "Display name",
322      "url": "https://...",
323      "password": True,
324      "category": "tool",  # provider, tool, messaging, setting
325  },
326  ```
327  
328  Non-secret settings (timeouts, thresholds, feature flags, paths, display
329  preferences) belong in `config.yaml`, not `.env`. If internal code needs an
330  env var mirror for backward compatibility, bridge it from `config.yaml` to
331  the env var in code (see `gateway_timeout`, `terminal.cwd` → `TERMINAL_CWD`).
332  
333  ### Config loaders (three paths — know which one you're in):
334  
335  | Loader | Used by | Location |
336  |--------|---------|----------|
337  | `load_cli_config()` | CLI mode | `cli.py` — merges CLI-specific defaults + user YAML |
338  | `load_config()` | `hermes tools`, `hermes setup`, most CLI subcommands | `hermes_cli/config.py` — merges `DEFAULT_CONFIG` + user YAML |
339  | Direct YAML load | Gateway runtime | `gateway/run.py` + `gateway/config.py` — reads user YAML raw |
340  
341  If you add a new key and the CLI sees it but the gateway doesn't (or vice
342  versa), you're on the wrong loader. Check `DEFAULT_CONFIG` coverage.
343  
344  ### Working directory:
345  - **CLI** — uses the process's current directory (`os.getcwd()`).
346  - **Messaging** — uses `terminal.cwd` from `config.yaml`. The gateway bridges this
347    to the `TERMINAL_CWD` env var for child tools. **`MESSAGING_CWD` has been
348    removed** — the config loader prints a deprecation warning if it's set in
349    `.env`. Same for `TERMINAL_CWD` in `.env`; the canonical setting is
350    `terminal.cwd` in `config.yaml`.
351  
352  ---
353  
354  ## Skin/Theme System
355  
356  The skin engine (`hermes_cli/skin_engine.py`) provides data-driven CLI visual customization. Skins are **pure data** — no code changes needed to add a new skin.
357  
358  ### Architecture
359  
360  ```
361  hermes_cli/skin_engine.py    # SkinConfig dataclass, built-in skins, YAML loader
362  ~/.hermes/skins/*.yaml       # User-installed custom skins (drop-in)
363  ```
364  
365  - `init_skin_from_config()` — called at CLI startup, reads `display.skin` from config
366  - `get_active_skin()` — returns cached `SkinConfig` for the current skin
367  - `set_active_skin(name)` — switches skin at runtime (used by `/skin` command)
368  - `load_skin(name)` — loads from user skins first, then built-ins, then falls back to default
369  - Missing skin values inherit from the `default` skin automatically
370  
371  ### What skins customize
372  
373  | Element | Skin Key | Used By |
374  |---------|----------|---------|
375  | Banner panel border | `colors.banner_border` | `banner.py` |
376  | Banner panel title | `colors.banner_title` | `banner.py` |
377  | Banner section headers | `colors.banner_accent` | `banner.py` |
378  | Banner dim text | `colors.banner_dim` | `banner.py` |
379  | Banner body text | `colors.banner_text` | `banner.py` |
380  | Response box border | `colors.response_border` | `cli.py` |
381  | Spinner faces (waiting) | `spinner.waiting_faces` | `display.py` |
382  | Spinner faces (thinking) | `spinner.thinking_faces` | `display.py` |
383  | Spinner verbs | `spinner.thinking_verbs` | `display.py` |
384  | Spinner wings (optional) | `spinner.wings` | `display.py` |
385  | Tool output prefix | `tool_prefix` | `display.py` |
386  | Per-tool emojis | `tool_emojis` | `display.py` → `get_tool_emoji()` |
387  | Agent name | `branding.agent_name` | `banner.py`, `cli.py` |
388  | Welcome message | `branding.welcome` | `cli.py` |
389  | Response box label | `branding.response_label` | `cli.py` |
390  | Prompt symbol | `branding.prompt_symbol` | `cli.py` |
391  
392  ### Built-in skins
393  
394  - `default` — Classic Hermes gold/kawaii (the current look)
395  - `ares` — Crimson/bronze war-god theme with custom spinner wings
396  - `mono` — Clean grayscale monochrome
397  - `slate` — Cool blue developer-focused theme
398  
399  ### Adding a built-in skin
400  
401  Add to `_BUILTIN_SKINS` dict in `hermes_cli/skin_engine.py`:
402  
403  ```python
404  "mytheme": {
405      "name": "mytheme",
406      "description": "Short description",
407      "colors": { ... },
408      "spinner": { ... },
409      "branding": { ... },
410      "tool_prefix": "┊",
411  },
412  ```
413  
414  ### User skins (YAML)
415  
416  Users create `~/.hermes/skins/<name>.yaml`:
417  
418  ```yaml
419  name: cyberpunk
420  description: Neon-soaked terminal theme
421  
422  colors:
423    banner_border: "#FF00FF"
424    banner_title: "#00FFFF"
425    banner_accent: "#FF1493"
426  
427  spinner:
428    thinking_verbs: ["jacking in", "decrypting", "uploading"]
429    wings:
430      - ["⟨⚡", "⚡⟩"]
431  
432  branding:
433    agent_name: "Cyber Agent"
434    response_label: " ⚡ Cyber "
435  
436  tool_prefix: "▏"
437  ```
438  
439  Activate with `/skin cyberpunk` or `display.skin: cyberpunk` in config.yaml.
440  
441  ---
442  
443  ## Plugins
444  
445  Hermes has two plugin surfaces. Both live under `plugins/` in the repo so
446  repo-shipped plugins can be discovered alongside user-installed ones in
447  `~/.hermes/plugins/` and pip-installed entry points.
448  
449  ### General plugins (`hermes_cli/plugins.py` + `plugins/<name>/`)
450  
451  `PluginManager` discovers plugins from `~/.hermes/plugins/`, `./.hermes/plugins/`,
452  and pip entry points. Each plugin exposes a `register(ctx)` function that
453  can:
454  
455  - Register Python-callback lifecycle hooks:
456    `pre_tool_call`, `post_tool_call`, `pre_llm_call`, `post_llm_call`,
457    `on_session_start`, `on_session_end`
458  - Register new tools via `ctx.register_tool(...)`
459  - Register CLI subcommands via `ctx.register_cli_command(...)` — the
460    plugin's argparse tree is wired into `hermes` at startup so
461    `hermes <pluginname> <subcmd>` works with no change to `main.py`
462  
463  Hooks are invoked from `model_tools.py` (pre/post tool) and `run_agent.py`
464  (lifecycle). **Discovery timing pitfall:** `discover_plugins()` only runs
465  as a side effect of importing `model_tools.py`. Code paths that read plugin
466  state without importing `model_tools.py` first must call `discover_plugins()`
467  explicitly (it's idempotent).
468  
469  ### Memory-provider plugins (`plugins/memory/<name>/`)
470  
471  Separate discovery system for pluggable memory backends. Current built-in
472  providers include **honcho, mem0, supermemory, byterover, hindsight,
473  holographic, openviking, retaindb**.
474  
475  Each provider implements the `MemoryProvider` ABC (see `agent/memory_provider.py`)
476  and is orchestrated by `agent/memory_manager.py`. Lifecycle hooks include
477  `sync_turn(turn_messages)`, `prefetch(query)`, `shutdown()`, and optional
478  `post_setup(hermes_home, config)` for setup-wizard integration.
479  
480  **CLI commands via `plugins/memory/<name>/cli.py`:** if a memory plugin
481  defines `register_cli(subparser)`, `discover_plugin_cli_commands()` finds
482  it at argparse setup time and wires it into `hermes <plugin>`. The
483  framework only exposes CLI commands for the **currently active** memory
484  provider (read from `memory.provider` in config.yaml), so disabled
485  providers don't clutter `hermes --help`.
486  
487  **Rule (Teknium, May 2026):** plugins MUST NOT modify core files
488  (`run_agent.py`, `cli.py`, `gateway/run.py`, `hermes_cli/main.py`, etc.).
489  If a plugin needs a capability the framework doesn't expose, expand the
490  generic plugin surface (new hook, new ctx method) — never hardcode
491  plugin-specific logic into core. PR #5295 removed 95 lines of hardcoded
492  honcho argparse from `main.py` for exactly this reason.
493  
494  ### Dashboard / context-engine / image-gen plugin directories
495  
496  `plugins/context_engine/`, `plugins/image_gen/`, `plugins/example-dashboard/`,
497  etc. follow the same pattern (ABC + orchestrator + per-plugin directory).
498  Context engines plug into `agent/context_engine.py`; image-gen providers
499  into `agent/image_gen_provider.py`.
500  
501  ---
502  
503  ## Skills
504  
505  Two parallel surfaces:
506  
507  - **`skills/`** — built-in skills shipped and loadable by default.
508    Organized by category directories (e.g. `skills/github/`, `skills/mlops/`).
509  - **`optional-skills/`** — heavier or niche skills shipped with the repo but
510    NOT active by default. Installed explicitly via
511    `hermes skills install official/<category>/<skill>`. Adapter lives in
512    `tools/skills_hub.py` (`OptionalSkillSource`). Categories include
513    `autonomous-ai-agents`, `blockchain`, `communication`, `creative`,
514    `devops`, `email`, `health`, `mcp`, `migration`, `mlops`, `productivity`,
515    `research`, `security`, `web-development`.
516  
517  When reviewing skill PRs, check which directory they target — heavy-dep or
518  niche skills belong in `optional-skills/`.
519  
520  ### SKILL.md frontmatter
521  
522  Standard fields: `name`, `description`, `version`, `platforms`
523  (OS-gating list: `[macos]`, `[linux, macos]`, ...),
524  `metadata.hermes.tags`, `metadata.hermes.category`,
525  `metadata.hermes.config` (config.yaml settings the skill needs — stored
526  under `skills.config.<key>`, prompted during setup, injected at load time).
527  
528  ---
529  
530  ## Important Policies
531  
532  ### Prompt Caching Must Not Break
533  
534  Hermes-Agent ensures caching remains valid throughout a conversation. **Do NOT implement changes that would:**
535  - Alter past context mid-conversation
536  - Change toolsets mid-conversation
537  - Reload memories or rebuild system prompts mid-conversation
538  
539  Cache-breaking forces dramatically higher costs. The ONLY time we alter context is during context compression.
540  
541  Slash commands that mutate system-prompt state (skills, tools, memory, etc.)
542  must be **cache-aware**: default to deferred invalidation (change takes
543  effect next session), with an opt-in `--now` flag for immediate
544  invalidation. See `/skills install --now` for the canonical pattern.
545  
546  ### Background Process Notifications (Gateway)
547  
548  When `terminal(background=true, notify_on_complete=true)` is used, the gateway runs a watcher that
549  detects process completion and triggers a new agent turn. Control verbosity of background process
550  messages with `display.background_process_notifications`
551  in config.yaml (or `HERMES_BACKGROUND_NOTIFICATIONS` env var):
552  
553  - `all` — running-output updates + final message (default)
554  - `result` — only the final completion message
555  - `error` — only the final message when exit code != 0
556  - `off` — no watcher messages at all
557  
558  ---
559  
560  ## Profiles: Multi-Instance Support
561  
562  Hermes supports **profiles** — multiple fully isolated instances, each with its own
563  `HERMES_HOME` directory (config, API keys, memory, sessions, skills, gateway, etc.).
564  
565  The core mechanism: `_apply_profile_override()` in `hermes_cli/main.py` sets
566  `HERMES_HOME` before any module imports. All `get_hermes_home()` references
567  automatically scope to the active profile.
568  
569  ### Rules for profile-safe code
570  
571  1. **Use `get_hermes_home()` for all HERMES_HOME paths.** Import from `hermes_constants`.
572     NEVER hardcode `~/.hermes` or `Path.home() / ".hermes"` in code that reads/writes state.
573     ```python
574     # GOOD
575     from hermes_constants import get_hermes_home
576     config_path = get_hermes_home() / "config.yaml"
577  
578     # BAD — breaks profiles
579     config_path = Path.home() / ".hermes" / "config.yaml"
580     ```
581  
582  2. **Use `display_hermes_home()` for user-facing messages.** Import from `hermes_constants`.
583     This returns `~/.hermes` for default or `~/.hermes/profiles/<name>` for profiles.
584     ```python
585     # GOOD
586     from hermes_constants import display_hermes_home
587     print(f"Config saved to {display_hermes_home()}/config.yaml")
588  
589     # BAD — shows wrong path for profiles
590     print("Config saved to ~/.hermes/config.yaml")
591     ```
592  
593  3. **Module-level constants are fine** — they cache `get_hermes_home()` at import time,
594     which is AFTER `_apply_profile_override()` sets the env var. Just use `get_hermes_home()`,
595     not `Path.home() / ".hermes"`.
596  
597  4. **Tests that mock `Path.home()` must also set `HERMES_HOME`** — since code now uses
598     `get_hermes_home()` (reads env var), not `Path.home() / ".hermes"`:
599     ```python
600     with patch.object(Path, "home", return_value=tmp_path), \
601          patch.dict(os.environ, {"HERMES_HOME": str(tmp_path / ".hermes")}):
602         ...
603     ```
604  
605  5. **Gateway platform adapters should use token locks** — if the adapter connects with
606     a unique credential (bot token, API key), call `acquire_scoped_lock()` from
607     `gateway.status` in the `connect()`/`start()` method and `release_scoped_lock()` in
608     `disconnect()`/`stop()`. This prevents two profiles from using the same credential.
609     See `gateway/platforms/telegram.py` for the canonical pattern.
610  
611  6. **Profile operations are HOME-anchored, not HERMES_HOME-anchored** — `_get_profiles_root()`
612     returns `Path.home() / ".hermes" / "profiles"`, NOT `get_hermes_home() / "profiles"`.
613     This is intentional — it lets `hermes -p coder profile list` see all profiles regardless
614     of which one is active.
615  
616  ## Known Pitfalls
617  
618  ### DO NOT hardcode `~/.hermes` paths
619  Use `get_hermes_home()` from `hermes_constants` for code paths. Use `display_hermes_home()`
620  for user-facing print/log messages. Hardcoding `~/.hermes` breaks profiles — each profile
621  has its own `HERMES_HOME` directory. This was the source of 5 bugs fixed in PR #3575.
622  
623  ### DO NOT introduce new `simple_term_menu` usage
624  Existing call sites in `hermes_cli/main.py` remain for legacy fallback only;
625  the preferred UI is curses (stdlib) because `simple_term_menu` has
626  ghost-duplication rendering bugs in tmux/iTerm2 with arrow keys. New
627  interactive menus must use `hermes_cli/curses_ui.py` — see
628  `hermes_cli/tools_config.py` for the canonical pattern.
629  
630  ### DO NOT use `\033[K` (ANSI erase-to-EOL) in spinner/display code
631  Leaks as literal `?[K` text under `prompt_toolkit`'s `patch_stdout`. Use space-padding: `f"\r{line}{' ' * pad}"`.
632  
633  ### `_last_resolved_tool_names` is a process-global in `model_tools.py`
634  `_run_single_child()` in `delegate_tool.py` saves and restores this global around subagent execution. If you add new code that reads this global, be aware it may be temporarily stale during child agent runs.
635  
636  ### DO NOT hardcode cross-tool references in schema descriptions
637  Tool schema descriptions must not mention tools from other toolsets by name (e.g., `browser_navigate` saying "prefer web_search"). Those tools may be unavailable (missing API keys, disabled toolset), causing the model to hallucinate calls to non-existent tools. If a cross-reference is needed, add it dynamically in `get_tool_definitions()` in `model_tools.py` — see the `browser_navigate` / `execute_code` post-processing blocks for the pattern.
638  
639  ### The gateway has TWO message guards — both must bypass approval/control commands
640  When an agent is running, messages pass through two sequential guards:
641  (1) **base adapter** (`gateway/platforms/base.py`) queues messages in
642  `_pending_messages` when `session_key in self._active_sessions`, and
643  (2) **gateway runner** (`gateway/run.py`) intercepts `/stop`, `/new`,
644  `/queue`, `/status`, `/approve`, `/deny` before they reach
645  `running_agent.interrupt()`. Any new command that must reach the runner
646  while the agent is blocked (e.g. approval prompts) MUST bypass BOTH
647  guards and be dispatched inline, not via `_process_message_background()`
648  (which races session lifecycle).
649  
650  ### Squash merges from stale branches silently revert recent fixes
651  Before squash-merging a PR, ensure the branch is up to date with `main`
652  (`git fetch origin main && git reset --hard origin/main` in the worktree,
653  then re-apply the PR's commits). A stale branch's version of an unrelated
654  file will silently overwrite recent fixes on main when squashed. Verify
655  with `git diff HEAD~1..HEAD` after merging — unexpected deletions are a
656  red flag.
657  
658  ### Don't wire in dead code without E2E validation
659  Unused code that was never shipped was dead for a reason. Before wiring an
660  unused module into a live code path, E2E test the real resolution chain
661  with actual imports (not mocks) against a temp `HERMES_HOME`.
662  
663  ### Tests must not write to `~/.hermes/`
664  The `_isolate_hermes_home` autouse fixture in `tests/conftest.py` redirects `HERMES_HOME` to a temp dir. Never hardcode `~/.hermes/` paths in tests.
665  
666  **Profile tests**: When testing profile features, also mock `Path.home()` so that
667  `_get_profiles_root()` and `_get_default_hermes_home()` resolve within the temp dir.
668  Use the pattern from `tests/hermes_cli/test_profiles.py`:
669  ```python
670  @pytest.fixture
671  def profile_env(tmp_path, monkeypatch):
672      home = tmp_path / ".hermes"
673      home.mkdir()
674      monkeypatch.setattr(Path, "home", lambda: tmp_path)
675      monkeypatch.setenv("HERMES_HOME", str(home))
676      return home
677  ```
678  
679  ---
680  
681  ## Testing
682  
683  **ALWAYS use `scripts/run_tests.sh`** — do not call `pytest` directly. The script enforces
684  hermetic environment parity with CI (unset credential vars, TZ=UTC, LANG=C.UTF-8,
685  4 xdist workers matching GHA ubuntu-latest). Direct `pytest` on a 16+ core
686  developer machine with API keys set diverges from CI in ways that have caused
687  multiple "works locally, fails in CI" incidents (and the reverse).
688  
689  ```bash
690  scripts/run_tests.sh                                  # full suite, CI-parity
691  scripts/run_tests.sh tests/gateway/                   # one directory
692  scripts/run_tests.sh tests/agent/test_foo.py::test_x  # one test
693  scripts/run_tests.sh -v --tb=long                     # pass-through pytest flags
694  ```
695  
696  ### Why the wrapper (and why the old "just call pytest" doesn't work)
697  
698  Five real sources of local-vs-CI drift the script closes:
699  
700  | | Without wrapper | With wrapper |
701  |---|---|---|
702  | Provider API keys | Whatever is in your env (auto-detects pool) | All `*_API_KEY`/`*_TOKEN`/etc. unset |
703  | HOME / `~/.hermes/` | Your real config+auth.json | Temp dir per test |
704  | Timezone | Local TZ (PDT etc.) | UTC |
705  | Locale | Whatever is set | C.UTF-8 |
706  | xdist workers | `-n auto` = all cores (20+ on a workstation) | `-n 4` matching CI |
707  
708  `tests/conftest.py` also enforces points 1-4 as an autouse fixture so ANY pytest
709  invocation (including IDE integrations) gets hermetic behavior — but the wrapper
710  is belt-and-suspenders.
711  
712  ### Running without the wrapper (only if you must)
713  
714  If you can't use the wrapper (e.g. on Windows or inside an IDE that shells
715  pytest directly), at minimum activate the venv and pass `-n 4`:
716  
717  ```bash
718  source .venv/bin/activate   # or: source venv/bin/activate
719  python -m pytest tests/ -q -n 4
720  ```
721  
722  Worker count above 4 will surface test-ordering flakes that CI never sees.
723  
724  Always run the full suite before pushing changes.
725  
726  ### Don't write change-detector tests
727  
728  A test is a **change-detector** if it fails whenever data that is **expected
729  to change** gets updated — model catalogs, config version numbers,
730  enumeration counts, hardcoded lists of provider models. These tests add no
731  behavioral coverage; they just guarantee that routine source updates break
732  CI and cost engineering time to "fix."
733  
734  **Do not write:**
735  
736  ```python
737  # catalog snapshot — breaks every model release
738  assert "gemini-2.5-pro" in _PROVIDER_MODELS["gemini"]
739  assert "MiniMax-M2.7" in models
740  
741  # config version literal — breaks every schema bump
742  assert DEFAULT_CONFIG["_config_version"] == 21
743  
744  # enumeration count — breaks every time a skill/provider is added
745  assert len(_PROVIDER_MODELS["huggingface"]) == 8
746  ```
747  
748  **Do write:**
749  
750  ```python
751  # behavior: does the catalog plumbing work at all?
752  assert "gemini" in _PROVIDER_MODELS
753  assert len(_PROVIDER_MODELS["gemini"]) >= 1
754  
755  # behavior: does migration bump the user's version to current latest?
756  assert raw["_config_version"] == DEFAULT_CONFIG["_config_version"]
757  
758  # invariant: no plan-only model leaks into the legacy list
759  assert not (set(moonshot_models) & coding_plan_only_models)
760  
761  # invariant: every model in the catalog has a context-length entry
762  for m in _PROVIDER_MODELS["huggingface"]:
763      assert m.lower() in DEFAULT_CONTEXT_LENGTHS_LOWER
764  ```
765  
766  The rule: if the test reads like a snapshot of current data, delete it. If
767  it reads like a contract about how two pieces of data must relate, keep it.
768  When a PR adds a new provider/model and you want a test, make the test
769  assert the relationship (e.g. "catalog entries all have context lengths"),
770  not the specific names.
771  
772  Reviewers should reject new change-detector tests; authors should convert
773  them into invariants before re-requesting review.