# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## STOP — Read This First (MANDATORY)

**"Do it the hard way, it's easier."**

Before writing ANY code, you MUST follow the full BMAD/OpenSpec agentic process:

1. **Discovery** — Analyze the problem. Read existing specs, architecture, and code. Identify gaps. Produce a discovery brief.
2. **Planning** — Create/update epics and stories with acceptance criteria. Each story must be small enough for one implementation session.
3. **Architecture** — Write ADRs for significant decisions. Update `_bmad/architecture.md`. Consider security, performance, and failure modes.
4. **Design** — For user-facing changes, update UX specs. Consider all states and edge cases.
5. **Generate** — NOW write code. Code must satisfy the spec requirements, not the other way around.
6. **Evaluate** — Test everything. E2E browser automation, WebSocket tests, image recognition for UI. Iterate until ALL acceptance criteria pass. Do NOT declare "complete" until you have evidence.
7. **Reconcile** — Update specs to match reality (with rationale for any divergence). Update traceability, status, changelog.

**DO NOT skip steps. DO NOT declare victory without testing. DO NOT move fast and leave untested code behind.** Every shortcut adds tech debt to a system the family depends on. If you find yourself writing code before specs exist, STOP and go back to step 1.

When using subagents for BMAD roles, wait for their artifacts before proceeding to implementation. The artifacts ARE the guardrails.

## Development Workflow (MANDATORY)

This project uses **spec-anchored development** (BMAD + OpenSpec). Every code change follows:

1. **Spec First** — Update `openspec/capabilities/*/spec.md` with new REQ-* and SCENARIO-*. Create/update story in `epics/stories/`.
2. **Write Tests** — Tests reference REQ-* and SCENARIO-* in comments.
3. **Implement** — Code to satisfy spec requirements.
4. **Verify** — Run unit tests, type checks, builds per commands below.
5. **E2E Verify (MANDATORY)** — Run end-to-end tests per `ops/e2e-test-plan.md`. All changes derived from user instruction MUST be verified E2E before reporting done. See E2E Testing below.
6. **Reconcile Specs** — Update Implementation Status in spec.md. Update story status. Update `_bmad/traceability.md` impl status column. If implementation diverged from spec, update spec to match reality with rationale.
7. **Update Ops** — Update `ops/status.md` (what's working/next) and `ops/changelog.md` (what you did).

Never leave specs and code disagreeing silently.
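
One way to keep specs and tests from drifting apart is to scan both for requirement IDs and report any that no test mentions. The following is a minimal sketch, not part of this repository: it assumes IDs look like `REQ-AUTH-001` or `SCENARIO-LOGIN-2` (the exact ID format is not defined here), and `collect_ids` and `untested_ids` are hypothetical helper names.

```python
import re
from pathlib import Path

# Assumed ID shape: REQ-<parts> or SCENARIO-<parts>, e.g. REQ-AUTH-001.
# The real ID format used by the specs may differ.
ID_PATTERN = re.compile(r"\b(?:REQ|SCENARIO)-[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*")


def collect_ids(paths):
    """Return the set of REQ-*/SCENARIO-* IDs mentioned in the given files."""
    found = set()
    for path in paths:
        text = Path(path).read_text(encoding="utf-8")
        found.update(ID_PATTERN.findall(text))
    return found


def untested_ids(spec_paths, test_paths):
    """Return IDs that appear in specs but in none of the test files."""
    return sorted(collect_ids(spec_paths) - collect_ids(test_paths))
```

Called with the `openspec/capabilities/*/spec.md` files and the project's test files, `untested_ids` returns the IDs that still lack a test reference. An empty list means every spec ID is mentioned in at least one test.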

### Architecture Freshness Check

If the "Last Reconciled" date in `_bmad/architecture.md` is more than 30 days old, flag it to the user before starting new capability work.
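
The 30-day check can be automated. This is a minimal sketch, assuming the date is written in ISO form on a line such as `Last Reconciled: 2025-01-31` (the actual format in `_bmad/architecture.md` is not shown here). `architecture_is_stale` is a hypothetical helper, and treating a missing date as stale is a choice made for this example.

```python
import re
from datetime import date, timedelta

MAX_AGE = timedelta(days=30)
# Assumed format: the words "Last Reconciled" followed by an ISO date.
LAST_RECONCILED = re.compile(r"Last Reconciled\W*(\d{4}-\d{2}-\d{2})")


def architecture_is_stale(text, today):
    """True if the Last Reconciled date is missing or more than 30 days old."""
    match = LAST_RECONCILED.search(text)
    if match is None:
        return True  # assumption: no date found counts as stale
    reconciled = date.fromisoformat(match.group(1))
    return today - reconciled > MAX_AGE
```

In practice the text would come from reading `_bmad/architecture.md` and `today` from `date.today()`.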

## E2E Testing (MANDATORY)

**Every change derived from user instruction must be verified end-to-end.** This means:

- **Web applications**: Browser automation (Playwright or equivalent) against the deployed system
- **Mobile applications**: Mobile emulator testing (mobile-mcp or equivalent)
- **Backend services / drivers**: Integration tests against running system instances with real protocol exchanges
- **DNS resolution**: When testing against running systems, use proper DNS names when feasible (not just localhost)

E2E tests must exercise the full deployed stack, not just unit tests against mocked dependencies. The test plan lives at `ops/e2e-test-plan.md` and results are documented at `ops/test-results.md`.
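
A test run can guard against accidentally targeting localhost. This is a minimal sketch, assuming the E2E suite receives its target as a base URL. `uses_loopback` is a hypothetical helper, not part of Playwright or this repository.

```python
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def uses_loopback(base_url):
    """True if the URL points at a loopback host or has no host at all."""
    host = urlparse(base_url).hostname
    return host is None or host in LOOPBACK_HOSTS
```

An E2E setup step could call `uses_loopback(base_url)` and print a warning rather than fail, since the guidance above applies only when a DNS name is feasible.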

## Build / Test / Deploy

```bash
# {{FILL: build commands}}
# {{FILL: test commands — unit}}
# {{FILL: test commands — E2E}}
# {{FILL: lint/type-check commands}}
# {{FILL: deploy commands}}
```

## Agentic Harness

This project uses a **context-reset architecture** with discrete BMAD agent roles. No two roles share a context window.

| BMAD Role | Agent | Context | Config |
|-----------|-------|---------|--------|
| Analyst (Mary) | Discovery | Fresh per analysis | `.harness/prompts/discovery.md` |
| PM (John) | Planner | Fresh per planning cycle | `.harness/prompts/planner.md` |
| Architect (Winston) | Architect | Fresh per design decision | `.harness/prompts/architect.md` |
| UX Designer (Sally) | Design | Fresh per design task | `.harness/prompts/design.md` |
| Developer (Amelia) | Generator | Fresh per story | `.harness/prompts/generator.md` |
| QA (Quinn) | Evaluator | Fresh per evaluation | `.harness/prompts/evaluator.md` |
| Scrum Master (Bob) | Orchestrator | Stateless script | `scripts/orchestrate.py` |

- **Harness config**: `.harness/config.yaml` (agent models, tools, budgets, evaluation criteria)
- **Handoff artifacts**: `.harness/handoffs/` (YAML state passed between agents)
- **Sprint contracts**: `.harness/contracts/` (measurable criteria binding generator + evaluator)
- **Evaluation reports**: `.harness/evaluations/` (independent QA verdicts)

See `.harness/prompts/*.md` for each agent's full role definition and `_bmad/workflow.md` for the orchestration loop.
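
The context-reset idea can be illustrated in a few lines. This sketch is not the contents of `scripts/orchestrate.py`: `Handoff`, `run_pipeline`, and the strictly linear role order are assumptions for illustration. A real loop would presumably also persist each handoff under `.harness/handoffs/` and iterate between generator and evaluator.

```python
from dataclasses import dataclass, field

# Agent names taken from the table above; the orchestrator is the loop itself.
ROLES = ["discovery", "planner", "architect", "design", "generator", "evaluator"]


@dataclass
class Handoff:
    """State passed between agents; nothing else crosses a context reset."""
    story: str
    artifacts: dict = field(default_factory=dict)


def run_pipeline(story, agents):
    """Call each role once, passing only the handoff object forward."""
    handoff = Handoff(story=story)
    for role in ROLES:
        # Each agent is a plain callable here (hypothetical stand-in for a
        # fresh-context agent run); it receives the handoff and nothing else.
        handoff.artifacts[role] = agents[role](handoff)
    return handoff
```

Because every agent receives only the `Handoff`, anything a later role needs must be written into an artifact by an earlier one, which is the property that makes the artifacts act as guardrails.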

## Session Metrics (MANDATORY)

Track execution time and token consumption every turn:

1. **Turn start**: Run `date -u +"%Y-%m-%dT%H:%M:%SZ"` at start of each response
2. **Turn end**: Run `date -u +"%Y-%m-%dT%H:%M:%SZ"` right before responding to user
3. **Log both** in `ops/metrics.md` turn log table
4. **Subagent metrics**: Record tokens and duration from agent result metadata
5. **On context compaction or session end**: Run `python3 scripts/session-metrics.py` to extract authoritative token counts and costs from the session JSONL, then update `ops/metrics.md` Session Summary
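
A turn log entry can be built from the two timestamps with a small helper. This is a minimal sketch, assuming the turn log table in `ops/metrics.md` has turn, start, end, and duration columns (the real column layout is not shown here). `utc_stamp` and `turn_row` are hypothetical names.

```python
from datetime import datetime, timezone

STAMP = "%Y-%m-%dT%H:%M:%SZ"  # same format as the `date -u` command above


def utc_stamp(moment=None):
    """Format a moment (default: now) as a UTC timestamp string."""
    moment = moment or datetime.now(timezone.utc)
    return moment.astimezone(timezone.utc).strftime(STAMP)


def turn_row(turn, start, end):
    """Build one markdown table row with the turn duration in seconds."""
    started = datetime.strptime(start, STAMP)
    ended = datetime.strptime(end, STAMP)
    seconds = int((ended - started).total_seconds())
    # Assumed columns: turn | start | end | duration (s)
    return f"| {turn} | {start} | {end} | {seconds} |"
```

The returned row can be appended to the turn log table. Token counts still come from `scripts/session-metrics.py`, which this sketch does not replace.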

## Build Environment

<!-- {{FILL: WSL quirks, nvm, toolchain notes}} -->

## Key Paths

| What | Where |
|------|-------|
| BMAD strategic docs | `_bmad/` |
| BMAD agent roles | `_bmad/agents/` |
| BMAD workflow | `_bmad/workflow.md` |
| Capability specs | `openspec/capabilities/*/spec.md` |
| Capability designs | `openspec/capabilities/*/design.md` |
| OpenSpec agent guide | `openspec/AGENTS.md` |
| Project conventions | `openspec/project.md` |
| Change proposals | `openspec/change-proposals/` |
| Epics & stories | `epics/` |
| Harness config | `.harness/config.yaml` |
| Agent prompts | `.harness/prompts/` |
| Handoff artifacts | `.harness/handoffs/` |
| Sprint contracts | `.harness/contracts/` |
| Evaluation reports | `.harness/evaluations/` |
| Orchestration loop | `scripts/orchestrate.py` |
| Operational status | `ops/status.md` |
| Server & credentials | `ops/server.md` |
| Work log | `ops/changelog.md` |
| Known issues | `ops/known-issues.md` |
| E2E test plan | `ops/e2e-test-plan.md` |
| Test results | `ops/test-results.md` |

## When to Read Deeper

- **Before starting a new capability**: Read the relevant `openspec/capabilities/*/spec.md` and `design.md`
- **Before deploying or debugging server issues**: Read `ops/server.md` and `ops/known-issues.md`
- **Before architectural decisions or adding new components**: Read `_bmad/architecture.md`
- **To understand project scope or requirements**: Read `_bmad/prd.md`
- **To check what's built vs. spec'd**: Read `_bmad/traceability.md` (has implementation status per FR)
- **Before reporting work as done**: Read `ops/e2e-test-plan.md` and execute relevant E2E tests