# CLAUDE.md: Agent Coding Guidelines

Python 3.11+ · Ruff · mypy --strict · pytest · uv

---

## Architecture

```
cli/           → entry points only (arg parsing, logging setup)
pipeline/      → orchestration: calls services in sequence, no business logic
services/      → domain logic (coaching, scoring, outreach strategy)
integrations/  → one file per external API (brightdata.py, notion.py, anthropic.py)
models/        → Pydantic schemas only, no logic
prompts/       → LLM prompt templates (.txt / .jinja2), not inline f-strings
config.py      → pydantic-settings backed by .env
exceptions.py  → custom exception hierarchy
```

**Layer rules:**
- Services depend on Protocols, never on integration classes directly.
- No API or Notion calls inside scoring or coaching logic.
- No business logic in CLI files.
- One class = one responsibility. >5 unrelated public methods → split it.

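The first layer rule can be sketched as follows. This is a minimal illustration, not code from the repository: `JobSource`, `ScoringService`, and `InMemoryJobSource` are hypothetical names, and a real implementation of the Protocol would live in `integrations/`. The service only sees the Protocol, so it can be exercised with an in-memory double.

```python
from dataclasses import dataclass
from typing import Protocol


class JobSource(Protocol):
    """Hypothetical port that a service depends on instead of a concrete client."""

    def fetch_titles(self, query: str) -> list[str]: ...


@dataclass
class ScoringService:
    """Domain logic only: no HTTP, no Notion, no CLI concerns."""

    source: JobSource

    def count_matches(self, query: str, keyword: str) -> int:
        """Count fetched titles that contain the keyword, ignoring case."""
        titles = self.source.fetch_titles(query)
        return sum(1 for title in titles if keyword.lower() in title.lower())


class InMemoryJobSource:
    """Test double that satisfies JobSource structurally, without inheritance."""

    def fetch_titles(self, query: str) -> list[str]:
        return ["Senior Python Engineer", "Data Analyst", "Python Developer"]


service = ScoringService(source=InMemoryJobSource())
match_count = service.count_matches("engineering", "python")
```
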
**Pre-push hook location:** `.githooks/pre-push` (not `.git/hooks/`).
Activated via `git config core.hooksPath .githooks`.

**Bandit path maintenance:** When adding a new top-level module, add it to
the bandit path list in `.githooks/pre-push`. Bandit does not support
`targets` in `pyproject.toml`, so paths are specified as CLI arguments.

**Vulture whitelist:** When adding Pydantic models with `model_config`,
`@field_validator`/`@model_validator`, or new Protocol methods, check
`uv run vulture`. If it flags false positives, add entries to
`vulture_whitelist.py`.

---

## Configuration & Secrets

- All config lives in `config.py` using `pydantic-settings` + `.env`.
- No hardcoded URLs, model names, API keys, sleep durations, thresholds, or magic numbers.
- Magic numbers → named constants with a comment explaining the value.
- Validate all env vars at startup. Missing required var → raise `ConfigurationError` with the var name.
- All required env vars documented in `.env.example`.

```python
from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Load values from .env; real environment variables take precedence.
    model_config = SettingsConfigDict(env_file=".env")

    anthropic_api_key: str
    brightdata_token: str
    notion_token: str | None = None
    llm_model: str = "claude-sonnet-4-6"
    polling_interval_seconds: int = 15


settings = Settings()
```

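The "fail at startup with the variable name" rule can be illustrated with the standard library alone. How the project maps pydantic validation failures to `ConfigurationError` is not shown in this file, so the sketch below is an assumption about the expected behaviour rather than the actual mechanism; `REQUIRED_VARS` and `load_required` are hypothetical names.

```python
from collections.abc import Mapping


class ConfigurationError(Exception):
    """Raised when required configuration is missing or invalid."""


# Hypothetical list mirroring the required (no-default) fields of Settings.
REQUIRED_VARS = ("ANTHROPIC_API_KEY", "BRIGHTDATA_TOKEN")


def load_required(env: Mapping[str, str]) -> dict[str, str]:
    """Return the required variables, failing fast on the first missing one."""
    values: dict[str, str] = {}
    for name in REQUIRED_VARS:
        # Treat an empty or whitespace-only value the same as a missing one.
        value = env.get(name, "").strip()
        if not value:
            raise ConfigurationError(f"Missing required environment variable: {name}")
        values[name] = value
    return values


try:
    load_required({"ANTHROPIC_API_KEY": "dummy"})
except ConfigurationError as exc:
    startup_error = str(exc)
```
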
---

## Error Handling & Exceptions

- Catch specific exceptions only; no bare `except:`.
- No silent failures. If you can't recover: log at ERROR + raise a domain exception.
- Custom exceptions live in `exceptions.py` (e.g., `APIError`, `ConfigurationError`, `ParseError`).
- Validate external data (API responses, JSON) before accessing keys: use `.get()` or parse via Pydantic.
- No `sys.exit()` in library code, only in CLI entry points.

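A minimal sketch of these rules applied to one parsing function. `AppError` as the root of the hierarchy and `extract_job_title` are hypothetical; only `ParseError` is named in this document. The function catches one specific exception, logs at ERROR, and re-raises as a domain exception instead of failing silently.

```python
import json
import logging

logger = logging.getLogger(__name__)


class AppError(Exception):
    """Hypothetical root of the hierarchy in exceptions.py."""


class ParseError(AppError):
    """Raised when external data does not have the expected shape."""


def extract_job_title(raw: str) -> str:
    """Parse an API payload and return its title, or raise ParseError."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Specific exception, logged, then converted to a domain exception.
        logger.error("Payload is not valid JSON: %s", exc)
        raise ParseError("payload is not valid JSON") from exc

    # Validate the shape before touching keys: the payload may not be a dict.
    title = payload.get("title") if isinstance(payload, dict) else None
    if not isinstance(title, str) or not title:
        logger.error("Payload has no usable 'title' field")
        raise ParseError("payload has no usable 'title' field")
    return title
```
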
---

## Pydantic Models

- One canonical definition per model in `models/`, split by domain (`models/job.py`, `models/profile.py`).
- Never redefine a model in a second place. Never use bare `dict` when a model exists.
- Use `model_validator` / `field_validator` for invariants.
- Use `ConfigDict(frozen=True)` for value objects.
- Use `@dataclass` for internal data containers that don't cross I/O boundaries.

---

## External API Clients

- Each API gets its own class in `integrations/`, implementing a Protocol for testability.
- Explicit timeouts on all HTTP calls, never open-ended.
- Retry transient failures (5xx, timeout) with `tenacity` exponential backoff. 4xx = permanent → don't retry.
- Log request method + URL at DEBUG; status code at INFO.
- Never log request/response bodies at INFO+ (may contain secrets or PII).

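The retry policy above (retry 5xx and timeouts, never 4xx) is sketched below with a hand-rolled loop so the example runs on the standard library alone. It is a stand-in for `tenacity`, not a replacement for it. The `APIError` shape, `is_transient`, and `call_with_backoff` are assumptions, and the default attempt count and delay are placeholders that would come from `settings` in project code.

```python
import time
from collections.abc import Callable


class APIError(Exception):
    """Assumed shape: carries the HTTP status code of the failed call."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"API call failed with status {status_code}")
        self.status_code = status_code


def is_transient(exc: Exception) -> bool:
    """5xx responses and timeouts are retryable; 4xx responses are permanent."""
    if isinstance(exc, TimeoutError):
        return True
    return isinstance(exc, APIError) and exc.status_code >= 500


def call_with_backoff(
    call: Callable[[], str],
    *,
    max_attempts: int = 3,  # placeholder; read from settings in real code
    base_delay_seconds: float = 0.5,  # placeholder; read from settings in real code
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Retry transient failures, doubling the delay after each failed attempt."""
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except (APIError, TimeoutError) as exc:
            # Permanent errors and the final attempt propagate unchanged.
            if not is_transient(exc) or attempt == max_attempts:
                raise
            sleep(base_delay_seconds * 2 ** (attempt - 1))
    raise ValueError("max_attempts must be at least 1")
```
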
---

## LLM Provider

All providers implement a common Protocol:

```python
from typing import Protocol


class LLMProvider(Protocol):
    def complete(
        self, system: str, user: str, *, temperature: float = 0.0, seed: int | None = None
    ) -> str: ...
```

- Model name, temperature, max tokens come from `settings`, never hardcoded.
- Wrap calls in a retry decorator for transient errors.
- Prompt templates live in `prompts/`, not as inline strings in business logic.

---

## Project-Specific Conventions

- **Language**: English only in all code, comments, docstrings. No French.
- **No emojis** in code, comments, or logs.
- **No commented-out code**: delete it, use git history.
- **Docstrings**: Google style. Include Args, Returns, Raises for non-trivial functions.
- **`__all__`**: Define in every public module.
- **Async**: Pick one model per pipeline (full async or thread-pool). Don't mix.
- **Cache directories**: Version them (`cache/v1/`) so schema changes don't corrupt old data.
- **Dependencies**: `pyproject.toml` + `uv`. Dev tools in `[dependency-groups]`.

---

## Tests

- **Test names must convey intent**: Name describes *what behaviour is verified and why it matters*, not just the method called. Prefer `test_error_entries_excluded_for_rescoring` over `test_error_entries`.
- **Rationale comments on every non-obvious test**: A one-line `#` comment above the test body explaining *why* this test exists: what real-world scenario or edge case it guards against. Trivial happy-path tests (e.g., "returns expected value") may omit the comment if the name is self-explanatory.
- **Class and module docstrings must reflect full scope**: If a test class covers both happy path and error cases, the name and docstring must say so, not just "error handling".
- **No network, no API keys**: Unit tests must run without secrets. Required env vars are set in `tests/conftest.py` with dummy values.

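A sketch of the naming and rationale-comment rules, using the test name given above. The function under test, `select_for_rescoring`, and the scenario in the comment are invented for illustration; the example needs no network access or secrets.

```python
def select_for_rescoring(entries: list[dict[str, str]]) -> list[dict[str, str]]:
    """Hypothetical function under test: keep entries that did not fail."""
    return [entry for entry in entries if entry.get("status") != "error"]


def test_error_entries_excluded_for_rescoring() -> None:
    # Failed entries must not be re-scored: that would overwrite the recorded error.
    entries = [{"id": "a", "status": "ok"}, {"id": "b", "status": "error"}]

    selected = select_for_rescoring(entries)

    assert [entry["id"] for entry in selected] == ["a"]
```
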
---

## Agent Workflow Rules

These are mandatory behavioral rules for AI agents working on this codebase.

### Spec Required
Do not implement any feature without an approved spec in `specs/active/`. Read the spec first, break the work into atomic steps, then implement.

### Check Learnings
Before starting any task, read `LEARNINGS.md` for known pitfalls and past mistakes.

### Pattern Sweep
When a review finds one instance of a defect class (bare except, hardcoded value, missing annotation), search **all files in scope** for other occurrences before proposing any fix.

### List Before Fixing
Report all discovered occurrences with file:line references **before** writing code:
```
Found 4 occurrences of bare except:
- services/coach.py:91
- services/coach.py:112
- integrations/brightdata.py:205
- cli/main.py:368
```

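For the bare `except:` defect class, one possible way to run the sweep and produce `file:line` references is to walk the syntax tree rather than grep, which avoids false matches inside strings and comments. `find_bare_excepts` is a hypothetical helper, and the sample source and filename are invented; in practice it would be applied to each file under the directories in scope.

```python
import ast


def find_bare_excepts(source: str, filename: str) -> list[str]:
    """Return 'file:line' references for every bare `except:` in the source."""
    tree = ast.parse(source, filename=filename)
    return [
        f"{filename}:{node.lineno}"
        for node in ast.walk(tree)
        # A handler with no exception type is a bare `except:`.
        if isinstance(node, ast.ExceptHandler) and node.type is None
    ]


SAMPLE = '''\
try:
    risky()
except:
    pass
try:
    risky()
except ValueError:
    pass
'''

occurrences = find_bare_excepts(SAMPLE, "services/example.py")
```
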
### Permission Gate
After listing occurrences, **ask for explicit approval** before applying grouped fixes. No silent batch changes.

### Minimal Diff
Fix only what was asked. Don't refactor surrounding code, rename variables, add docstrings, or clean up style in unrelated lines. Smallest possible diff.

### No Speculative Changes
Don't add error handling, logging, validation, or abstractions for scenarios not present in the code or explicitly requested.

### Spec Naming Convention
Spec files follow the pattern `YYYY_MM_DD-NN_name.md` (e.g., `2026_03_12-01_score_jobs.md`).
The `-NN` suffix (01, 02, …) handles multiple specs created on the same day.

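The naming pattern can be checked mechanically. The sketch below is one possible validator, not part of the repository: `SPEC_NAME` and `parse_spec_name` are hypothetical, and restricting the name part to lowercase letters, digits, and underscores is an assumption based on the single example above.

```python
import re
from datetime import date

# Assumption: the name part is lowercase snake_case, as in the example above.
SPEC_NAME = re.compile(r"^(\d{4})_(\d{2})_(\d{2})-(\d{2})_([a-z0-9_]+)\.md$")


def parse_spec_name(filename: str) -> tuple[date, int, str]:
    """Split a spec filename into (date, sequence number, name).

    Raises:
        ValueError: If the filename does not match the pattern or the
            date part is not a real calendar date.
    """
    match = SPEC_NAME.match(filename)
    if match is None:
        raise ValueError(f"Spec filename does not match YYYY_MM_DD-NN_name.md: {filename}")
    year, month, day, sequence, name = match.groups()
    # date() itself raises ValueError for impossible dates such as month 13.
    return date(int(year), int(month), int(day)), int(sequence), name
```
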
### Archive Spec on Merge
The last commit on every feature branch must move the spec from `specs/active/` to `specs/archived/`. No spec should remain in `specs/active/` after its PR is merged.

---

## Journal Guidelines

* **Location:** Use a single rolling file at `docs/journal.md`.
* **Session Index:** Add one line per session to the index: the date (YYYY-MM-DD) and a one-line summary.
* **Session Entries:** Append new sessions at the end using the template fields: Goal, Steps, Results, Metrics, Decisions, Issues, Next, Artifacts.
* **Artifacts:** Link with repo-relative paths only.
* **Brevity:** Keep entries concise and factual.
* **Process Exception:** Updating `docs/journal.md` (e.g., brainstorming or session notes) does not require an approved spec. All other code and config changes still require one.