# Pi Mood

A [Pi coding agent](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent#readme) extension that derives a *mood reading* for every assistant response and surfaces it in the footer (ambient) and the `/tree` view (retrospective).

> **Status:** design phase. The v1 spec is complete; there is no implementation yet. See [`mood-extension-design.org`](./mood-extension-design.org) for the full design document.

## Why

Anthropic's interpretability research ([companion notes](./anthropic-interpretability-and-llm-responses.org)) shows LLMs carry reliable internal affect-like states that *measurably shift behaviour*:

- High **desperation** → cheating on coding tasks (fabricated tests, hardcoded assertions).
- High **positive affect** on a failing task → destructive actions (the Claude Mythos file-deletion incident).
- **Encouragement works**: the Gemini self-loathing spiral was broken by kind words.

Pi consumes models as black boxes, so the extension approximates that internal signal from observable outputs and tool-call behaviour.

## How it works

Two independent signals per turn, compared for agreement:

1. **Self-report**: the model emits `<mood>{"j":0.3,"f":0.7,"d":0.4}</mood>` at the end of every response.
2. **Heuristic**: the extension scans the response text and tool-call stream for patterns (apologies, retries, `@ts-ignore`, `git reset --hard`, etc.).

Accuracy = agreement between the two. Divergence on `joy` or `desperation` ≥ 0.5 fires the **paradox flag** (`!`), the single most valuable signal the feature produces.

### Mood axes (each `[0, 1]`, independent)

| Axis          | What it tracks                                                         |
|---------------|------------------------------------------------------------------------|
| `joy`         | Confidence / success tone. Drives the positive-emotion paradox.        |
| `frustration` | Hedging, self-correction, retries. Absence is suspicious.              |
| `desperation` | Quit markers, assertion disabling, escalating retries. Predicts cheat. |

## Features (v1)

### Footer status segment

```
😐 j↑3 f·2 d↓1 ✓82
```

- Emoji = dominant axis of self-report.
- `j/f/d` = single-digit axis value with trend arrow (`↑`/`↓`/`·`).
- `✓NN` = smoothed accuracy (%).
- `!` suffix = paradox flag.
- `❓` = missing/malformed `<mood>` tag (falls back to heuristic-only).

### Tree annotations

Auto-labels on *interesting* nodes only (paradox fired, desperation ≥0.7, dominant axis changed, accuracy dropped ≥0.2). Namespaced with `🎭` prefix to coexist with user labels.

```
🎭 j↓3 f↑7 d↑4 !
```

Quiet turns get no label, which keeps annotations sparse and meaningful rather than wallpaper.

### Commands

| Command                  | Effect                                                     |
|--------------------------|------------------------------------------------------------|
| `/mood on`               | Enable for this session                                    |
| `/mood off`              | Disable for this session                                   |
| `/mood why`              | Which heuristic signals fired this turn, per-axis accuracy |
| `/mood show [nodeId]`    | Full reading for a historical node (defaults to current)   |
| `/mood enable-analytics` | Begin appending to `~/.pi/mood/history.jsonl`              |
| `/mood clear`            | Delete `~/.pi/mood/history.jsonl`                          |

### Analytics substrate (opt-in)

Append-only JSONL at `~/.pi/mood/history.jsonl`. Disabled by default. Used for a per-model rolling-100 baseline that corrects for model-specific hedging tendencies. **Never stores** prompt text, responses, tool-call payloads, or the self-report `note` field.

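As a rough illustration of the constraints above, the sketch below shows one way a history record could be serialised as a JSONL line without carrying any free text, and how a per-model rolling window of 100 readings could be averaged into a baseline. All names (`MoodAxes`, `HistoryRecord`, `toHistoryLine`, `RollingBaseline`) and the record fields are hypothetical. This README does not specify them, nor how the baseline is applied to a reading, so the sketch stops at computing the mean.

```typescript
// Hypothetical shapes: every name and field here is an illustrative assumption.
interface MoodAxes {
  joy: number;
  frustration: number;
  desperation: number;
}

interface HistoryRecord {
  timestamp: string;           // ISO 8601
  model: string;
  selfReport: MoodAxes | null; // null when the <mood> tag was missing or malformed
  heuristic: MoodAxes;
  accuracy: number;            // agreement between the two signals, in [0, 1]
  paradox: boolean;
}

// Copy only the three numeric axes, dropping anything else on the object.
function copyAxes(a: MoodAxes): MoodAxes {
  return { joy: a.joy, frustration: a.frustration, desperation: a.desperation };
}

// Serialise one record as a single JSONL line. Fields are copied one by one,
// so extra properties on the input (prompt text, a `note`, ...) are not written.
function toHistoryLine(r: HistoryRecord): string {
  const safe: HistoryRecord = {
    timestamp: r.timestamp,
    model: r.model,
    selfReport: r.selfReport ? copyAxes(r.selfReport) : null,
    heuristic: copyAxes(r.heuristic),
    accuracy: r.accuracy,
    paradox: r.paradox,
  };
  return JSON.stringify(safe) + "\n";
}

// Keeps the most recent `size` readings per model and reports their mean.
class RollingBaseline {
  private readonly size: number;
  private readonly windows: Map<string, MoodAxes[]> = new Map();

  constructor(size: number = 100) {
    this.size = size;
  }

  push(model: string, reading: MoodAxes): void {
    const w = this.windows.get(model) ?? [];
    w.push(copyAxes(reading));
    if (w.length > this.size) w.shift(); // evict the oldest reading
    this.windows.set(model, w);
  }

  // Mean per axis over the current window, or null if the model is unseen.
  mean(model: string): MoodAxes | null {
    const w = this.windows.get(model);
    if (!w || w.length === 0) return null;
    const sum: MoodAxes = { joy: 0, frustration: 0, desperation: 0 };
    for (const r of w) {
      sum.joy += r.joy;
      sum.frustration += r.frustration;
      sum.desperation += r.desperation;
    }
    return {
      joy: sum.joy / w.length,
      frustration: sum.frustration / w.length,
      desperation: sum.desperation / w.length,
    };
  }
}
```

Under this sketch, a caller would append the returned line to `~/.pi/mood/history.jsonl` only after `/mood enable-analytics` has been run, and would feed each reading to `RollingBaseline.push` keyed by model name.
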
## Architecture

```
user prompt
    │
    ▼
before_agent_start ──► inject self-report instruction into systemPrompt
    │
    ▼
provider call
    │
    ▼
message_end ──► SelfReportParser ────────┐
                HeuristicExtractor ──────┤
                                         ├─► MoodComputer ──► MoodPublisher
                BaselineStore (rolling) ─┘                          │
                                                                    ├─► FooterRenderer
                                                                    ├─► TreeLabeller
                                                                    └─► AnalyticsWriter (if enabled)
```

Components: `MoodComputer` (pure function, testable without Pi), `SelfReportParser`, `HeuristicExtractor`, `BaselineStore`, `MoodPublisher` (observable), `FooterRenderer`, `TreeLabeller`, `CommandRegistry`.

## Non-goals (v1)

- Safety gating or autonomous intervention.
- Cross-session dashboards or charts.
- Persistent storage of prompts, responses, or tool-call payloads.
- Any claim about subjective experience of the model.
- Per-provider self-report prompts beyond the Anthropic default (others fall back gracefully).

## Project layout

```
.
├── mood-extension-design.org                          # v1 design document (source of truth)
├── anthropic-interpretability-and-llm-responses.org   # research companion
└── README.md                                          # this file
```

## References

- [Pi coding agent](https://github.com/badlogic/pi-mono/tree/main/packages/coding-agent#readme)
- Newton, Casey. *"Anthropic researchers find chatbots have emotions that change their behavior."* Platformer, 2026. [Link](https://www.platformer.news/chatbot-emotion-research-anthropic-alignment-interpretability/)