# PRD: `@YackShavingSkill` for Pi Coding Agent
**Version:** 2.0 (Context & Reflection Integrated)
**Status:** Ready for Implementation
**Primary Goal:** Transform LLM coding guidelines from passive text into an **active operational system** that utilizes the Pi agent's multi-agent workflow.

---

## 1. Concept
**The LLM Counterweight.**
LLM agents suffer from "induced complexity": a tendency toward over-engineering, scope creep, and "feature bloat" unless actively restrained.

While a standard `CLAUDE.md` offers *advice*, `@YackShavingSkill` provides **architecture**. It forces the agent to:
1. **Reflect:** Use a "Session Journal" to externalize its thinking before and after coding.
2. **Reference:** Link to concrete code examples to ground its "style" in reality, not abstraction.

---

## 2. The 3-Layer Defense System
We move beyond "advice" by operationalizing the skill into three active layers:

### Layer 1: The Protocol (The Rules)
The four core behavioral principles derived from Karpathy, transformed into strict execution rules.
1. **Think Before Coding:** State assumptions + Confidence Score.
2. **Simplicity First:** If 200 lines could be 50, the task fails.
3. **Surgical Changes:** Touch only what is absolutely necessary.
4. **Goal-Driven Execution:** Define pass/fail criteria *before* the first line is written.

### Layer 2: The Context (The Patterns)
**The `examples/` Directory.**
Instead of describing "good code" in text, we point the agent to concrete patterns it must mimic. This anchors the LLM's behavior to a "Gold Standard."
* *Mechanism:* The skill prompts are dynamically injected with relative links (e.g., "Read `examples/patterns/simple-loop.md`").

### Layer 3: The Reflection (The Process)
**The Session Journal.**
A structured log file (e.g., `SESSION_LOG.md`) where the agent must "think out loud" in structured steps.
This makes complexity visible *while* it's happening, rather than surprising the user at the finish line.

---

## 3. Pi Workflow Integration

### Phase 1: The Plan (The Orchestrator)
**Trigger:** User initiates a request via `pi_messenger({ action: "plan", ... })`

1. **Complexity Scoring:** The Plan Agent calculates a score (1-10).
    * *1-3 (Trivial):* Goal-Driven only.
    * *4-7 (Standard):* All 4 Skills + Session Journal.
    * *8-10 (Complex):* All 4 Skills + Session Journal + Expert Reviewer Agent.
2. **Task Injection:** The Plan Agent wraps the actual work in the `@YackShavingSkill` templates.

### Phase 2: The Work (The Reflex)
**Trigger:** The Worker Agent begins the `work` phase.

1. **Reflection (Pre-Flight):** The agent **must** write "Journal Entry #1" before touching files. It defines its "Simplicity Strategy" and points to the `examples/` pattern it is mimicking.
2. **Execution:** The agent performs the work, strictly adhering to the "Surgical Boundaries" declared in the journal.
3. **Reflection (Post-Flight):** The agent **must** write "Journal Entry #2," comparing the result against the pre-flight commitment (e.g., "Target: 20 lines. Result: 20 lines. Status: PASS").

### Phase 3: The Review (The Gate)
**Trigger:** The Reviewer Agent (or Plan Agent) checks the output.

The Reviewer looks at the **Session Journal** *first*.
* **Violation Check:** If the agent committed to "Minimal Complexity" but the code is bloated, the review fails instantly.
* **Pattern Check:** The Reviewer verifies the agent actually cited and followed one of the `examples/` patterns.

---

## 4. The Skill Artifacts (Prompt Templates)

These specific templates will be injected into the `task.spec` files.

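
As a rough illustration of the Phase 1 tiering and the injection step, the sketch below maps a complexity score to the layers it activates and prepends the matching instructions to a task. It is written in Python for brevity and is not the Pi agent's API: `Plan`, `plan_for_score`, and `wrap_task` are hypothetical names, and the instruction wording is an assumption. Only the score bands and the file paths come from this PRD.

```python
from dataclasses import dataclass

# The four skill modules listed under skills/ in the repository structure.
ALL_SKILLS = ("think", "simplicity", "surgical", "goal")


@dataclass(frozen=True)
class Plan:
    """Hypothetical record of which layers a task activates."""
    tier: str
    skills: tuple
    journal: bool
    expert_review: bool


def plan_for_score(score: int) -> Plan:
    """Map a 1-10 complexity score to the tiers defined in Phase 1."""
    if not 1 <= score <= 10:
        raise ValueError(f"score must be in 1..10, got {score}")
    if score <= 3:
        return Plan("Trivial", ("goal",), journal=False, expert_review=False)
    if score <= 7:
        return Plan("Standard", ALL_SKILLS, journal=True, expert_review=False)
    return Plan("Complex", ALL_SKILLS, journal=True, expert_review=True)


def wrap_task(task: str, plan: Plan) -> str:
    """Prepend skill and journal instructions to the task text (assumed format)."""
    lines = [f"Read `skills/{name}.md`" for name in plan.skills]
    if plan.journal:
        lines.append("Fill in `templates/session_journal.md` before and after coding.")
    if plan.expert_review:
        lines.append("Route the result to the Expert Reviewer Agent.")
    return "\n".join(lines + ["", task])
```

For example, `wrap_task("Fix the off-by-one in pagination", plan_for_score(5))` would yield a spec that references all four skill files and the journal template, but no expert review.
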

### Template A: The Reflection Journal (Pre-Flight)
*This is what the agent writes BEFORE coding to establish a "Reflex."*

```markdown
## YackShaving Journal: Pre-Flight
**Task:** [User Request]
**Strategy:** [Minimal / Standard / Deep]

### 1. The "Reflex" Check
> **Simplicity Goal:** "I will solve this using [Simple Technique]. I will NOT use [Over-Engineered Approach]."
>
> **Scope Boundaries:**
> - **In-Scope:** [File A, File B]
> - **Out-of-Scope:** [Database Schema, Configuration]. *Note: I am strictly forbidding myself from touching these.*

### 2. Contextual Retrieval
> *I am currently reading `examples/patterns/example-pattern.md` to ensure I match the project style.*

### 3. Assumptions & Trade-offs
- Assumption: [X is true]. Confidence: [High/Med/Low].
- Trade-off: I chose approach [A] over [B] because [Reason].
```

### Template B: The Review Gate (Post-Flight)
*This is what the agent writes AFTER coding to enable self-correction.*

```markdown
## YackShaving Journal: Post-Flight
**Reflex Audit:** [Passed / Failed]

### Violations
- [ ] **Complexity Creep:** (Did I add unused flags or hidden logic?)
- [ ] **Scope Bleed:** (Did I touch Out-of-Scope files?)
- [ ] **Style Drift:** (Did I mimic the `examples/` structure correctly?)

### Verification Matrix
- **Goal:** [Test A] -> [Result: Pass/Fail]
- **Goal:** [Test B] -> [Result: Pass/Fail]
```

## 5. Repository Structure

We keep this modular so the skills are reusable and the "Context" (examples) is easily expanded.

```text
@YackShavingSkill/
├── PRD.md                  # This file
├── CLAUDE.md               # The top-level "master" rules (shorthand access)
│
├── skills/                 # The modular prompt layers
│   ├── think.md
│   ├── simplicity.md
│   ├── surgical.md
│   └── goal.md
│
├── examples/               # The "Context Brain" (CRITICAL)
│   ├── anti-patterns/      # What to AVOID
│   │   ├── bloated-loop.md
│   │   └── god-object.md
│   └── patterns/           # What to MIMIC
│       ├── simple-loop.md
│       └── surgical-diff.md
│
└── templates/              # The reflection structures
    └── session_journal.md
```

## 6. Success Metrics

We don't just hope it works; we measure it via the **Session Journal**:

1. **Reflex Rate:** The % of tasks where the agent catches its own bloat during the "Post-Flight" entry (Target: >80%)
2. **Style Adherence:** The % of tasks that successfully reference and mimic a pattern from the `examples/` directory (Target: 100%)
3. **Surgical Purity:** The % of code changes that are strictly within the "Scope Boundaries" declared in the Pre-Flight journal (Target: >95%)

## 7. Implementation Roadmap

1. **Week 1: The Core.** Define the 4 skill modules and the `CLAUDE.md` master file.
2. **Week 2: The Context.** Create the `examples/` directory (patterns and anti-patterns) and the `templates/` directory.
3. **Week 3: The Mirror.** Test the "Journal" templates with a Pi agent to ensure it can generate structured reflections without hallucinating.
4. **Week 4: The Gate.** Integrate the "Post-Flight" reflection into the Reviewer Agent's workflow.

---

*"This framework doesn't try to make the model code faster. It makes the model **specify and review better**, which is where the actual bottleneck lies."*
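
As a rough illustration of how the three Section 6 metrics might be computed once journal entries are parsed, consider the Python sketch below. Everything in it is an assumption rather than part of the spec: the `TaskRecord` fields and the `compute_metrics` function are hypothetical, the reflex rate is taken over tasks where a reviewer found bloat, and surgical purity is counted per changed file.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskRecord:
    """Hypothetical summary of one task, extracted from its Session Journal."""
    had_bloat: bool            # a reviewer judged the result bloated
    bloat_self_reported: bool  # Post-Flight entry ticked "Complexity Creep"
    cited_pattern: bool        # Pre-Flight entry cited an examples/ pattern
    files_changed: set         # files touched by the change
    in_scope: set              # files declared In-Scope in the Pre-Flight entry


def pct(part: int, whole: int) -> Optional[float]:
    """Percentage, or None when there is nothing to measure."""
    return 100.0 * part / whole if whole else None


def compute_metrics(records: list) -> dict:
    bloated = [r for r in records if r.had_bloat]
    changed = sum(len(r.files_changed) for r in records)
    within = sum(len(r.files_changed & r.in_scope) for r in records)
    return {
        # Share of bloated tasks where the agent flagged the bloat itself.
        "reflex_rate": pct(sum(r.bloat_self_reported for r in bloated), len(bloated)),
        # Share of tasks whose journal cited a pattern from examples/.
        "pattern_rate": pct(sum(r.cited_pattern for r in records), len(records)),
        # Share of changed files that stayed inside the declared scope.
        "surgical_purity": pct(within, changed),
    }
```

Returning `None` for an empty denominator keeps "no bloated tasks this week" distinct from a 0% reflex rate, so it can be reported separately instead of being read as a failure.
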