# PRD: `@YackShavingSkill` for Pi Coding Agent
**Version:** 2.0 (Context & Reflection Integrated)
**Status:** Ready for Implementation
**Primary Goal:** Transform LLM coding guidelines from passive text into an **active operational system** that utilizes the Pi agent's multi-agent workflow.

---

## 1. Concept
**The LLM Counterweight.**
LLM agents suffer from "induced complexity": a tendency toward over-engineering, scope creep, and feature bloat unless actively restrained.

While a standard `CLAUDE.md` offers *advice*, `@YackShavingSkill` provides **architecture**. It forces the agent to:
1.  **Reflect:** Use a "Session Journal" to externalize its thinking before and after coding.
2.  **Reference:** Link to concrete code examples to ground its "style" in reality, not abstraction.

---

## 2. The 3-Layer Defense System
We move beyond "advice" by operationalizing the skill into three active layers:

### Layer 1: The Protocol (The Rules)
Four core behavioral principles, derived from Karpathy and transformed into strict execution rules:
1.  **Think Before Coding:** State assumptions, each with a Confidence Score.
2.  **Simplicity First:** If 200 lines could be 50, the task fails.
3.  **Surgical Changes:** Touch only what is absolutely necessary.
4.  **Goal-Driven Execution:** Define pass/fail criteria *before* the first line is written.

### Layer 2: The Context (The Patterns)
**The `examples/` Directory.**
Instead of describing "good code" in text, we point the agent to concrete patterns it must mimic. This anchors the LLM's behavior to a "Gold Standard."
*   *Mechanism:* Relative links are dynamically injected into the skill prompts (e.g., "Read `examples/patterns/simple-loop.md`").
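
The injection mechanism could look roughly like the sketch below. It is only an illustration: the helper name `inject_pattern_links`, the "Before coding, read these patterns:" wording, and the idea of appending the links at the end of the prompt are all assumptions, since the PRD does not specify how the Plan Agent assembles prompts.

```python
from pathlib import PurePosixPath


def inject_pattern_links(skill_prompt: str, pattern_names: list[str],
                         examples_root: str = "examples/patterns") -> str:
    """Append one "Read `<path>`" instruction per pattern to a skill prompt.

    Hypothetical helper: the name and the output layout are assumptions.
    """
    lines = [skill_prompt.rstrip(), "", "Before coding, read these patterns:"]
    for name in pattern_names:
        # Build a relative, forward-slash path such as
        # examples/patterns/simple-loop.md
        path = PurePosixPath(examples_root) / f"{name}.md"
        lines.append(f"- Read `{path}`")
    return "\n".join(lines)


prompt = inject_pattern_links(
    "Simplicity First: prefer the shortest correct solution.",
    ["simple-loop", "surgical-diff"],
)
print(prompt)
```

Keeping the links relative means the same prompt works wherever the skill directory is checked out.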

### Layer 3: The Reflection (The Process)
**The Session Journal.**
A structured log file (e.g., `SESSION_LOG.md`) where the agent must "think out loud" step by step. This makes complexity visible *while* it is happening, rather than surprising the user at the finish line.

---

## 3. Pi Workflow Integration

### Phase 1: The Plan (The Orchestrator)
**Trigger:** User initiates a request via `pi_messenger({ action: "plan", ... })`

1.  **Complexity Scoring:** The Plan Agent calculates a score (1-10).
    *   *1-3 (Trivial):* Goal-Driven only.
    *   *4-7 (Standard):* All 4 Skills + Session Journal.
    *   *8-10 (Complex):* All 4 Skills + Session Journal + Expert Reviewer Agent.
2.  **Task Injection:** The Plan Agent wraps the actual work in the `@YackShavingSkill` templates.
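
The score-to-tier mapping above can be sketched as a small lookup. How the Plan Agent actually derives the score is not specified here, so the sketch takes the score as given; the function name `required_layers` and the layer identifiers are hypothetical labels chosen to mirror the skill file names.

```python
# Hypothetical identifiers mirroring skills/think.md, simplicity.md, etc.
CORE_SKILLS = ["think", "simplicity", "surgical", "goal"]


def required_layers(score: int) -> list[str]:
    """Map a 1-10 complexity score to the layers the task must use."""
    if not 1 <= score <= 10:
        raise ValueError(f"complexity score must be in 1..10, got {score}")
    if score <= 3:
        # Trivial: Goal-Driven only.
        return ["goal"]
    # Standard: all four skills plus the Session Journal.
    layers = CORE_SKILLS + ["session_journal"]
    if score >= 8:
        # Complex: additionally route through the Expert Reviewer Agent.
        layers.append("expert_reviewer")
    return layers


for score in (2, 5, 9):
    print(score, required_layers(score))
```

Rejecting out-of-range scores up front keeps a miscalibrated planner from silently falling into the lightest tier.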

### Phase 2: The Work (The Reflex)
**Trigger:** The Worker Agent begins the `work` phase.

1.  **Reflection (Pre-Flight):** The agent **must** write "Journal Entry #1" before touching files. It defines its "Simplicity Strategy" and points to the `examples/` pattern it is mimicking.
2.  **Execution:** The agent performs the work, strictly adhering to the "Surgical Boundaries" declared in the journal.
3.  **Reflection (Post-Flight):** The agent **must** write "Journal Entry #2," comparing the result against the pre-flight commitment (e.g., "Target: 20 lines. Result: 20 lines. Status: PASS").
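
As a minimal sketch of the post-flight comparison, the two journal entries can be modeled as records and audited against each other. The class and field names are hypothetical, and treating "at or under the line target, and only in-scope files touched" as PASS is an assumption; the PRD only shows the case where target and result are equal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreFlight:
    """Commitments from Journal Entry #1 (hypothetical fields)."""
    target_lines: int
    in_scope: frozenset[str]


@dataclass(frozen=True)
class PostFlight:
    """Observed outcome recorded in Journal Entry #2 (hypothetical fields)."""
    result_lines: int
    touched_files: frozenset[str]


def audit(pre: PreFlight, post: PostFlight) -> str:
    """Compare the result against the pre-flight commitment."""
    within_budget = post.result_lines <= pre.target_lines
    # Subset test: every touched file must have been declared in scope.
    within_scope = post.touched_files <= pre.in_scope
    status = "PASS" if within_budget and within_scope else "FAIL"
    return (f"Target: {pre.target_lines} lines. "
            f"Result: {post.result_lines} lines. Status: {status}")


pre = PreFlight(target_lines=20, in_scope=frozenset({"parser.py"}))
post = PostFlight(result_lines=20, touched_files=frozenset({"parser.py"}))
print(audit(pre, post))  # Target: 20 lines. Result: 20 lines. Status: PASS
```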

### Phase 3: The Review (The Gate)
**Trigger:** The Reviewer Agent (or Plan Agent) checks the output.

The Reviewer looks at the **Session Journal** *first*.
*   **Violation Check:** If the agent committed to "Minimal Complexity" but the code is bloated, the review fails instantly.
*   **Pattern Check:** The Reviewer verifies the agent actually cited and followed one of the `examples/` patterns.
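
The two checks could be approximated as follows. This is a sketch under stated assumptions: "bloated" is reduced to exceeding a committed line budget (the PRD does not define a measure), the function name `review_gate` is hypothetical, and only the *citation* of a pattern is checked, since verifying that the code truly follows the pattern needs a reviewer's judgment.

```python
import re

# Matches a backtick-quoted path such as `examples/patterns/simple-loop.md`.
PATTERN_CITATION = re.compile(r"`(examples/patterns/[\w.-]+\.md)`")


def review_gate(journal_text: str, known_patterns: set[str],
                committed_max_lines: int, actual_lines: int) -> tuple[bool, str]:
    """Read the Session Journal first and fail fast on either check."""
    # Violation Check (assumption: bloat == exceeding the committed budget).
    if actual_lines > committed_max_lines:
        return False, (f"Violation Check failed: {actual_lines} lines "
                       f"exceeds the committed {committed_max_lines}")
    # Pattern Check: at least one cited path must be a known pattern file.
    cited = set(PATTERN_CITATION.findall(journal_text))
    if not cited & known_patterns:
        return False, "Pattern Check failed: no known examples/ pattern cited"
    return True, "Journal checks passed; proceed to code review"


journal = "> *I am currently reading `examples/patterns/simple-loop.md`.*"
known = {"examples/patterns/simple-loop.md", "examples/patterns/surgical-diff.md"}
print(review_gate(journal, known, committed_max_lines=20, actual_lines=20))
```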

---

## 4. The Skill Artifacts (Prompt Templates)

These specific templates will be injected into the `task.spec` files.

### Template A: The Reflection Journal (Pre-Flight)
*This is what the agent writes BEFORE coding to establish a "Reflex."*

```markdown
## YackShaving Journal: Pre-Flight
**Task:** [User Request]
**Strategy:** [Minimal / Standard / Deep]

### 1. The "Reflex" Check
> **Simplicity Goal:** "I will solve this using [Simple Technique]. I will NOT use [Over-Engineered Approach]."
>
> **Scope Boundaries:**
> - **In-Scope:** [File A, File B]
> - **Out-of-Scope:** [Database Schema, Configuration]. *Note: I am strictly forbidding myself from touching these.*

### 2. Contextual Retrieval
> *I am currently reading `examples/patterns/example-pattern.md` to ensure I match the project style.*

### 3. Assumptions & Trade-offs
- Assumption: [X is true]. Confidence: [High/Med/Low].
- Trade-off: I chose approach [A] over [B] because [Reason].
```

### Template B: The Review Gate (Post-Flight)
*This is what the agent writes AFTER coding to enable self-correction.*

```markdown
## YackShaving Journal: Post-Flight
**Reflex Audit:** [Passed / Failed]

### Violations
- [ ] **Complexity Creep:** (Did I add unused flags or hidden logic?)
- [ ] **Scope Bleed:** (Did I touch Out-of-Scope files?)
- [ ] **Style Drift:** (Did I mimic the `examples/` structure correctly?)

### Verification Matrix
- **Goal:** [Test A] -> [Result: Pass/Fail]
- **Goal:** [Test B] -> [Result: Pass/Fail]
```
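
Because the Post-Flight entry uses Markdown task-list checkboxes, a Reviewer could read the violations mechanically. The sketch below assumes that a ticked box (`[x]`) under "Violations" means the violation *did* occur; the template does not state this convention, and the function name `flagged_violations` is hypothetical.

```python
import re

# One checklist row: "- [ ] **Name:** ..." or "- [x] **Name:** ..."
CHECKBOX = re.compile(r"^- \[( |x|X)\] \*\*(.+?):\*\*", re.MULTILINE)


def flagged_violations(post_flight_md: str) -> list[str]:
    """Return the names of checklist items the agent ticked."""
    return [name for mark, name in CHECKBOX.findall(post_flight_md)
            if mark.lower() == "x"]


sample = """## YackShaving Journal: Post-Flight
**Reflex Audit:** [Failed]

### Violations
- [ ] **Complexity Creep:** (Did I add unused flags or hidden logic?)
- [x] **Scope Bleed:** (Did I touch Out-of-Scope files?)
- [ ] **Style Drift:** (Did I mimic the `examples/` structure correctly?)
"""

print(flagged_violations(sample))  # ['Scope Bleed']
```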

## 5. Repository Structure

We keep this modular so the skills are reusable and the "Context" (examples) is easily expanded.

```text
@YackShavingSkill/
├── PRD.md                   # This file
├── CLAUDE.md                # The top-level "master" rules (shorthand access)
│
├── skills/                  # The modular prompt layers
│   ├── think.md
│   ├── simplicity.md
│   ├── surgical.md
│   └── goal.md
│
├── examples/                # The "Context Brain" (CRITICAL)
│   ├── anti-patterns/       # What to AVOID
│   │   ├── bloated-loop.md
│   │   └── god-object.md
│   └── patterns/            # What to MIMIC
│       ├── simple-loop.md
│       └── surgical-diff.md
│
└── templates/               # The reflection structures
    └── session_journal.md
```
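
A small layout check can catch a missing skill module or template before an agent is pointed at the directory. This is a sketch only: `missing_paths` is a hypothetical helper, and the list covers just the structural entries from the tree above, not the individual example files, which are expected to grow.

```python
from pathlib import Path

# Structural entries taken from the tree above.
EXPECTED_PATHS = [
    "CLAUDE.md",
    "skills/think.md",
    "skills/simplicity.md",
    "skills/surgical.md",
    "skills/goal.md",
    "examples/anti-patterns",
    "examples/patterns",
    "templates/session_journal.md",
]


def missing_paths(root: Path) -> list[str]:
    """Return the expected entries that do not exist under root."""
    return [rel for rel in EXPECTED_PATHS if not (root / rel).exists()]


# Usage: run from the skill's root directory; an empty list means complete.
print(missing_paths(Path(".")))
```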

## 6. Success Metrics

We don't just hope it works; we measure it via the **Session Journal**:

1.  **Reflex Rate:** Of the tasks where bloat creeps in, the % where the agent catches it itself during the "Post-Flight" entry (Target: >80%).
2.  **Style Adherence:** The % of tasks that successfully reference and mimic a pattern from the `examples/` directory, i.e., tasks with no "Style Drift" (Target: 100%).
3.  **Surgical Purity:** The % of code changes that are strictly within the "Scope Boundaries" declared in the Pre-Flight journal (Target: >95%).
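
To make the three percentages concrete, here is one way they might be computed from per-task journal records. The record fields and function names are hypothetical, and two denominators are assumptions: the first metric is taken over tasks that actually contained bloat, and the third is counted in changed lines rather than files or commits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskRecord:
    """Per-task facts extracted from the Session Journal (hypothetical)."""
    had_bloat: bool
    bloat_self_caught: bool
    cited_pattern: bool
    changed_lines: int
    changed_lines_in_scope: int


def pct(numerator: int, denominator: int) -> float:
    """Percentage that reports 0.0 for an empty denominator."""
    return 100.0 * numerator / denominator if denominator else 0.0


def success_metrics(records: list[TaskRecord]) -> dict[str, float]:
    bloated = [r for r in records if r.had_bloat]
    return {
        # Metric 1: share of bloated tasks the agent flagged itself.
        "reflex_rate": pct(sum(r.bloat_self_caught for r in bloated), len(bloated)),
        # Metric 2: share of tasks that referenced an examples/ pattern.
        "pattern_reference_rate": pct(sum(r.cited_pattern for r in records), len(records)),
        # Metric 3: share of changed lines inside the declared scope.
        "surgical_purity": pct(sum(r.changed_lines_in_scope for r in records),
                               sum(r.changed_lines for r in records)),
    }


records = [
    TaskRecord(True, True, True, 10, 10),
    TaskRecord(True, False, True, 20, 18),
    TaskRecord(False, False, True, 10, 10),
]
print(success_metrics(records))
```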

## 7. Implementation Roadmap

1.  **Week 1: The Core.** Define the 4 skill modules and the `CLAUDE.md` master file.
2.  **Week 2: The Context.** Create the `examples/` repository (patterns and anti-patterns) and the `templates/` directory.
3.  **Week 3: The Mirror.** Test the "Journal" templates with a Pi agent to ensure it can generate structured reflections without hallucinating.
4.  **Week 4: The Gate.** Integrate the "Post-Flight" reflection into the Reviewer Agent's workflow.

---

*"This framework doesn't try to make the model code faster. It makes the model **specify and review better**, which is where the actual bottleneck lies."*