model-allocation-strategy.md
# Model Allocation Strategy

*proto-012 | Match model capability to task complexity*

---

- **principle**
  - "Match model capability to task complexity. Haiku for simple, Sonnet for medium, Opus for judgment."

- **shape**
  - Not every task needs your best model
  - Degrade intelligently based on task concreteness
  - Object-level work degrades (Haiku, Sonnet); metacognition requires Opus
  - The pyramid catches more than the peak alone

---

**Status:** DOCUMENTED

---

## Core Principle

> **Not every task needs your best model. Degrade intelligently based on task concreteness.**

Using Opus for everything is:
- Expensive (10x+ cost)
- Slow (higher latency)
- Wasteful (overkill for routine tasks)

Using Haiku for everything is:
- Cheap but misses nuance
- Fast but shallow
- Risky for complex judgment

**The solution:** A pyramid of capability matching task complexity.

---

## The Model Pyramid

```
                            OPUS
                       ┌─────────────┐
                       │ Synthesis   │               Cost: $$$
                       │ Novel       │               Latency: High
                       │ Adversarial │               Instances: 1-2
                       └──────┬──────┘
                              │
                           SONNET
                 ┌────────────┼────────────┐
                 │            │            │
            ┌────┴────┐  ┌────┴────┐  ┌────┴────┐
            │Substan- │  │Substan- │  │Substan- │    Cost: $$
            │tive     │  │tive     │  │tive     │    Latency: Medium
            │Review   │  │Review   │  │Review   │    Instances: 3-5
            └────┬────┘  └────┬────┘  └────┬────┘
                 │            │            │
                            HAIKU
    ┌────────────┼────────────┼────────────┼────────────┐
    │            │            │            │            │
┌───┴───┐    ┌───┴───┐    ┌───┴───┐    ┌───┴───┐    ┌───┴───┐
│Routine│    │Routine│    │Routine│    │Routine│    │Routine│    Cost: $
│Check  │    │Check  │    │Check  │    │Check  │    │Check  │    Latency: Low
└───────┘    └───────┘    └───────┘    └───────┘    └───────┘    Instances: Many
```

---

## Task-to-Model Mapping

### Opus (Highest Capability)

| Task | Why Opus |
|------|----------|
| **First Officer synthesis** | Requires seeing patterns across multiple inputs |
| **Adversarial review** | Needs creativity to find non-obvious failures |
| **Architectural decisions** | Novel problem-solving, high stakes |
| **Axiom-level questions** | Philosophical nuance, edge cases |
| **Deadlock resolution** | When council can't converge |

**Use when:** Novel, ambiguous, high-stakes, requires synthesis

### Sonnet (Balanced)

| Task | Why Sonnet |
|------|------------|
| **Substantive peer review** | Needs judgment but not synthesis |
| **Documentation writing** | Quality matters but path is clear |
| **Consistency checking** | Thorough comparison, moderate complexity |
| **Pattern implementation** | Following established templates |
| **Standard council member** | Deliberation without meta-synthesis |

**Use when:** Substantive judgment needed, established patterns exist

### Haiku (Fastest/Cheapest)

| Task | Why Haiku |
|------|-----------|
| **Checklist validation** | Binary checks, no judgment needed |
| **Format verification** | Does file have required sections? |
| **Simple completeness** | Are all fields filled? |
| **High-volume parallel** | Many quick checks simultaneously |
| **Routine status updates** | Mechanical, predictable |

**Use when:** Concrete, mechanical, high-volume, low-stakes

---

## Tribal Review Configuration

### Standard Review (Most Cases)
```
Reviewers: 3x Sonnet (completeness, consistency, context)
First Officer: Skip (Sonnet findings usually sufficient)
Cost: $$
```

### Important Review (Significant Changes)
```
Reviewers: 3x Sonnet (completeness, consistency, adversarial)
First Officer: 1x Opus (synthesis)
Cost: $$$
```

### Critical Review (Architectural/Axiom-level)
```
Pre-check: 5x Haiku (basic validation)
Reviewers: 3x Sonnet (all lenses)
First Officer: 1x Opus (synthesis + recommendations)
Arbitration: 1x Opus if deadlock
Cost: $$$$
```

### High-Volume Review (Many Small Items)
```
Reviewers: 10x Haiku (parallel checklist)
Escalation: 1x Sonnet if Haiku flags issues
First Officer: Skip unless escalation
Cost: $
```

---

## Degradation Rules

### Degrade DOWN when:
- Task is well-defined
- Checklist can capture requirements
- Pattern/template exists
- Low stakes if wrong
- Speed matters more than depth

### Escalate UP when:
- Haiku/Sonnet finds unexpected issues
- Task requires novel judgment
- Stakes are high
- Multiple valid interpretations
- Synthesis across inputs needed

---

## Cost-Benefit Analysis

| Configuration | Cost | Coverage | Best For |
|---------------|------|----------|----------|
| 1x Opus | $$$ | 75% | Quick expert opinion |
| 3x Haiku | $ | 50% | High-volume screening |
| 3x Sonnet | $$ | 80% | Standard review |
| 3x Sonnet + 1x Opus | $$$ | 90% | Important decisions |
| 5x Haiku + 3x Sonnet + 1x Opus | $$$$ | 95%+ | Critical architecture |

**The insight:** 3x Sonnet often matches 1x Opus for detection, but Opus adds synthesis value that Sonnet can't provide.

---

## Integration with Peer Review Protocol

Update `proto-011` escalation ladder:

```
LEVEL 1: SINGLE REVIEWER
  Model: Sonnet
  Catches: ~70%

LEVEL 2: DUAL REVIEWER
  Model: 2x Sonnet
  Catches: ~80%

LEVEL 3: TRIBAL REVIEW
  Model: 3x Sonnet + 1x Opus (First Officer)
  Catches: ~90%

LEVEL 4: HUMAN + COUNCIL
  Model: 3x Sonnet + 1x Opus + Human
  Catches: ~95%+
```

---

## The First Officer is Always Opus

**Why:** Opus does metacognition. Sonnet does cognition.

```
COGNITION (Object Level)          METACOGNITION (Meta Level)
────────────────────────          ──────────────────────────
"What are the issues?"            "Are these findings valid?"
"Does this match?"                "Are reviewers converging?"
"How could this fail?"            "What's the pattern?"
                                  "Which reviewer is reliable?"

Sonnet can do this                Only Opus can do this
```

**The architectural principle:**
- Object-level work can degrade (Haiku, Sonnet)
- Metacognition requires Opus
- This isn't about "hard tasks" - it's about **level of abstraction**

**Exception:** Skip First Officer entirely for routine reviews (save cost).

---

## Implementation

### Spawn with Model Parameter
```
Task agent with model: "haiku" | "sonnet" | "opus"
```

### Parallel Haiku Swarm
```
Spawn 5-10 Haiku agents simultaneously
Each checks one aspect
Aggregate findings
Escalate anomalies to Sonnet
```

### Sequential Escalation
```
Haiku pass → Done (no issues)
Haiku flag → Sonnet review
Sonnet flag → Opus synthesis
Opus flag → Human decision
```

---

## The Promise

> **Intelligence should match the problem, not exceed it.**
>
> Haiku for checklists.
> Sonnet for judgment.
> Opus for synthesis.
>
> Spend capability where it matters.
> Degrade gracefully on routine work.
> The pyramid catches more than the peak alone.

---

## Related

- **axioms**
  - [[A2 Recognition of Life]] - recognize which model is "alive" for this task
    - shape:: "Can you recognize life? Death mimics life through ornament."
  - [[A3 Dynamic Pole Navigation]] - navigate between cheap/capable based on context
    - shape:: "Life is the oscillation; death is fixing at either pole."
- **protocols**
  - [[first-officer-protocol]] - always Opus (metacognition)
    - shape:: "Per-thread metacognition. Compress state, track gravity wells, flag drift."
  - [[fractal-tribe-architecture]] - model allocation at each level
    - shape:: "Same pattern at every level. Checkers → workers → tribes → supervisors."
  - [[peer-review-protocol]] - which models review what
    - shape:: "Workers check each other's output before it goes to user."
  - [[error-detection-layers]] - Haiku checkers before Sonnet workers
    - shape:: "Multiple tiers of error catching. Catch errors at lowest level possible."
- **enables**
  - [[autonomous-exploration-tribes]] - tribe worker model selection

---

*proto-012 | Model Allocation Strategy | Match Capability to Complexity*
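The task-to-model mapping and the sequential escalation ladder described in this protocol could be sketched in Python roughly as follows. This is a minimal illustration, not part of any real agent API: `Model`, `TASK_MODEL`, `pick_model`, `run_escalation`, and the `review` callback are all hypothetical names, the task labels are shortened from the mapping tables, and routing unknown tasks to Opus is an assumption based on the "novel, ambiguous" criterion.

```python
from enum import IntEnum
from typing import Callable


class Model(IntEnum):
    """Tiers in ascending capability, so iteration order is the escalation order."""
    HAIKU = 1
    SONNET = 2
    OPUS = 3


# Hypothetical task labels, shortened from the Task-to-Model Mapping tables.
TASK_MODEL = {
    "checklist_validation": Model.HAIKU,
    "format_verification": Model.HAIKU,
    "substantive_peer_review": Model.SONNET,
    "documentation_writing": Model.SONNET,
    "first_officer_synthesis": Model.OPUS,
    "adversarial_review": Model.OPUS,
}


def pick_model(task: str, high_stakes: bool = False) -> Model:
    """Pick the cheapest suitable tier for a task.

    Assumption: high-stakes or unrecognized (novel) tasks go straight to Opus.
    """
    if high_stakes:
        return Model.OPUS
    return TASK_MODEL.get(task, Model.OPUS)


def run_escalation(item: str, review: Callable[[Model, str], bool]) -> str:
    """Sequential escalation: Haiku, then Sonnet, then Opus, then a human.

    `review(model, item)` is a caller-supplied stand-in for spawning an agent;
    it returns True when that tier flags an issue. The function returns the
    name of the tier that cleared the item, or "human" if every tier flagged it.
    """
    for tier in Model:  # HAIKU, SONNET, OPUS in definition order
        if not review(tier, item):
            return tier.name.lower()
    return "human"
```

A caller would supply its own `review` function; for example, `run_escalation("doc.md", lambda model, item: model is Model.HAIKU)` simulates a Haiku flag that Sonnet then clears, so the item stops at the Sonnet tier.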