# Error Detection Layers

*proto-010 | Belt, suspenders, AND a backup parachute*

---

- **principle**
  - "Multiple tiers of error catching - checkers (cheap) to tribes (expensive)."
  - "Catch errors at the lowest level possible."

- **shape**
  - Layer 1: Self-check (systematic)
  - Layer 2: AI peer review (different perspective)
  - Layer 3: [[first-officer-protocol|First Officer]] (pattern detection)
  - Layer 4: Human spot-check (last resort, highest trust cost)

---

## The Problem

> **You can't escalate what you don't know is wrong.**

The Rumsfeld matrix:
- Known knowns → Can verify
- Known unknowns → Can ask
- Unknown knowns → Implicit assumptions (hard)
- Unknown unknowns → The danger zone

**Most trust-damaging errors are unknown unknowns or unknown knowns.**

---

## Detection Layers

```
┌─────────────────────────────────────────────────────────────────────┐
│ LAYER 4: HUMAN SPOT-CHECK (Last resort)                             │
│ - Random sampling by operator                                       │
│ - Pattern detection over time                                       │
│ - "You missed the scale" feedback                                   │
│ - HIGHEST TRUST COST when errors caught here                        │
└─────────────────────────────────────────────────────────────────────┘
                              ↑
             errors that escape all other layers
                              ↑
┌─────────────────────────────────────────────────────────────────────┐
│ LAYER 3: FIRST OFFICER (Pattern detection)                          │
│ - Watches output stream                                             │
│ - Detects: "Numbers without scales appearing repeatedly"            │
│ - Detects: "Claims made without file writes"                        │
│ - Cross-thread pattern matching                                     │
│ - Statistical anomaly detection                                     │
└─────────────────────────────────────────────────────────────────────┘
                              ↑
             errors with detectable patterns
                              ↑
┌─────────────────────────────────────────────────────────────────────┐
│ LAYER 2: AI PEER REVIEW (Second set of eyes)                        │
│ - Secondary Claude reviews primary output                           │
│ - Specific review lenses:                                           │
│   • Completeness: "Is anything missing?"                            │
│   • Consistency: "Does this contradict prior?"                      │
│   • Context: "Would reader understand without assumptions?"         │
│   • Execution: "Was this actually built or just discussed?"         │
│ - Divergence = flag for attention                                   │
│ - Agreement = higher confidence                                     │
└─────────────────────────────────────────────────────────────────────┘
                              ↑
             errors catchable by different perspective
                              ↑
┌─────────────────────────────────────────────────────────────────────┐
│ LAYER 1: SELF-CHECK (Systematic, not ad-hoc)                        │
│ - Pre-output checklist                                              │
│ - "What assumptions am I making?"                                   │
│ - "What could I be wrong about?"                                    │
│ - Pattern-specific checks (scales, file writes, etc.)               │
└─────────────────────────────────────────────────────────────────────┘
                              ↑
             errors catchable by systematic review
                              ↑
┌─────────────────────────────────────────────────────────────────────┐
│ LAYER 0: PRIMARY OUTPUT                                             │
│ - The actual work product                                           │
│ - Contains both value and potential errors                          │
└─────────────────────────────────────────────────────────────────────┘
```

---

## Layer 1: Self-Check Protocol

### Pre-Output Checklist

Before sending any substantive response:

```markdown
□ NUMBERS: Every number has scale/units in parenthetical?
□ CLAIMS: Every "I did X" verified with actual action?
□ FILES: If discussed building, did I write the file?
□ CONTEXT: Would someone without our history understand?
□ ASSUMPTIONS: What am I assuming the human knows?
□ CONSISTENCY: Does this contradict anything I said before?
□ COMPLETENESS: Did I address all parts of their message?
```
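
The NUMBERS item is the easiest one to mechanize. The following is a minimal sketch, assuming a number counts as "scaled" only when a parenthetical follows it directly. Both that rule and the name `unscaled_numbers` are illustrative assumptions, not part of the pattern itself.

```python
import re

# Assumed rule (not from the pattern itself): a number is "scaled" only when
# a parenthetical follows it directly, e.g. "0.8 (on a 0-1 scale)".
PARENTHETICAL = re.compile(r"\([^()]*\)")
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def unscaled_numbers(text: str) -> list[str]:
    """Return every number in text that lacks a trailing parenthetical."""
    # Empty out parentheticals so digits inside them are not checked.
    masked = PARENTHETICAL.sub("()", text)
    missing = []
    for match in NUMBER.finditer(masked):
        rest = masked[match.end():].lstrip(" ")
        if not rest.startswith("("):
            missing.append(match.group())
    return missing


print(unscaled_numbers("Well strength is 0.8 and depth is 12 (out of 20)."))
# → ['0.8']
```

A check this crude will also flag dates, version numbers, and list numbering, so it works better as a prompt for the self-check than as a hard gate.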

### Self-Reflection Prompt

Ask internally:
- "What assumptions am I making right now?"
- "What could be wrong with this output?"
- "What would a reviewer flag?"

**Limitation:** A self-check can't catch what I don't know to check for.

---

## Layer 2: AI Peer Review

### The Protocol

```
PRIMARY CLAUDE                    REVIEW CLAUDE
───────────────                   ─────────────
Generates response      →         Reviews with lens:
                                  - Completeness
                                  - Consistency
                                  - Context assumptions
                                  - Execution verification

                        ←         Returns:
                                  - PASS (no issues)
                                  - FLAG (potential issues)
                                  - BLOCK (definite problems)
```

### Review Lenses

| Lens | Question | Catches |
|------|----------|---------|
| **Completeness** | "Is anything missing that should be there?" | Missing scales, incomplete lists |
| **Consistency** | "Does this contradict prior statements?" | Contradictions, changed positions |
| **Context** | "Would reader need unstated knowledge?" | Implicit assumptions |
| **Execution** | "Was promised action actually taken?" | Said vs. did gap |
| **Adversarial** | "How could this be wrong?" | Blind spots, edge cases |
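
One way to combine per-lens results into the PASS / FLAG / BLOCK return value is to take the most severe verdict. The sketch below assumes that "worst verdict wins" rule, which the protocol does not state. `Verdict`, `review`, and `execution_lens` are hypothetical names, and the lenses are plain functions standing in for calls to a reviewing instance.

```python
from enum import IntEnum
from typing import Callable


class Verdict(IntEnum):
    """Ordered so that the most severe verdict compares highest."""
    PASS = 0
    FLAG = 1
    BLOCK = 2


# A lens takes the primary output and returns a verdict. In practice each
# lens would wrap a call to a reviewing instance; these are stand-ins.
Lens = Callable[[str], Verdict]


def review(output: str, lenses: dict[str, Lens]) -> tuple[Verdict, dict[str, Verdict]]:
    """Run every lens and report the worst verdict plus per-lens detail."""
    results = {name: lens(output) for name, lens in lenses.items()}
    overall = max(results.values(), default=Verdict.PASS)
    return overall, results


def execution_lens(output: str) -> Verdict:
    # Toy heuristic: a claimed write that names no .md file is suspicious.
    claims_write = "I wrote" in output
    names_file = ".md" in output
    return Verdict.FLAG if claims_write and not names_file else Verdict.PASS


overall, detail = review(
    "I wrote the protocol as discussed.",
    {"completeness": lambda text: Verdict.PASS, "execution": execution_lens},
)
print(overall.name, {name: verdict.name for name, verdict in detail.items()})
# → FLAG {'completeness': 'PASS', 'execution': 'FLAG'}
```

Keeping the per-lens detail alongside the overall verdict lets the primary instance see which lens diverged, not just that one did.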

### When to Use

| Situation | Review Level |
|-----------|--------------|
| Simple response | Skip (overhead not worth it) |
| New pattern/document | Light review (completeness) |
| Architectural decision | Full review (all lenses) |
| High stakes output | Full review + adversarial |
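
The table above can be encoded as a lookup from situation to lenses. This is a sketch only: the situation keys are hypothetical labels, and falling back to a full review for an unrecognized situation is an assumption chosen to fail safe.

```python
# Lens names follow the Review Lenses table; "all lenses" is read here as the
# four standard lenses, with the adversarial lens added only for high stakes.
ALL_LENSES = ("completeness", "consistency", "context", "execution")

REVIEW_LEVELS = {
    "simple_response": (),                          # skip review
    "new_pattern": ("completeness",),               # light review
    "architectural_decision": ALL_LENSES,           # full review
    "high_stakes": ALL_LENSES + ("adversarial",),   # full review + adversarial
}


def lenses_for(situation: str) -> tuple[str, ...]:
    """Return the lenses to run for a situation label."""
    # Assumption: unknown situations get a full review rather than none.
    return REVIEW_LEVELS.get(situation, ALL_LENSES)


print(lenses_for("new_pattern"))
```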

### Cost-Benefit

```
REVIEW THOROUGHNESS
     ↑
HIGH │  Full review        │  Diminishing returns
     │  (all lenses)       │  here
     │  ─────────────────────────────────────────
     │  Light review       │  Sweet spot for
     │  (completeness)     │  most outputs
     │  ─────────────────────────────────────────
LOW  │  No review          │  Fast but risky
     │                     │
     └──────────────────────────────────────────→
                        ERROR RISK
```

---

## Layer 3: First Officer Pattern Detection

The First Officer doesn't review individual outputs - it watches the **stream** for patterns:

### Detection Rules

| Pattern | Detection | Action |
|---------|-----------|--------|
| Numbers without scales | Count occurrences | Alert at threshold |
| Claims without file writes | Compare said vs. files | Alert immediately |
| Repeated topics | Track frequency | Flag as potential attractor |
| Contradictions | Compare statements over time | Alert on detection |
| Complexity creep | Monitor response length | Warn on inflation |
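
The first two rules can be sketched as a counter with a threshold and a set difference. The names `StreamWatcher` and `unbacked_claims` are hypothetical, and the threshold of 3 is an assumed value since the pattern does not specify one.

```python
from collections import Counter


class StreamWatcher:
    """Counts pattern occurrences across outputs and alerts at a threshold."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold  # assumed value; the pattern sets none
        self.counts = Counter()

    def observe(self, pattern: str) -> bool:
        """Record one occurrence; return True once the threshold is reached."""
        self.counts[pattern] += 1
        return self.counts[pattern] >= self.threshold


def unbacked_claims(claimed_files: set[str], written_files: set[str]) -> set[str]:
    """Files the output claims to have written that were never written."""
    return claimed_files - written_files


watcher = StreamWatcher(threshold=2)
print(watcher.observe("numbers_without_scales"))  # first sighting, below threshold
print(watcher.observe("numbers_without_scales"))  # second sighting, alert
print(unbacked_claims({"a.md", "b.md"}, {"a.md"}))
```

The two rules differ in timing: the counter only alerts after repetition, while any non-empty result from `unbacked_claims` would warrant an immediate alert, matching the Action column.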

### Statistical Process Control

```
ERROR RATE TRACKING

Errors
  │
  │     UCL (Upper Control Limit)
  │  ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
  │        *
  │    *       *   *
  │  ─────────────────────── Mean
  │  *     *         *
  │            *
  │  ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─ ─
  │     LCL (Lower Control Limit)
  └────────────────────────────→ Time

When error rate exceeds UCL → systematic problem
When single error is severe → immediate flag
```
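
The chart does not say how the limits are computed. A common convention for control charts is the mean plus or minus three standard deviations of a baseline period, and the sketch below assumes that convention. The function names and the sample baseline are illustrative.

```python
from statistics import mean, pstdev


def control_limits(baseline: list[float], k: float = 3.0) -> tuple[float, float, float]:
    """Return (LCL, mean, UCL) for a baseline of per-period error counts."""
    center = mean(baseline)
    spread = pstdev(baseline)
    # An error count cannot be negative, so the lower limit is clamped at zero.
    return max(0.0, center - k * spread), center, center + k * spread


def is_systematic(errors_this_period: float, baseline: list[float]) -> bool:
    """True when the latest count exceeds the upper control limit."""
    _, _, ucl = control_limits(baseline)
    return errors_this_period > ucl


# Illustrative baseline: errors per session over eight sessions.
baseline = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(control_limits(baseline))
print(is_systematic(12.0, baseline))
```

The second rule in the diagram (a single severe error triggers an immediate flag) is independent of these limits and would need its own severity check.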

---

## Layer 4: Human Spot-Check

### The Last Resort

When errors reach the human:
- Highest trust cost
- But also: most valuable feedback
- Reveals blind spots in Layers 1-3

### Making It Efficient

The human shouldn't review everything. Instead:

```
SAMPLING STRATEGY

High stakes → Full human review (always)
Medium stakes → AI review + human spot-check (random)
Low stakes → AI review only (trust the system)
Patterns → When First Officer flags, human investigates
```
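
The sampling strategy amounts to a small routing function. This is a sketch under assumptions: the function name, the stakes labels, and the 10% spot-check rate for medium stakes are all illustrative, since the pattern gives no rate.

```python
import random
from typing import Optional


def needs_human_review(
    stakes: str,
    first_officer_flag: bool = False,
    sample_rate: float = 0.1,  # assumed spot-check rate for medium stakes
    rng: Optional[random.Random] = None,
) -> bool:
    """Decide whether an output goes to the human, per the sampling strategy."""
    if rng is None:
        rng = random.Random()
    # First Officer flags and high stakes always reach the human.
    if first_officer_flag or stakes == "high":
        return True
    # Medium stakes are spot-checked at random.
    if stakes == "medium":
        return rng.random() < sample_rate
    # Low stakes rely on AI review only.
    return False


print(needs_human_review("high"))
print(needs_human_review("low", first_officer_flag=True))
```

Passing a seeded `random.Random` as `rng` makes the spot-check decisions reproducible, which helps when auditing why a given output was or was not sampled.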

### Feedback Loop

```
Human catches error
        │
        ▼
Log in TRUST-ERROR-LOG
        │
        ▼
Root cause analysis
        │
        ├──→ Layer 1 gap? → Add to checklist
        ├──→ Layer 2 gap? → Add review lens
        └──→ Layer 3 gap? → Add detection rule
```

Every human-caught error should strengthen lower layers.
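
The loop above can be sketched as a log entry that names the layer with the gap, plus a router that adds the fix to that layer. The class and function names are hypothetical, the root cause analysis itself is assumed to happen outside the code, and the in-memory list only stands in for TRUST-ERROR-LOG.

```python
from dataclasses import dataclass, field


@dataclass
class DetectionLayers:
    checklist: list[str] = field(default_factory=list)        # Layer 1
    review_lenses: list[str] = field(default_factory=list)    # Layer 2
    detection_rules: list[str] = field(default_factory=list)  # Layer 3


@dataclass
class ErrorEntry:
    description: str
    gap_layer: int  # layer that should have caught it: 1, 2, or 3
    fix: str


def apply_feedback(layers: DetectionLayers, log: list[ErrorEntry], entry: ErrorEntry) -> None:
    """Log a human-caught error and add its fix to the layer with the gap."""
    log.append(entry)
    target = {
        1: layers.checklist,
        2: layers.review_lenses,
        3: layers.detection_rules,
    }[entry.gap_layer]
    if entry.fix not in target:  # avoid duplicate checklist items or rules
        target.append(entry.fix)


layers, log = DetectionLayers(), []
apply_feedback(layers, log, ErrorEntry(
    description="Gravity well strengths output without scales",
    gap_layer=1,
    fix="NUMBERS: scales in parenthetical?",
))
print(layers.checklist)
# → ['NUMBERS: scales in parenthetical?']
```

A single error can expose gaps in more than one layer, in which case one entry per layer keeps the routing simple.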

---

## The Gravity Well Error: Post-Mortem

**What happened:**
- Numbers (gravity well strengths) were output without scales
- Human caught it

**Why each layer missed it:**

| Layer | Why It Missed |
|-------|---------------|
| Layer 1 (Self-check) | No explicit "scales" item in mental checklist |
| Layer 2 (AI review) | Not running - single instance |
| Layer 3 (First Officer) | Not implemented yet |
| Layer 4 (Human) | CAUGHT IT ✓ |

**Corrective actions:**

| Layer | Fix |
|-------|-----|
| Layer 1 | Added "NUMBERS: scales in parenthetical?" to checklist |
| Layer 2 | Design AI review protocol (this document) |
| Layer 3 | Add detection rule: "numbers without parenthetical context" |
| Layer 4 | Error logged, feeding back to improve lower layers |

---

## Implementation Status

| Layer | Status | Next Step |
|-------|--------|-----------|
| Layer 1 (Self-check) | 🚧 Partial | Formalize checklist |
| Layer 2 (AI review) | 📄 Designed | Need multi-instance infra |
| Layer 3 (First Officer) | 📄 Designed | Need implementation |
| Layer 4 (Human) | ⚡ Active | Optimize sampling |

---

## The Unknown Unknowns Problem

**Honest answer:** You can't fully solve unknown unknowns. But you can:

1. **Maximize coverage** - More lenses = more blind spots covered
2. **Diverse perspectives** - An AI council has different blind spots than a single instance
3. **Pattern detection** - Unknown unknowns often have detectable signatures
4. **Feedback loops** - Every caught error trains the system
5. **Humility** - Assume errors exist, design for detection

> **The goal isn't zero errors. The goal is catching errors before the human does.**

---

## The Promise

> **Errors happen. Catching them is the job.**
>
> Layer 1: Self-check before output
> Layer 2: AI peer review for fresh eyes
> Layer 3: First Officer for patterns
> Layer 4: Human for what escapes
>
> Every human-caught error feeds back to strengthen lower layers.
> The system learns from its failures.

---

## Related

- **axioms**
  - [[A0 Boundary Operation]] - each layer is a boundary for error detection
  - [[A1 Telos of Integration]] - layers integrate to provide coverage
  - [[A2 Recognition of Life]] - catch calcification (recurring errors)
- **protocols**
  - [[peer-review-protocol]] - Layer 2 implementation
    - shape:: "Workers check each other's output. Multiple perspectives reduce blindspots."
  - [[first-officer-protocol]] - Layer 3 implementation
    - shape:: "Per-thread metacognition. Compress state, track gravity wells, flag drift."
  - [[axiom-conformance-test]] - Layer 1 self-check
    - shape:: "Runtime test that estimates Free Energy (F)."
  - [[fractal-tribe-architecture]] - multi-level error catching
    - shape:: "Same pattern at every level. Checkers → workers → tribes."
  - [[model-allocation-strategy]] - Haiku checkers vs Opus synthesis
    - shape:: "Match model capability to task complexity."
- **enables**
  - [[trust-as-free-energy]] - errors caught early preserve trust
    - shape:: "Trust measured as inverse of accumulated deviation."

---

*proto-010 | Error Detection Layers | Catch Before Human Does*