/ patterns / test-before-ship.md
test-before-ship.md
  1  # Test Before Ship
  2  
  3  *proto-029 | Operational pattern for validating artifacts before committing*
  4  
  5  ---
  6  
  7  - **principle**
  8    - "Run the artifact before committing. Verify it works, not just compiles."
  9    - "The commit is the claim. The test is the evidence."
 10  
 11  - **shape**
 12    - Build/write the artifact
 13    - Run it (not just syntax check)
 14    - Verify output matches expectations
 15    - Then commit with evidence of testing
 16  
 17  ---
 18  
 19  ## The Insight
 20  
 21  Compiling is not testing. Syntax-correct is not working.
 22  
 23  ```
 24  WEAK PATTERN:
 25  Write code → Commit → "It should work"
 26 27                Discover failures later
 28  
 29  TEST BEFORE SHIP:
 30  Write code → Run it → Verify output → Commit
 31 32                Evidence in commit history
 33  ```
 34  
 35  The cost of testing before commit: 30 seconds.
 36  The cost of shipping broken code: hours of debugging, lost trust.
 37  
 38  ---
 39  
 40  ## The Verification Ladder
 41  
 42  | Level | What You Verified | Confidence |
 43  |-------|-------------------|------------|
 44  | 0. Nothing | Just committed | 0% |
 45  | 1. Syntax | No parse errors | 10% |
 46  | 2. Imports | Dependencies resolve | 30% |
 47  | 3. --help | CLI at least starts | 50% |
 48  | 4. Basic run | Happy path works | 70% |
 49  | 5. Edge cases | Handles errors | 90% |
 50  | 6. Integration | Works with other components | 95% |
 51  
 52  **Minimum:** Level 4 (basic run) before any commit.
 53  
 54  ---
 55  
 56  ## Implementation
 57  
 58  ### For Scripts
 59  ```bash
 60  # Write the script
 61  vim scripts/my_script.py
 62  
 63  # Test it
 64  python3 scripts/my_script.py --help
 65  python3 scripts/my_script.py test-input.json
 66  
 67  # Verify output
 68  # ... check it looks right ...
 69  
 70  # Then commit
 71  git add scripts/my_script.py
 72  git commit -m "Add my_script with tested CLI"
 73  ```
 74  
 75  ### For Functions
 76  ```python
 77  # After writing the function, test immediately
 78  def my_function(x):
 79      return x * 2
 80  
 81  # Quick verification
 82  assert my_function(5) == 10
 83  assert my_function(0) == 0
 84  print("Tests pass")
 85  
 86  # Then commit
 87  ```
 88  
 89  ### For Infrastructure
 90  ```bash
 91  # After modifying infrastructure
 92  vim core/consciousness/first_officer.py
 93  
 94  # Run the tests
 95  python3 -m pytest tests/test_first_officer.py
 96  
 97  # Or run the component
 98  python3 -c "from core.consciousness.first_officer import FirstOfficer; print('Import OK')"
 99  
100  # Then commit
101  ```
102  
103  ---
104  
105  ## What "Run It" Means
106  
107  | Artifact Type | Minimum Test |
108  |--------------|--------------|
109  | CLI script | `script.py --help` + one real invocation |
110  | Library function | Import + basic call with known output |
111  | Config file | Load it + verify parsed correctly |
112  | Integration | Run dependent component |
113  | Documentation | Render it + verify links work |
114  
115  ---
116  
117  ## The Commit Message Evidence
118  
119  Good commit messages reference the testing:
120  
121  ```
122  Add speed run protocol with tribe architecture
123  
124  - Tested: python3 speed_run.py session.jsonl --json
125  - Verified: 252 segments parsed, 26 workers spawned
126  - Edge case: Empty transcript returns gracefully
127  
128  Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
129  ```
130  
131  This is evidence, not just description.
132  
133  ---
134  
135  ## Common Failures
136  
137  ### "I'll Test Later"
138  ```
139  Commit without testing
140141  Context switch to new task
142143  Bug discovered days later
144145  Forgot how it was supposed to work
146  ```
147  
148  ### "It Worked Last Time"
149  ```
150  Small change, didn't test
151152  Small change broke something
153154  Not caught until integration
155  ```
156  
157  ### "Tests Pass" (But Not Run)
158  ```
159  Tests exist
160161  Didn't run them before commit
162163  Tests would have caught the bug
164  ```
165  
166  ---
167  
168  ## Axiom Alignment
169  
170  | Axiom | Alignment |
171  |-------|-----------|
172  | **A2 (Life)** | Working code (tested) beats static code (untested) |
173  | **A4 (Ergodicity)** | Shipping untested code risks compounding failures |
174  | **A0 (Boundary)** | Test defines boundary between "works" and "might work" |
175  
176  ---
177  
178  ## Instances
179  
180  ### Positive Instance: Speed Run CLI
181  - **Context:** Built 2000+ line speed_run.py
182  - **Test:** `python3 speed_run.py --help` then real transcript
183  - **Verification:** Saw 252 segments, 26 workers, output formatted
184  - **Then:** Committed
185  - **Outcome:** ✓ Known working at commit time
186  
187  ### Positive Instance: Fixer Spawning
188  - **Context:** Added --spawn-fixers flag
189  - **Test:** Ran with flag, verified files created
190  - **Verification:** Checked manifest.json, prompt files exist
191  - **Then:** Committed
192  - **Outcome:** ✓ Feature verified before ship
193  
194  ### Negative Instance: Untested Edit
195  - **Context:** Quick edit to existing function
196  - **Skip:** Didn't run tests
197  - **Result:** Typo broke import
198  - **Caught:** Next session, delayed work
199  - **Outcome:** ✗ 30-second test would have caught it
200  
201  ---
202  
203  ## The Rule
204  
205  > **Never commit code you haven't run.**
206  >
207  > Syntax highlighting is not testing.
208  > "Should work" is not working.
209  > Run it. See output. Then commit.
210  
211  ---
212  
213  ## Related
214  
215  - [[validation-loop]] - Testing after fixes specifically
216  - [[infrastructure-over-suggestion]] - Build tests, not just docs
217  - [[build-test-ground-protocol]] - Testing patterns for complex artifacts
218  
219  ---
220  
221  *proto-029 | Test Before Ship | 2026-01-15*