# CSPEC: Claude CI Integration for Alpha/Delta Protocol

## Implementation Plan Document

**CSPEC ID:** CSPEC-2026-001  
**Feature:** Claude API Integration for CI Pipeline  
**Status:** PLANNED  
**Priority:** P2  
**Estimated Effort:** 2-3 sessions  
**Created:** January 2026  
**Target Implementation:** Q1 2026

---

## 1. Executive Summary

Implement an AI-powered code review and validation system that integrates Claude (via Anthropic API) into the Alpha/Delta Protocol CI pipeline. The system will maintain full project context from the `alpha-delta-context` repository to provide informed, protocol-aware analysis of all code changes.

### 1.1 Business Value

- **Automated Architectural Enforcement**: Ensure all changes align with Technical Specification 3.0
- **Security-First Review**: Protocol-aware security analysis for privacy-preserving and cross-chain operations
- **Documentation Consistency**: Automatic detection of spec/code drift
- **Reduced Review Burden**: AI handles routine checks, humans focus on design decisions

### 1.2 Success Criteria

- [ ] All PRs receive automated Claude review within 5 minutes
- [ ] Security findings have <5% false positive rate
- [ ] Architecture violations are caught before human review
- [ ] Integration adds <$100/month to CI costs

---

## 2. Technical Context

### 2.1 Current State

```
┌─────────────────────────────────────────────────────────┐
│                 Current CI Pipeline                      │
├─────────────────────────────────────────────────────────┤
│  PR Opened → Lint → Test → Build → Human Review → Merge │
│                                         ▲                │
│                                         │                │
│                              (Manual, time-consuming)    │
└─────────────────────────────────────────────────────────┘
```

### 2.2 Target State

```
┌──────────────────────────────────────────────────────────────────────┐
│                      Enhanced CI Pipeline                             │
├──────────────────────────────────────────────────────────────────────┤
│                                                                       │
│  PR Opened ─┬─► Lint ──────────────────────────┬─► Human Review      │
│             ├─► Test ──────────────────────────┤   (informed by      │
│             ├─► Build ─────────────────────────┤    Claude analysis) │
│             │                                   │         │           │
│             └─► Claude Review ─┬─► PR Comment ─┘         ▼           │
│                                ├─► Security Gate         Merge       │
│                                └─► Arch Validation                   │
│                                         ▲                             │
│                                         │                             │
│                    ┌────────────────────┴───────────────────┐        │
│                    │     alpha-delta-context Repository     │        │
│                    │  (Full project context: specs, ADRs,   │        │
│                    │   governance, security requirements)   │        │
│                    └────────────────────────────────────────┘        │
└──────────────────────────────────────────────────────────────────────┘
```

### 2.3 Dependencies

| Dependency | Version | Purpose |
|------------|---------|---------|
| Anthropic API | 2024-10+ | Claude access |
| Python | 3.10+ | Integration script runtime |
| Forgejo Actions | Latest | CI orchestration |
| alpha-delta-context | main | Project context source |

### 2.4 Related Systems

- **Forgejo Instance**: `ci.yourdomain.com`
- **Runner Server**: 32-core build server (native execution)
- **DO Spaces**: S3-compatible storage for artifacts
- **Radicle**: P2P backup (sync workflow exists)

---

## 3. Implementation Phases

### Phase 1: Core Infrastructure (Session 1)

**Objective:** Establish the foundational integration components.

#### 3.1.1 Tasks

```
□ Create directory structure
  □ /opt/ci/tools/claude-ci/
  □ alpha-delta-context/tools/

□ Implement ContextLoader class
  □ Pattern-based file discovery
  □ Content categorization (specs, arch, governance, security, api, decisions)
  □ Token estimation and truncation handling
  □ Caching mechanism for repeated calls

□ Implement CIConfig dataclass
  □ Environment variable loading
  □ Sensible defaults
  □ Validation of required fields

□ Create basic CLI structure
  □ argparse setup
  □ Subcommand routing
  □ Error handling framework
```
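As a rough illustration of the `CIConfig` task above (environment variable loading, defaults, validation), a minimal sketch might look like the following. The field names, the `from_env` and `validate` helpers, and the `MAX_CONTEXT_TOKENS` variable name are assumptions made for this example; the defaults are borrowed from the values shown in Sections 3.1.3 and 3.2.3.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class CIConfig:
    """Settings for the Claude CI integration (hypothetical field names)."""

    anthropic_api_key: str = ""
    forgejo_token: str = ""
    context_repo_path: str = "/opt/ci/workspaces/alpha-delta-context"
    max_tokens: int = 8192
    max_context_tokens: int = 150_000

    @classmethod
    def from_env(cls, env: Optional[Mapping[str, str]] = None) -> "CIConfig":
        """Build a config from environment variables, falling back to defaults."""
        env = os.environ if env is None else env
        defaults = cls()
        return cls(
            anthropic_api_key=env.get("ANTHROPIC_API_KEY", ""),
            forgejo_token=env.get("FORGEJO_TOKEN", ""),
            context_repo_path=env.get("CONTEXT_REPO_PATH", defaults.context_repo_path),
            # MAX_CONTEXT_TOKENS is an assumed variable name.
            max_context_tokens=int(env.get("MAX_CONTEXT_TOKENS", defaults.max_context_tokens)),
        )

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the config is usable."""
        problems = []
        if not self.anthropic_api_key:
            problems.append("ANTHROPIC_API_KEY is not set")
        if self.max_context_tokens <= 0:
            problems.append("max_context_tokens must be positive")
        return problems
```

Passing a plain dict to `from_env` keeps the class easy to test without touching the real environment, and returning a list from `validate` lets the CLI report every problem at once instead of stopping at the first.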

#### 3.1.2 Files to Create

| File | Location | Purpose |
|------|----------|---------|
| `claude_ci.py` | `alpha-delta-context/tools/` | Main integration script |
| `requirements.txt` | `alpha-delta-context/tools/` | Python dependencies |
| `config.example.yaml` | `alpha-delta-context/tools/` | Example configuration |

#### 3.1.3 Verification

```bash
# Test context loading
python3 claude_ci.py context-info

# Expected output:
# 📋 Context Repository Information
#    Path: /opt/ci/workspaces/alpha-delta-context
#    Loaded 24 context files (~45,000 tokens)
#    Files loaded:
#    - specifications/technical-spec-v3.md (52,340 chars)
#    - architecture/overview.md (8,230 chars)
#    ...
```
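The discovery, categorization, and token-estimation steps behind a `context-info` report could be sketched as below. This is not the planned `ContextLoader` itself: the function names and the directory-to-category mapping are assumptions (only `specifications/`, `architecture/`, `governance/`, and `security/` appear as paths in this document), and the four-characters-per-token heuristic mirrors the rough estimate used in the unit tests of Section 5.1.

```python
from pathlib import Path
from typing import Dict

# Top-level directory -> category; some directory names are assumed.
CATEGORY_BY_DIR = {
    "specifications": "specs",
    "architecture": "arch",
    "governance": "governance",
    "security": "security",
    "api": "api",
    "decisions": "decisions",
}


def estimate_tokens(text: str) -> int:
    """Rough heuristic: about four characters per token."""
    return len(text) // 4


def load_context_files(root: Path, max_tokens: int) -> Dict[str, dict]:
    """Discover Markdown files under `root`, categorize them by top-level
    directory, and skip any file that would push the estimate over budget."""
    loaded: Dict[str, dict] = {}
    used = 0
    for path in sorted(root.rglob("*.md")):
        rel = path.relative_to(root)
        category = CATEGORY_BY_DIR.get(rel.parts[0])
        if category is None:
            continue  # not inside a recognized context directory
        text = path.read_text(encoding="utf-8")
        tokens = estimate_tokens(text)
        if used + tokens > max_tokens:
            continue  # does not fit in the remaining budget
        used += tokens
        loaded[rel.as_posix()] = {"category": category, "chars": len(text), "tokens": tokens}
    return loaded
```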

---

### Phase 2: Claude API Integration (Session 1-2)

**Objective:** Implement the core API client with review capabilities.

#### 3.2.1 Tasks

```
□ Implement ClaudeCIClient class
  □ Anthropic client initialization
  □ Context prompt builder (lazy-loaded)
  □ System prompt templates per task type

□ Implement review_pull_request() method
  □ Structured prompt for PR review
  □ JSON response parsing with fallback
  □ Recommendation extraction (APPROVE/REQUEST_CHANGES/COMMENT)

□ Implement validate_architecture() method
  □ Git commit introspection
  □ Spec reference extraction
  □ Violation severity classification

□ Implement security_review() method
  □ Focus area configuration
  □ CWE ID mapping where applicable
  □ Merge-blocking logic for critical findings

□ Implement sync_documentation() method
  □ Code-to-docs drift detection
  □ Auto-generated update suggestions
  □ Changelog entry generation
```
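The "JSON response parsing with fallback" and "Recommendation extraction" tasks above might be approached along these lines. This is a sketch under assumptions: the `recommendation`, `raw_output`, and `parse_error` keys and the function name are hypothetical, and the final response schema is still to be defined during implementation.

```python
import json
import re

VALID_RECOMMENDATIONS = {"APPROVE", "REQUEST_CHANGES", "COMMENT"}
FENCE = "`" * 3  # Markdown code fence marker


def parse_review_response(raw: str) -> dict:
    """Parse a model reply into a dict, degrading gracefully on bad JSON."""
    candidates = [raw]
    # Replies sometimes wrap JSON in a Markdown fence; try its contents too.
    fenced = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, raw, re.DOTALL)
    if fenced:
        candidates.append(fenced.group(1))
    # Last resort: the outermost {...} span in the text.
    start, end = raw.find("{"), raw.rfind("}")
    if 0 <= start < end:
        candidates.append(raw[start:end + 1])

    for text in candidates:
        try:
            data = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict):
            rec = str(data.get("recommendation", "")).upper()
            # Unknown or missing recommendations are downgraded to COMMENT.
            data["recommendation"] = rec if rec in VALID_RECOMMENDATIONS else "COMMENT"
            return data

    # Fallback: keep the raw text so a human can still read the review.
    return {"recommendation": "COMMENT", "raw_output": raw, "parse_error": True}
```

Defaulting to `COMMENT` on any parse failure keeps a malformed reply from blocking or approving a merge by accident, which matches the "fallback to raw output, manual review" mitigation listed in Section 7.2.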

#### 3.2.2 Prompt Engineering Notes

**System Prompt Structure:**
```
1. Role definition (CI assistant with protocol expertise)
2. Capability enumeration (Rust, blockchain, ZK, security)
3. Task-specific instructions
4. Full context injection (specs, arch, governance)
5. Output format specification (JSON schema)
```
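The five-part structure above could be assembled by a small helper such as the one below. The wording of the role line, the task instruction strings, and the function name are placeholders invented for this sketch; the real templates would be tuned during the shadow-mode rollout.

```python
ROLE = "You are a CI assistant with expertise in the Alpha/Delta Protocol."
CAPABILITIES = ["Rust", "blockchain", "zero-knowledge proofs", "security"]

# Hypothetical task instructions, one entry per review type.
TASK_INSTRUCTIONS = {
    "pr_review": "Review the pull request diff for correctness and spec alignment.",
    "security": "Look for vulnerabilities and cite CWE IDs where applicable.",
}


def build_system_prompt(task: str, context: str, output_schema: str) -> str:
    """Assemble the five sections in the order listed above."""
    sections = [
        ROLE,                                                      # 1. role definition
        "Capabilities: " + ", ".join(CAPABILITIES),                # 2. capabilities
        TASK_INSTRUCTIONS[task],                                   # 3. task instructions
        "# Project Context\n" + context,                           # 4. context injection
        "Respond with JSON matching this schema:\n" + output_schema,  # 5. output format
    ]
    return "\n\n".join(sections)
```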

**Critical Context Files to Always Include:**
- `specifications/technical-spec-v3.md`: primary source of truth
- `governance/rules.md`: Central Bank authority, validator requirements
- `security/requirements.md`: privacy and cryptographic requirements
- `architecture/cross-chain.md`: ALPHA/DELTA bridge specifications

#### 3.2.3 API Configuration

```python
# Recommended settings
MODEL_PR_REVIEW = "claude-sonnet-4-20250514"      # Fast, cost-effective
MODEL_SECURITY = "claude-sonnet-4-20250514"       # Good security analysis
MODEL_ARCHITECTURE = "claude-opus-4-20250514"     # Deep reasoning for arch
MAX_TOKENS = 8192                                 # Sufficient for detailed reviews
MAX_CONTEXT_TOKENS = 150000                       # Leave room for response
```

#### 3.2.4 Verification

```bash
# Test PR review (requires API key)
export ANTHROPIC_API_KEY="sk-..."
python3 claude_ci.py review --pr 1

# Test security review
git diff HEAD~1 > /tmp/test.diff
python3 claude_ci.py security-review --diff /tmp/test.diff

# Test architecture validation
python3 claude_ci.py validate-arch --commit $(git rev-parse HEAD)
```

---

### Phase 3: Forgejo Integration (Session 2)

**Objective:** Create CI workflows and Forgejo API integration.

#### 3.3.1 Tasks

```
□ Implement Forgejo API helpers
  □ get_pr_diff(): fetch PR diff via API
  □ get_pr_files(): list changed files
  □ post_review_comment(): post review as PR comment
  □ set_commit_status(): update CI status checks

□ Create primary workflow (claude-review.yml)
  □ Trigger configuration (PR events)
  □ Context repository setup job
  □ Parallel review jobs (PR, security, arch, docs)
  □ Summary job with GitHub Actions step summary

□ Create workflow secrets documentation
  □ ANTHROPIC_API_KEY
  □ FORGEJO_TOKEN

□ Implement CI status integration
  □ Map review outcomes to CI pass/fail
  □ Security gate for critical findings
  □ Architecture gate for spec violations
```
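To make the Forgejo API helpers above more concrete, here is a sketch that only builds the HTTP requests, so it can be inspected without a live server. The endpoint paths and the `Authorization: token ...` header reflect my understanding of the Gitea-compatible v1 API that Forgejo exposes and should be checked against the instance's own API documentation. The function names, the `claude-review` status context, and the use of the placeholder host from Section 2.4 are assumptions for this example.

```python
import json
import urllib.request
from typing import Optional

API_ROOT = "https://ci.yourdomain.com/api/v1"  # placeholder host from Section 2.4


def _request(method: str, path: str, token: str,
             payload: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) an authenticated Forgejo API request."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    headers = {"Authorization": f"token {token}", "Content-Type": "application/json"}
    return urllib.request.Request(API_ROOT + path, data=data, headers=headers, method=method)


def pr_diff_request(owner: str, repo: str, pr: int, token: str) -> urllib.request.Request:
    """Request for the unified diff of a pull request."""
    return _request("GET", f"/repos/{owner}/{repo}/pulls/{pr}.diff", token)


def review_comment_request(owner: str, repo: str, pr: int, body: str,
                           token: str) -> urllib.request.Request:
    """Request that posts `body` as a comment (PRs share the issue comment endpoint)."""
    return _request("POST", f"/repos/{owner}/{repo}/issues/{pr}/comments", token, {"body": body})


def commit_status_request(owner: str, repo: str, sha: str, state: str,
                          description: str, token: str) -> urllib.request.Request:
    """Request that sets a commit status under an assumed context name."""
    payload = {"state": state, "context": "claude-review", "description": description}
    return _request("POST", f"/repos/{owner}/{repo}/statuses/{sha}", token, payload)
```

A request built this way can be sent with `urllib.request.urlopen(req, timeout=30)`. Separating construction from sending keeps the helpers unit-testable without network access.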

#### 3.3.2 Workflow Structure

```yaml
# .forgejo/workflows/claude-review.yml
# Structure only: each job's steps are omitted here.

on:
  pull_request:
    types: [opened, synchronize, reopened]

jobs:
  setup-context:        # Clone/update context repo
    runs-on: native

  claude-review:        # Full PR review
    needs: setup-context
    runs-on: native

  security-review:      # Security-focused analysis
    needs: setup-context
    runs-on: native

  architecture-validation:  # Spec compliance
    needs: setup-context
    runs-on: native

  docs-sync:            # Documentation drift check
    needs: setup-context
    runs-on: native

  review-summary:       # Aggregate results
    needs: [claude-review, security-review, architecture-validation, docs-sync]
    runs-on: native
```
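The "map review outcomes to CI pass/fail" task from Section 3.3.1 could reduce to a small pure function that each review job calls before setting its commit status. The sketch below is one possible shape: the `severity` key, the `enforce` flag (off during the shadow and advisory rollout modes, on in enforcement mode), and the description strings are all assumptions.

```python
from typing import List, Tuple


def gate_status(recommendation: str, findings: List[dict], enforce: bool) -> Tuple[str, str]:
    """Return (state, description) for a commit status.

    Entries in `findings` are assumed to carry a "severity" key.
    """
    critical = [f for f in findings if str(f.get("severity", "")).lower() == "critical"]
    if critical:
        description = f"{len(critical)} critical finding(s)"
        # Advisory mode reports the problem without failing the check.
        return ("failure" if enforce else "success", description)
    if recommendation == "REQUEST_CHANGES":
        return ("success", "changes requested (advisory)")
    return ("success", "no blocking findings")
```

Only critical findings fail the check here, so routine review comments never block a merge. Whether `REQUEST_CHANGES` should also block is a policy choice this sketch leaves open.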

#### 3.3.3 Verification

```bash
# Create test PR
git checkout -b test/claude-ci-integration
echo "// test" >> src/lib.rs
git commit -am "test: trigger Claude CI"
git push origin test/claude-ci-integration

# Open PR via Forgejo UI
# Verify workflow triggers and completes
# Check PR comments for Claude review
```

---

### Phase 4: Advanced Features (Session 3)

**Objective:** Implement enhancement features and hardening.

#### 3.4.1 Tasks

```
□ Implement implementation suggestion generator
  □ suggest command for new features
  □ Code skeleton generation
  □ Test strategy recommendations

□ Add caching layer
  □ Cache context prompt (invalidate on context repo changes)
  □ Cache API responses for identical diffs (short TTL)

□ Implement rate limiting
  □ Per-PR limits (avoid runaway costs)
  □ Backoff on API errors

□ Add observability
  □ Token usage logging
  □ Response time metrics
  □ Cost tracking per review type

□ Create manual trigger workflow
  □ workflow_dispatch for on-demand reviews
  □ Review type selection (full/security/arch)

□ Implement merge gate integration
  □ Required status check configuration
  □ Override mechanism for maintainers
```
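For the rate-limiting tasks above (per-PR limits and backoff on API errors), a minimal sketch could look like this. The names, the retry count, the delays, and the limit of five reviews per PR are illustrative assumptions. A production version would likely retry only on rate-limit and server errors rather than on every exception.

```python
import random
import time
from typing import Dict


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call `fn`, retrying on exceptions with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter spreads out retries from jobs that failed at the same time.
            sleep(delay + random.uniform(0, delay / 2))


class PerPRLimiter:
    """Allow at most `limit` reviews per pull request (in-memory only)."""

    def __init__(self, limit: int = 5):
        self.limit = limit
        self.counts: Dict[int, int] = {}

    def allow(self, pr_number: int) -> bool:
        """Record one review for the PR, or return False if the limit is reached."""
        used = self.counts.get(pr_number, 0)
        if used >= self.limit:
            return False
        self.counts[pr_number] = used + 1
        return True
```

Because each CI run is a separate process, the counts would need to be persisted (for example in a small file or database on the runner) for the limit to hold across invocations. The in-memory dictionary here only demonstrates the logic.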

#### 3.4.2 Cost Controls

```python
# Implement in claude_ci.py

class CostTracker:
    """Track and limit API costs."""

    PRICE_PER_1K_INPUT = {
        "claude-sonnet-4-20250514": 0.003,
        "claude-opus-4-20250514": 0.015,
    }

    PRICE_PER_1K_OUTPUT = {
        "claude-sonnet-4-20250514": 0.015,
        "claude-opus-4-20250514": 0.075,
    }

    MAX_COST_PER_PR = 1.00  # $1 max per PR
    MAX_COST_PER_DAY = 20.00  # $20 max per day
```
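How the price tables and limits above might be applied is sketched below as two standalone functions. The function names and signatures are assumptions; the prices and limits are copied from `CostTracker`.

```python
PRICE_PER_1K_INPUT = {"claude-sonnet-4-20250514": 0.003, "claude-opus-4-20250514": 0.015}
PRICE_PER_1K_OUTPUT = {"claude-sonnet-4-20250514": 0.015, "claude-opus-4-20250514": 0.075}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one API call at the per-1K-token prices above."""
    input_cost = (input_tokens / 1000) * PRICE_PER_1K_INPUT[model]
    output_cost = (output_tokens / 1000) * PRICE_PER_1K_OUTPUT[model]
    return input_cost + output_cost


def within_budget(spent_on_pr: float, spent_today: float, next_call: float,
                  max_per_pr: float = 1.00, max_per_day: float = 20.00) -> bool:
    """True if making the next call keeps both limits intact."""
    return spent_on_pr + next_call <= max_per_pr and spent_today + next_call <= max_per_day
```

As a worked example, a call with 45,000 input tokens (the context size shown in Section 3.1.3) and an assumed 2,000 output tokens costs about $0.165 on the Sonnet model and about $0.825 on the Opus model. The Opus figure is worth keeping in mind when sizing the context for architecture reviews against `MAX_COST_PER_PR`.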

#### 3.4.3 Verification

```bash
# Test rate limiting
for i in {1..10}; do
  python3 claude_ci.py review --pr 1
done
# Should see rate limit warnings after threshold

# Test cost tracking
python3 claude_ci.py stats --period day
# Expected: Token usage and cost breakdown
```

---

## 4. File Manifest

### 4.1 Files to Create

| File | Repository | Description |
|------|------------|-------------|
| `tools/claude_ci.py` | alpha-delta-context | Main integration script |
| `tools/requirements.txt` | alpha-delta-context | Python dependencies |
| `tools/config.example.yaml` | alpha-delta-context | Configuration template |
| `.forgejo/workflows/claude-review.yml` | alpha-delta-protocol | CI workflow |
| `docs/ci/CLAUDE_INTEGRATION.md` | alpha-delta-context | Setup documentation |

### 4.2 Files to Modify

| File | Repository | Changes |
|------|------------|---------|
| `README.md` | alpha-delta-context | Add Claude CI section |
| `.forgejo/workflows/ci.yml` | alpha-delta-protocol | Add Claude job dependencies (optional) |

### 4.3 Secrets to Configure

| Secret | Scope | Description |
|--------|-------|-------------|
| `ANTHROPIC_API_KEY` | Repository | Anthropic API key |
| `FORGEJO_TOKEN` | Repository | Forgejo PAT for comments |

---

## 5. Test Plan

### 5.1 Unit Tests

```python
# tests/test_context_loader.py

import pytest

# Assumes tools/ is on the import path so that claude_ci is importable.
from claude_ci import CIConfig, ContextLoader


@pytest.fixture
def config():
    """Default configuration shared by the tests below."""
    return CIConfig()


def test_context_loader_finds_spec_files(config):
    loader = ContextLoader(config)
    files = loader.load_context_files()
    assert "specifications/technical-spec-v3.md" in files


def test_context_prompt_includes_all_sections(config):
    loader = ContextLoader(config)
    prompt = loader.build_context_prompt()
    assert "## Technical Specifications" in prompt
    assert "## Governance Rules" in prompt


def test_context_respects_token_limit():
    config = CIConfig(max_context_tokens=1000)
    loader = ContextLoader(config)
    prompt = loader.build_context_prompt()
    assert len(prompt) // 4 < 1000  # Rough token estimate
```

### 5.2 Integration Tests

```bash
#!/bin/bash
# tests/integration/test_full_review.sh

# Exit on errors, unset variables (e.g. a missing TEST_API_KEY), and pipe failures
set -euo pipefail

# Setup
export ANTHROPIC_API_KEY="${TEST_API_KEY}"
export CONTEXT_REPO_PATH="./test-context"

# Create minimal context
mkdir -p test-context/specifications
echo "# Test Spec" > test-context/specifications/test.md

# Create test diff (two lines added at the top of the file)
cat > /tmp/test.diff << 'EOF'
diff --git a/src/lib.rs b/src/lib.rs
index abc123..def456 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -0,0 +1,2 @@
+// Added new function
+fn new_validator_logic() {}
EOF

# Run review
python3 claude_ci.py security-review --diff /tmp/test.diff

# Verify output
# Should complete without error and return JSON
```

### 5.3 End-to-End Tests

```
□ Create test PR with known security issue
  □ Verify security review catches it
  □ Verify PR comment is posted
  □ Verify CI status is set correctly

□ Create test PR with architecture violation
  □ Verify architecture validation catches it
  □ Verify spec section is referenced

□ Create test PR requiring doc updates
  □ Verify docs-sync identifies needed updates
  □ Verify suggested content is reasonable
```

---

## 6. Rollout Plan

### 6.1 Phase 1: Shadow Mode (Week 1)

- Deploy integration with `post-comment: false`
- Log all reviews to artifacts only
- Monitor token usage and costs
- Tune prompts based on output quality

### 6.2 Phase 2: Advisory Mode (Week 2-3)

- Enable PR comments
- Reviews are informational only (don't block merges)
- Gather team feedback on review quality
- Adjust sensitivity thresholds

### 6.3 Phase 3: Enforcement Mode (Week 4+)

- Enable security gate for critical findings
- Enable architecture gate for spec violations
- Configure as required status check
- Document override procedures

---

## 7. Operational Considerations

### 7.1 Cost Management

| Review Type | Est. Cost per Review | Frequency | Monthly Est. |
|-------------|----------------------|-----------|--------------|
| PR Review | $0.17 | 80/month | $13.60 |
| Security | $0.10 | 80/month | $8.00 |
| Architecture | $0.19 | 80/month | $15.20 |
| Docs Sync | $0.13 | 80/month | $10.40 |
| **Total** | | | **~$47/month** |

### 7.2 Failure Modes

| Failure | Impact | Mitigation |
|---------|--------|------------|
| API unavailable | Reviews are skipped | `continue-on-error: true` in workflow |
| Rate limited | Delayed reviews | Exponential backoff, queue |
| Context repo unavailable | Reviews lack context | Cache last-known-good context |
| Malformed response | Parse error | Fallback to raw output, manual review |

### 7.3 Monitoring

```yaml
# Alerts to configure
- name: Claude CI API Errors
  condition: error_rate > 5% over 1h

- name: Claude CI Cost Spike
  condition: daily_cost > $30

- name: Claude CI Latency
  condition: p95_latency > 120s
```

---

## 8. Security Considerations

### 8.1 API Key Protection

- Store in Forgejo secrets (encrypted at rest)
- Never log or expose in outputs
- Rotate quarterly

### 8.2 Context Sensitivity

- Context repo may contain sensitive architectural details
- Ensure context repo has appropriate access controls
- Consider redacting sensitive sections for lower-privilege reviews

### 8.3 Output Sanitization

- Scan Claude's output before posting it publicly
- Strip any accidentally leaked secrets
- Validate JSON structure before parsing

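The secret-stripping step in 8.3 could start from something like the sketch below. The patterns are illustrative only and will not catch every secret format; the function name, the `[REDACTED]` marker, and the `extra_secrets` parameter (for passing in the actual values of `ANTHROPIC_API_KEY` and `FORGEJO_TOKEN`) are assumptions.

```python
import re
from typing import Tuple

# Illustrative patterns only; extend for the secret formats actually in use.
SECRET_PATTERNS = [
    # API-key-like tokens such as "sk-..." followed by a long identifier
    re.compile(r"\bsk-[A-Za-z0-9_-]{16,}"),
    # PEM-encoded private keys
    re.compile(
        r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?-----END [A-Z ]*PRIVATE KEY-----",
        re.DOTALL,
    ),
]


def redact_secrets(text: str, extra_secrets: Tuple[str, ...] = ()) -> str:
    """Replace known secret values and secret-looking substrings."""
    # Exact values first: these are the secrets the CI job itself holds.
    for secret in extra_secrets:
        if secret:
            text = text.replace(secret, "[REDACTED]")
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text
```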
---

## 9. Documentation Requirements

### 9.1 User Documentation

- [ ] Setup guide for new repositories
- [ ] Configuration options reference
- [ ] Troubleshooting guide
- [ ] FAQ

### 9.2 Developer Documentation

- [ ] Architecture overview
- [ ] API client reference
- [ ] Extending with custom review types
- [ ] Contributing guidelines

### 9.3 Operations Documentation

- [ ] Runbook for common issues
- [ ] Cost monitoring procedures
- [ ] Incident response for false positives

---

## 10. Implementation Commands

When implementing this feature, use the following sequence:

```bash
# Session 1: Core Infrastructure
claude-code "Implement Phase 1 of CSPEC-2026-001: Create ContextLoader class
and CLI structure in alpha-delta-context/tools/claude_ci.py. Follow the
specification in the CSPEC document."

# Session 1-2: API Integration
claude-code "Implement Phase 2 of CSPEC-2026-001: Add ClaudeCIClient class
with review_pull_request, validate_architecture, security_review, and
sync_documentation methods. Include structured JSON output parsing."

# Session 2: Forgejo Integration
claude-code "Implement Phase 3 of CSPEC-2026-001: Create Forgejo workflow
at .forgejo/workflows/claude-review.yml with parallel review jobs and
PR comment posting."

# Session 3: Advanced Features
claude-code "Implement Phase 4 of CSPEC-2026-001: Add caching, rate limiting,
cost tracking, and manual trigger workflow."
```

---

## 11. Reference Implementation

The following files contain a reference implementation that can be used as a starting point:

- `claude_ci.py`: full Python implementation
- `claude-review.yml`: Forgejo workflow
- `CLAUDE_CI_SETUP.md`: setup documentation

These files were generated during planning and should be reviewed/adapted during implementation.

---

## 12. Acceptance Criteria

### 12.1 Functional Requirements

- [ ] PRs automatically receive Claude review comments
- [ ] Security findings are categorized by severity
- [ ] Architecture violations reference specific spec sections
- [ ] Documentation drift is detected and reported
- [ ] Manual trigger workflow is available

### 12.2 Non-Functional Requirements

- [ ] Reviews complete within 5 minutes
- [ ] API errors don't block CI pipeline
- [ ] Costs stay under $100/month at 80 PRs/month
- [ ] False positive rate < 10%

### 12.3 Documentation Requirements

- [ ] Setup guide is complete and tested
- [ ] All configuration options are documented
- [ ] Troubleshooting guide covers common issues

---

**CSPEC Status:** READY FOR IMPLEMENTATION

**Assignee:** Claude-Code (future session)

**Reviewer:** Marco

**Approval:** Pending

---

*This CSPEC follows the Alpha/Delta Protocol documentation standards and is designed for implementation by Claude-Code with minimal human intervention.*