CSPEC-2026-001_Claude_CI_Integration.md
# CSPEC: Claude CI Integration for Alpha/Delta Protocol

## Implementation Plan Document

**CSPEC ID:** CSPEC-2026-001
**Feature:** Claude API Integration for CI Pipeline
**Status:** PLANNED
**Priority:** P2
**Estimated Effort:** 2-3 sessions
**Created:** January 2026
**Target Implementation:** Q1 2026

---

## 1. Executive Summary

Implement an AI-powered code review and validation system that integrates Claude (via Anthropic API) into the Alpha/Delta Protocol CI pipeline. The system will maintain full project context from the `alpha-delta-context` repository to provide informed, protocol-aware analysis of all code changes.

### 1.1 Business Value

- **Automated Architectural Enforcement**: Ensure all changes align with Technical Specification 3.0
- **Security-First Review**: Protocol-aware security analysis for privacy-preserving and cross-chain operations
- **Documentation Consistency**: Automatic detection of spec/code drift
- **Reduced Review Burden**: AI handles routine checks, humans focus on design decisions

### 1.2 Success Criteria

- [ ] All PRs receive automated Claude review within 5 minutes
- [ ] Security findings have <5% false positive rate
- [ ] Architecture violations are caught before human review
- [ ] Integration adds <$100/month to CI costs

---

## 2. Technical Context

### 2.1 Current State

```
┌─────────────────────────────────────────────────────────┐
│                   Current CI Pipeline                   │
├─────────────────────────────────────────────────────────┤
│ PR Opened → Lint → Test → Build → Human Review → Merge  │
│                                        ▲                │
│                                        │                │
│                            (Manual, time-consuming)     │
└─────────────────────────────────────────────────────────┘
```

### 2.2 Target State

```
┌──────────────────────────────────────────────────────────────────────┐
│                         Enhanced CI Pipeline                         │
├──────────────────────────────────────────────────────────────────────┤
│                                                                      │
│  PR Opened ─┬─► Lint ──────────────────────────┬─► Human Review      │
│             ├─► Test ──────────────────────────┤   (informed by      │
│             ├─► Build ─────────────────────────┤   Claude analysis)  │
│             │                                  │         │           │
│             └─► Claude Review ─┬─► PR Comment ─┘         ▼           │
│                                ├─► Security Gate       Merge         │
│                                └─► Arch Validation                   │
│                                          ▲                           │
│                                          │                           │
│                     ┌────────────────────┴───────────────────┐       │
│                     │     alpha-delta-context Repository     │       │
│                     │  (Full project context: specs, ADRs,   │       │
│                     │   governance, security requirements)   │       │
│                     └────────────────────────────────────────┘       │
└──────────────────────────────────────────────────────────────────────┘
```

### 2.3 Dependencies

| Dependency | Version | Purpose |
|------------|---------|---------|
| Anthropic API | 2024-10+ | Claude access |
| Python | 3.10+ | Integration script runtime |
| Forgejo Actions | Latest | CI orchestration |
| alpha-delta-context | main | Project context source |

### 2.4 Related Systems

- **Forgejo Instance**: `ci.yourdomain.com`
- **Runner Server**: 32-core build server (native execution)
- **DO Spaces**: S3-compatible storage for artifacts
- **Radicle**: P2P backup (sync workflow exists)

---

## 3. Implementation Phases

### Phase 1: Core Infrastructure (Session 1)

**Objective:** Establish the foundational integration components.
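
As a rough illustration of the Phase 1 components, the sketch below shows one way `CIConfig` and `ContextLoader` could fit together. The class names, the `max_context_tokens` field, the `CONTEXT_REPO_PATH` variable, the `load_context_files` and `build_context_prompt` methods, and the four-characters-per-token estimate all appear in this CSPEC. Everything else is an assumption made for illustration: the `MAX_CONTEXT_TOKENS` variable, the `api` and `decisions` directory names, the `*.md` glob, and the simple tail truncation.

```python
import os
from dataclasses import dataclass
from pathlib import Path

# "api" and "decisions" are assumed directory names; the other four appear
# in paths quoted elsewhere in this CSPEC.
CATEGORY_DIRS = ("specifications", "architecture", "governance",
                 "security", "api", "decisions")


def estimate_tokens(text: str) -> int:
    """Rough estimate: about four characters per token."""
    return len(text) // 4


@dataclass
class CIConfig:
    context_repo_path: str = "/opt/ci/workspaces/alpha-delta-context"
    max_context_tokens: int = 150_000

    @classmethod
    def from_env(cls, env=None) -> "CIConfig":
        """Build a config from environment variables, falling back to defaults."""
        env = os.environ if env is None else env
        config = cls(
            context_repo_path=env.get("CONTEXT_REPO_PATH", cls.context_repo_path),
            # MAX_CONTEXT_TOKENS is a hypothetical variable name.
            max_context_tokens=int(env.get("MAX_CONTEXT_TOKENS", cls.max_context_tokens)),
        )
        if config.max_context_tokens <= 0:
            raise ValueError("max_context_tokens must be positive")
        return config


class ContextLoader:
    def __init__(self, config: CIConfig):
        self.config = config
        self._files = None  # cache so repeated calls do not re-read the disk

    def load_context_files(self) -> dict:
        """Return {relative path: content} for Markdown files in known categories."""
        if self._files is None:
            root = Path(self.config.context_repo_path)
            files = {}
            for category in CATEGORY_DIRS:
                directory = root / category
                if not directory.is_dir():
                    continue
                for path in sorted(directory.rglob("*.md")):
                    files[path.relative_to(root).as_posix()] = path.read_text(encoding="utf-8")
            self._files = files
        return self._files

    def build_context_prompt(self) -> str:
        """Concatenate files, truncating once the estimated token budget is spent."""
        budget = self.config.max_context_tokens * 4  # budget in characters
        parts, used = [], 0
        for name, text in self.load_context_files().items():
            section = f"### {name}\n{text}\n"[: budget - used]
            parts.append(section)
            used += len(section)
            if used >= budget:
                break
        return "".join(parts)
```

Truncating the tail is the simplest policy; a real implementation would probably prioritise the critical context files before cutting anything.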

#### 3.1.1 Tasks

```
□ Create directory structure
    □ /opt/ci/tools/claude-ci/
    □ alpha-delta-context/tools/

□ Implement ContextLoader class
    □ Pattern-based file discovery
    □ Content categorization (specs, arch, governance, security, api, decisions)
    □ Token estimation and truncation handling
    □ Caching mechanism for repeated calls

□ Implement CIConfig dataclass
    □ Environment variable loading
    □ Sensible defaults
    □ Validation of required fields

□ Create basic CLI structure
    □ argparse setup
    □ Subcommand routing
    □ Error handling framework
```

#### 3.1.2 Files to Create

| File | Location | Purpose |
|------|----------|---------|
| `claude_ci.py` | `alpha-delta-context/tools/` | Main integration script |
| `requirements.txt` | `alpha-delta-context/tools/` | Python dependencies |
| `config.example.yaml` | `alpha-delta-context/tools/` | Example configuration |

#### 3.1.3 Verification

```bash
# Test context loading
python3 claude_ci.py context-info

# Expected output:
# 📋 Context Repository Information
#    Path: /opt/ci/workspaces/alpha-delta-context
#    Loaded 24 context files (~45,000 tokens)
#    Files loaded:
#      - specifications/technical-spec-v3.md (52,340 chars)
#      - architecture/overview.md (8,230 chars)
#      ...
```

---

### Phase 2: Claude API Integration (Session 1-2)

**Objective:** Implement the core API client with review capabilities.
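
Whatever the client sends, it has to turn a free-text model reply into a structured result. The sketch below shows one possible approach to JSON parsing with a fallback and to extracting the `APPROVE` / `REQUEST_CHANGES` / `COMMENT` recommendation. The function names, the `raw_output` and `parsed` keys, and the choice of `COMMENT` as the fallback value are assumptions, not part of the reference implementation.

```python
import json
import re

VALID_RECOMMENDATIONS = {"APPROVE", "REQUEST_CHANGES", "COMMENT"}
FENCE = "`" * 3  # Markdown code fence, built this way to keep the example readable
FENCED_JSON = re.compile(FENCE + r"(?:json)?\s*(\{.*?\})\s*" + FENCE, re.DOTALL)


def extract_json_object(text: str):
    """Return the first JSON object found in the reply, or None."""
    # Try the whole reply, then a fenced block, then the outermost braces.
    candidates = [text]
    fenced = FENCED_JSON.search(text)
    if fenced:
        candidates.append(fenced.group(1))
    start, end = text.find("{"), text.rfind("}")
    if 0 <= start < end:
        candidates.append(text[start:end + 1])
    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


def parse_review_response(text: str) -> dict:
    """Normalise a model reply into a dict with a valid recommendation."""
    data = extract_json_object(text)
    if data is None:
        # Fallback: keep the raw reply so a human can still read it.
        return {"recommendation": "COMMENT", "raw_output": text, "parsed": False}
    recommendation = str(data.get("recommendation", "")).upper()
    if recommendation not in VALID_RECOMMENDATIONS:
        recommendation = "COMMENT"
    return {**data, "recommendation": recommendation, "parsed": True}
```

Defaulting to `COMMENT` keeps a malformed reply from blocking or approving a merge by accident; the `parsed` flag lets the workflow route such cases to manual review.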

#### 3.2.1 Tasks

```
□ Implement ClaudeCIClient class
    □ Anthropic client initialization
    □ Context prompt builder (lazy-loaded)
    □ System prompt templates per task type

□ Implement review_pull_request() method
    □ Structured prompt for PR review
    □ JSON response parsing with fallback
    □ Recommendation extraction (APPROVE/REQUEST_CHANGES/COMMENT)

□ Implement validate_architecture() method
    □ Git commit introspection
    □ Spec reference extraction
    □ Violation severity classification

□ Implement security_review() method
    □ Focus area configuration
    □ CWE ID mapping where applicable
    □ Merge-blocking logic for critical findings

□ Implement sync_documentation() method
    □ Code-to-docs drift detection
    □ Auto-generated update suggestions
    □ Changelog entry generation
```

#### 3.2.2 Prompt Engineering Notes

**System Prompt Structure:**
```
1. Role definition (CI assistant with protocol expertise)
2. Capability enumeration (Rust, blockchain, ZK, security)
3. Task-specific instructions
4. Full context injection (specs, arch, governance)
5. Output format specification (JSON schema)
```

**Critical Context Files to Always Include:**
- `specifications/technical-spec-v3.md`: Primary source of truth
- `governance/rules.md`: Central Bank authority, validator requirements
- `security/requirements.md`: Privacy and cryptographic requirements
- `architecture/cross-chain.md`: ALPHA/DELTA bridge specifications

#### 3.2.3 API Configuration

```python
# Recommended settings
MODEL_PR_REVIEW = "claude-sonnet-4-20250514"     # Fast, cost-effective
MODEL_SECURITY = "claude-sonnet-4-20250514"      # Good security analysis
MODEL_ARCHITECTURE = "claude-opus-4-20250514"    # Deep reasoning for arch
MAX_TOKENS = 8192                                # Sufficient for detailed reviews
MAX_CONTEXT_TOKENS = 150000                      # Leave room for response
```

#### 3.2.4 Verification

```bash
# Test PR review (requires API key)
export ANTHROPIC_API_KEY="sk-..."
python3 claude_ci.py review --pr 1

# Test security review
git diff HEAD~1 > /tmp/test.diff
python3 claude_ci.py security-review --diff /tmp/test.diff

# Test architecture validation
python3 claude_ci.py validate-arch --commit "$(git rev-parse HEAD)"
```

---

### Phase 3: Forgejo Integration (Session 2)

**Objective:** Create CI workflows and Forgejo API integration.
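
To make the Forgejo side concrete, here is a minimal sketch of how review outcomes might be mapped to a commit status and how the comment and status requests could be built. It assumes the Forgejo instance exposes the Gitea-compatible REST API under `/api/v1` with `Authorization: token ...` authentication; the endpoint paths should be checked against the instance's own API documentation. The function names, the `claude-ci/security` context string, and the rule that `REQUEST_CHANGES` or any critical finding maps to `failure` are illustrative assumptions. The requests are only constructed here, not sent.

```python
import json
import urllib.request


def _json_post(base_url: str, token: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/v1{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def review_to_status(recommendation: str, critical_findings: int) -> str:
    """Map a review outcome to a commit status state (assumed gating rule)."""
    if critical_findings > 0 or recommendation == "REQUEST_CHANGES":
        return "failure"
    return "success"


def build_comment_request(base_url, token, owner, repo, pr_number, body):
    """Request that would post the review as a comment on the PR."""
    path = f"/repos/{owner}/{repo}/issues/{pr_number}/comments"
    return _json_post(base_url, token, path, {"body": body})


def build_status_request(base_url, token, owner, repo, sha, state, context, description=""):
    """Request that would set a commit status for the given SHA."""
    path = f"/repos/{owner}/{repo}/statuses/{sha}"
    payload = {"state": state, "context": context, "description": description}
    return _json_post(base_url, token, path, payload)
```

Passing a built request to `urllib.request.urlopen` would send it. Keeping construction separate from sending makes the helpers easy to unit-test without a live Forgejo instance.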

#### 3.3.1 Tasks

```
□ Implement Forgejo API helpers
    □ get_pr_diff(): Fetch PR diff via API
    □ get_pr_files(): List changed files
    □ post_review_comment(): Post review as PR comment
    □ set_commit_status(): Update CI status checks

□ Create primary workflow (claude-review.yml)
    □ Trigger configuration (PR events)
    □ Context repository setup job
    □ Parallel review jobs (PR, security, arch, docs)
    □ Summary job with GitHub Actions step summary

□ Create workflow secrets documentation
    □ ANTHROPIC_API_KEY
    □ FORGEJO_TOKEN

□ Implement CI status integration
    □ Map review outcomes to CI pass/fail
    □ Security gate for critical findings
    □ Architecture gate for spec violations
```

#### 3.3.2 Workflow Structure

```yaml
# .forgejo/workflows/claude-review.yml

jobs:
  setup-context:              # Clone/update context repo
    runs-on: native

  claude-review:              # Full PR review
    needs: setup-context
    runs-on: native

  security-review:            # Security-focused analysis
    needs: setup-context
    runs-on: native

  architecture-validation:    # Spec compliance
    needs: setup-context
    runs-on: native

  docs-sync:                  # Documentation drift check
    needs: setup-context
    runs-on: native

  review-summary:             # Aggregate results
    needs: [claude-review, security-review, architecture-validation, docs-sync]
    runs-on: native
```

#### 3.3.3 Verification

```bash
# Create test PR
git checkout -b test/claude-ci-integration
echo "// test" >> src/lib.rs
git commit -am "test: trigger Claude CI"
git push origin test/claude-ci-integration

# Open PR via Forgejo UI
# Verify workflow triggers and completes
# Check PR comments for Claude review
```

---

### Phase 4: Advanced Features (Session 3)

**Objective:** Implement enhancement features and hardening.
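
Two of the hardening concerns in this phase, backoff on API errors and per-review cost accounting, can be sketched in a few lines. The function names, default delays, and the injectable `sleep` parameter are assumptions for illustration; a production version would likely retry only on specific API error types and add jitter.

```python
import time


def call_with_backoff(operation, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call operation(), retrying with exponentially growing delays on failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            sleep(min(max_delay, base_delay * 2 ** attempt))


def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimated cost in dollars for one API call, given per-1K-token prices."""
    return (input_tokens / 1000 * price_in_per_1k
            + output_tokens / 1000 * price_out_per_1k)
```

For example, `estimate_cost(45_000, 2_000, 0.003, 0.015)` prices a call with 45,000 input tokens and 2,000 output tokens at per-1K rates of $0.003 and $0.015. Injecting `sleep` lets tests record the delays instead of actually waiting.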

#### 3.4.1 Tasks

```
□ Implement implementation suggestion generator
    □ suggest command for new features
    □ Code skeleton generation
    □ Test strategy recommendations

□ Add caching layer
    □ Cache context prompt (invalidate on context repo changes)
    □ Cache API responses for identical diffs (short TTL)

□ Implement rate limiting
    □ Per-PR limits (avoid runaway costs)
    □ Backoff on API errors

□ Add observability
    □ Token usage logging
    □ Response time metrics
    □ Cost tracking per review type

□ Create manual trigger workflow
    □ workflow_dispatch for on-demand reviews
    □ Review type selection (full/security/arch)

□ Implement merge gate integration
    □ Required status check configuration
    □ Override mechanism for maintainers
```

#### 3.4.2 Cost Controls

```python
# Implement in claude_ci.py

class CostTracker:
    """Track and limit API costs."""

    # Prices in USD per 1,000 tokens
    PRICE_PER_1K_INPUT = {
        "claude-sonnet-4-20250514": 0.003,
        "claude-opus-4-20250514": 0.015,
    }

    PRICE_PER_1K_OUTPUT = {
        "claude-sonnet-4-20250514": 0.015,
        "claude-opus-4-20250514": 0.075,
    }

    MAX_COST_PER_PR = 1.00    # $1 max per PR
    MAX_COST_PER_DAY = 20.00  # $20 max per day
```

#### 3.4.3 Verification

```bash
# Test rate limiting
for i in {1..10}; do
    python3 claude_ci.py review --pr 1
done
# Should see rate limit warnings after threshold

# Test cost tracking
python3 claude_ci.py stats --period day
# Expected: Token usage and cost breakdown
```

---

## 4. File Manifest

### 4.1 Files to Create

| File | Repository | Description |
|------|------------|-------------|
| `tools/claude_ci.py` | alpha-delta-context | Main integration script |
| `tools/requirements.txt` | alpha-delta-context | Python dependencies |
| `tools/config.example.yaml` | alpha-delta-context | Configuration template |
| `.forgejo/workflows/claude-review.yml` | alpha-delta-protocol | CI workflow |
| `docs/ci/CLAUDE_INTEGRATION.md` | alpha-delta-context | Setup documentation |

### 4.2 Files to Modify

| File | Repository | Changes |
|------|------------|---------|
| `README.md` | alpha-delta-context | Add Claude CI section |
| `.forgejo/workflows/ci.yml` | alpha-delta-protocol | Add Claude job dependencies (optional) |

### 4.3 Secrets to Configure

| Secret | Scope | Description |
|--------|-------|-------------|
| `ANTHROPIC_API_KEY` | Repository | Anthropic API key |
| `FORGEJO_TOKEN` | Repository | Forgejo PAT for comments |

---

## 5. Test Plan

### 5.1 Unit Tests

```python
# tests/test_context_loader.py
import pytest

from claude_ci import CIConfig, ContextLoader


@pytest.fixture
def config():
    # Defaults are expected to point at a checked-out context repository.
    return CIConfig()


def test_context_loader_finds_spec_files(config):
    loader = ContextLoader(config)
    files = loader.load_context_files()
    assert "specifications/technical-spec-v3.md" in files


def test_context_prompt_includes_all_sections(config):
    loader = ContextLoader(config)
    prompt = loader.build_context_prompt()
    assert "## Technical Specifications" in prompt
    assert "## Governance Rules" in prompt


def test_context_respects_token_limit():
    config = CIConfig(max_context_tokens=1000)
    loader = ContextLoader(config)
    prompt = loader.build_context_prompt()
    assert len(prompt) // 4 < 1000  # Rough token estimate
```

### 5.2 Integration Tests

```bash
#!/bin/bash
# tests/integration/test_full_review.sh
set -e

# Setup
export ANTHROPIC_API_KEY="${TEST_API_KEY}"
export CONTEXT_REPO_PATH="./test-context"

# Create minimal context
mkdir -p test-context/specifications
echo "# Test Spec" > test-context/specifications/test.md

# Create test diff
cat > /tmp/test.diff << 'EOF'
diff --git a/src/lib.rs b/src/lib.rs
index abc123..def456 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -0,0 +1,2 @@
+// Added new function
+fn new_validator_logic() {}
EOF

# Run review
python3 claude_ci.py security-review --diff /tmp/test.diff

# Verify output
# Should complete without error and return JSON
```

### 5.3 End-to-End Tests

```
□ Create test PR with known security issue
    □ Verify security review catches it
    □ Verify PR comment is posted
    □ Verify CI status is set correctly

□ Create test PR with architecture violation
    □ Verify architecture validation catches it
    □ Verify spec section is referenced

□ Create test PR requiring doc updates
    □ Verify docs-sync identifies needed updates
    □ Verify suggested content is reasonable
```

---

## 6. Rollout Plan

### 6.1 Phase 1: Shadow Mode (Week 1)

- Deploy integration with `post-comment: false`
- Log all reviews to artifacts only
- Monitor token usage and costs
- Tune prompts based on output quality

### 6.2 Phase 2: Advisory Mode (Week 2-3)

- Enable PR comments
- Reviews are informational only (don't block merges)
- Gather team feedback on review quality
- Adjust sensitivity thresholds

### 6.3 Phase 3: Enforcement Mode (Week 4+)

- Enable security gate for critical findings
- Enable architecture gate for spec violations
- Configure as required status check
- Document override procedures

---

## 7. Operational Considerations

### 7.1 Cost Management

| Review Type | Est. Cost | Frequency | Monthly Est. |
|-------------|-----------|-----------|--------------|
| PR Review | $0.17 | 80/month | $13.60 |
| Security | $0.10 | 80/month | $8.00 |
| Architecture | $0.19 | 80/month | $15.20 |
| Docs Sync | $0.13 | 80/month | $10.40 |
| **Total** | | | **~$47/month** |

### 7.2 Failure Modes

| Failure | Impact | Mitigation |
|---------|--------|------------|
| API unavailable | Reviews skip | `continue-on-error: true` in workflow |
| Rate limited | Delayed reviews | Exponential backoff, queue |
| Context repo unavailable | Reviews lack context | Cache last-known-good context |
| Malformed response | Parse error | Fallback to raw output, manual review |

### 7.3 Monitoring

```yaml
# Alerts to configure
- name: Claude CI API Errors
  condition: error_rate > 5% over 1h

- name: Claude CI Cost Spike
  condition: daily_cost > $30

- name: Claude CI Latency
  condition: p95_latency > 120s
```

---

## 8. Security Considerations

### 8.1 API Key Protection

- Store in Forgejo secrets (encrypted at rest)
- Never log or expose in outputs
- Rotate quarterly

### 8.2 Context Sensitivity

- Context repo may contain sensitive architectural details
- Ensure context repo has appropriate access controls
- Consider redacting sensitive sections for lower-privilege reviews

### 8.3 Output Sanitization

- Review Claude's output before posting publicly
- Strip any accidentally leaked secrets
- Validate JSON structure before parsing

---

## 9. Documentation Requirements

### 9.1 User Documentation

- [ ] Setup guide for new repositories
- [ ] Configuration options reference
- [ ] Troubleshooting guide
- [ ] FAQ

### 9.2 Developer Documentation

- [ ] Architecture overview
- [ ] API client reference
- [ ] Extending with custom review types
- [ ] Contributing guidelines

### 9.3 Operations Documentation

- [ ] Runbook for common issues
- [ ] Cost monitoring procedures
- [ ] Incident response for false positives

---

## 10. Implementation Commands

When implementing this feature, use the following sequence:

```bash
# Session 1: Core Infrastructure
claude-code "Implement Phase 1 of CSPEC-2026-001: Create ContextLoader class
and CLI structure in alpha-delta-context/tools/claude_ci.py. Follow the
specification in the CSPEC document."

# Session 1-2: API Integration
claude-code "Implement Phase 2 of CSPEC-2026-001: Add ClaudeCIClient class
with review_pull_request, validate_architecture, security_review, and
sync_documentation methods. Include structured JSON output parsing."

# Session 2: Forgejo Integration
claude-code "Implement Phase 3 of CSPEC-2026-001: Create Forgejo workflow
at .forgejo/workflows/claude-review.yml with parallel review jobs and
PR comment posting."

# Session 3: Advanced Features
claude-code "Implement Phase 4 of CSPEC-2026-001: Add caching, rate limiting,
cost tracking, and manual trigger workflow."
```

---

## 11. Reference Implementation

The following files contain a reference implementation that can be used as a starting point:

- `claude_ci.py`: Full Python implementation
- `claude-review.yml`: Forgejo workflow
- `CLAUDE_CI_SETUP.md`: Setup documentation

These files were generated during planning and should be reviewed and adapted during implementation.

---

## 12. Acceptance Criteria

### 12.1 Functional Requirements

- [ ] PRs automatically receive Claude review comments
- [ ] Security findings are categorized by severity
- [ ] Architecture violations reference specific spec sections
- [ ] Documentation drift is detected and reported
- [ ] Manual trigger workflow is available

### 12.2 Non-Functional Requirements

- [ ] Reviews complete within 5 minutes
- [ ] API errors don't block CI pipeline
- [ ] Costs stay under $100/month at 80 PRs/month
- [ ] False positive rate < 10%

### 12.3 Documentation Requirements

- [ ] Setup guide is complete and tested
- [ ] All configuration options are documented
- [ ] Troubleshooting guide covers common issues

---

**CSPEC Status:** READY FOR IMPLEMENTATION

**Assignee:** Claude-Code (future session)

**Reviewer:** Marco

**Approval:** Pending

---

*This CSPEC follows the Alpha/Delta Protocol documentation standards and is designed for implementation by Claude-Code with minimal human intervention.*