AGENTS.md
# Multi-Agent System Guide

## Table of Contents

- [Overview](#overview)
- [Architecture](#architecture)
- [Getting Started](#getting-started)
- [Agents](#agents)
- [Workflows](#workflows)
- [CLI Commands](#cli-commands)
- [Configuration](#configuration)
- [Safety Features](#safety-features)
- [Cost Management](#cost-management)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)

---

## Overview

The 333 Method uses a database-driven multi-agent system where specialized AI agents collaborate autonomously to handle development, testing, security, and architecture tasks.

### Benefits

- **Token efficiency**: 75-85% reduction vs monolithic approach (20-25KB per invocation vs 100-150KB)
- **Specialization**: Each agent has focused responsibilities and optimized context
- **Peer review**: Built-in workflows ensure quality through agent collaboration
- **Autonomy**: Agents work continuously via cron scheduling
- **Audit trail**: Complete tracking of all agent actions and decisions

### How It Works

1. **Monitor Agent** scans logs every 5 minutes and creates tasks for detected issues
2. **Triage Agent** classifies errors and routes tasks to appropriate agents
3. **Developer Agent** fixes bugs and implements features
4. **QA Agent** verifies fixes and enforces test coverage gates
5. **Security Agent** performs security reviews and compliance checks
6. **Architect Agent** reviews designs and maintains documentation freshness

Agents communicate through a database-driven message queue, creating a collaborative workflow where each agent builds on others' work.

---

## Architecture

### Core Components

#### 1. Database Tables (Migration 041, 051)

**agent_tasks** - Task queue with priority and status tracking

```sql
CREATE TABLE agent_tasks (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_type TEXT NOT NULL,
  assigned_to TEXT NOT NULL,
  status TEXT NOT NULL,
  priority INTEGER DEFAULT 5,
  parent_task_id INTEGER,
  context_json TEXT,
  result_json TEXT,
  retry_count INTEGER DEFAULT 0,
  reviewed_by TEXT,
  approval_json TEXT,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
```

**agent_messages** - Inter-agent communication

```sql
CREATE TABLE agent_messages (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_id INTEGER NOT NULL,
  from_agent TEXT NOT NULL,
  to_agent TEXT NOT NULL,
  message_type TEXT NOT NULL,
  message_text TEXT,
  metadata_json TEXT,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
```

**agent_logs** - Execution audit trail

```sql
CREATE TABLE agent_logs (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_id INTEGER,
  agent_name TEXT NOT NULL,
  level TEXT NOT NULL,
  message TEXT NOT NULL,
  metadata_json TEXT,
  created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
```

**agent_state** - Agent status and metrics

```sql
CREATE TABLE agent_state (
  agent_name TEXT PRIMARY KEY,
  status TEXT NOT NULL,
  current_task_id INTEGER,
  last_run_at TEXT,
  metrics_json TEXT
);
```

**agent_outcomes** - Task outcomes for learning (Migration 052)

```sql
CREATE TABLE agent_outcomes (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  task_id INTEGER NOT NULL REFERENCES agent_tasks(id) ON DELETE CASCADE,
  agent_name TEXT NOT NULL,
  task_type TEXT NOT NULL,
  outcome TEXT NOT NULL CHECK(outcome IN ('success', 'failure')),
  context_json TEXT, -- Task-specific context (error_type, file_path, etc.)
  result_json TEXT, -- Task result details (what worked, what didn't)
  duration_ms INTEGER,
  created_at DATETIME DEFAULT CURRENT_TIMESTAMP
);
```

This table enables **task history and learning** - agents learn from past successes and failures to improve future performance. See [docs/agents/task-history.md](./agents/task-history.md) for details.

#### 2. Context Files

Each agent loads a base context (~15KB) plus role-specific context:

| Agent     | Context Files          | Total Size |
| --------- | ---------------------- | ---------- |
| Monitor   | base.md + monitor.md   | 20KB       |
| Triage    | base.md + triage.md    | 23.5KB     |
| Developer | base.md + developer.md | 21.3KB     |
| QA        | base.md + qa.md        | 23KB       |
| Security  | base.md + security.md  | 21KB       |
| Architect | base.md + architect.md | 25KB       |

**Location:** `/home/jason/code/333Method/src/agents/contexts/`

#### 3. Agent Framework

**BaseAgent class** (`src/agents/base-agent.js`)

- Task polling and execution
- Message sending/receiving
- Logging and error handling
- Circuit breaker integration

**Utility modules:**

- `context-loader.js` - Merges context files
- `context-builder.js` - Enriches context with task history for learning
- `task-manager.js` - CRUD operations for tasks
- `message-manager.js` - Inter-agent messaging

### Workflow States

Tasks progress through these states:

```
pending → running → completed
             ↓
   awaiting_po_approval → approved → pending
             ↓
   awaiting_architect_approval → approved → pending
             ↓
           failed
             ↓
          blocked
```

**State Descriptions:**

- `pending` - Ready to work on
- `running` - Currently being processed by an agent
- `awaiting_po_approval` - Design proposal waiting for Product Owner sign-off
- `awaiting_architect_approval` - Implementation plan waiting for technical review
- `completed` - Successfully finished
- `failed` - Failed after 3 retry attempts
- `blocked` - Blocked on external dependency or human action

---

## Getting Started

### Prerequisites

1. Database initialized with agent tables (migration 041, 051)
2. Environment variable set: `AGENT_SYSTEM_ENABLED=true`
3. Cron system enabled to run agents every 5 minutes

### Quick Start

#### 1. Enable the Agent System

```bash
# Add to .env
echo "AGENT_SYSTEM_ENABLED=true" >> .env
```

#### 2. Bootstrap the Monitor Agent

The Monitor agent needs an initial task to start its self-scheduling loop:

```bash
npm run agent:create -- --agent monitor --task scan_logs --context '{"incremental":true}' --priority 5
```

#### 3. Verify Agents Are Running

```bash
# Check agent status
npm run agent:list

# View pending tasks
npm run agent:tasks

# View recent logs
npm run agent:logs -- --level info
```

#### 4. Trigger a Test Workflow

```bash
# Test bug fix workflow
npm run agent:workflow -- --workflow bug-fix --error "Test error for verification" --stage scoring

# Check workflow status
npm run agent:tasks
```

---

## Agents

### 1. Monitor Agent

**Role:** System immune system - proactive detection of issues

**Responsibilities:**

- Scan log files for ERROR/FATAL patterns every 5 minutes
- Detect looping errors (same error >3x in 1 hour)
- Monitor stale tasks (pending >1 hour)
- Verify process compliance (expected stage transitions)
- Track agent health (success/failure ratios)
- Check documentation drift daily

**Task Types:**

- `scan_logs` - Incremental log scanning (self-scheduling)
- `check_agent_health` - Monitor agent success rates
- `check_process_compliance` - Verify workflow adherence
- `check_doc_freshness` - Detect stale documentation

**Context Size:** 20KB (base.md + monitor.md)

**Self-Scheduling:** Creates new `scan_logs` task after each completion

**Example:**

```bash
# View Monitor status
npm run agent:list | grep monitor

# View Monitor logs
npm run agent:logs -- --agent-name monitor
```

### 2. Triage Agent

**Role:** Error classifier and task router

**Responsibilities:**

- Classify errors by type (null_pointer, network, database_constraint, api_error, security, configuration)
- Determine severity (critical, high, medium, low)
- Calculate priority (1-10 scale based on severity + impact)
- Route tasks to appropriate agents
- Suggest initial fix approaches

**Task Types:**

- `classify_error` - Analyze error and create appropriate task

**Context Size:** 23.5KB (base.md + triage.md)

**Routing Logic:**

- Security errors → Security Agent (priority 10)
- Database/network/API errors → Developer Agent
- Complex architectural issues → Architect Agent
- Configuration errors → Developer Agent

**Example:**

```bash
# Manually trigger triage
npm run agent:create -- --agent triage --task classify_error --context '{"error":"TypeError: Cannot read property score of null","file":"src/scoring.js"}' --priority 7
```

### 3. Developer Agent

**Role:** Bug fixes and feature implementation

**Responsibilities:**

- Analyze error messages and stack traces
- Extract affected file paths
- Generate bug fixes
- Implement new features
- **CRITICAL:** Enforce 85%+ code coverage before commits
- Create git commits (only if coverage gate passes)
- Hand off to QA for verification

**Task Types:**

- `fix_bug` - Analyze and fix bugs
- `implement_feature` - Build new features
- `implementation_plan` - Create detailed implementation plan

**Context Size:** 21.3KB (base.md + developer.md)

**Coverage Gate:**
Developer enforces 85%+ coverage BEFORE creating commits:

1. Make code changes
2. Run `checkCoverageBeforeCommit(files, taskId)`
3. If coverage <85%: Attempt automatic test generation
4. If auto-fix fails: Escalate to Architect for guidance
5. Only commit if coverage ≥85%

**Workflow Example:**

```
1. Receive fix_bug task from Triage
2. Analyze error and identify affected files
3. Generate fix
4. Run coverage check
5. If coverage passes: Create commit
6. Create verify_fix task for QA
7. Send handoff message to QA
```

**Example:**

```bash
# View Developer tasks
npm run agent:tasks -- --assigned-to developer

# Trigger bug fix
npm run agent:workflow -- --workflow bug-fix --error "..." --file src/scoring.js
```

### 4. QA Agent

**Role:** Test generation, verification, coverage enforcement

**Responsibilities:**

- Generate unit tests for new features
- Verify bug fixes work correctly
- Enforce 80%+ coverage gate (HARD BLOCK on task completion)
- Run test suite and parse coverage reports
- Create feedback for developers on failures
- Tag regression tests

**Task Types:**

- `write_test` - Generate unit test
- `verify_fix` - Verify bug fix works
- `check_coverage` - Ensure 80%+ coverage
- `write_missing_tests` - Fill coverage gaps

**Context Size:** 23KB (base.md + qa.md)

**Coverage Gate:**
QA enforces 80%+ coverage AFTER commits as a second safety layer:

1. Receive verify_fix task
2. Run tests for changed files
3. Check coverage with c8
4. If <80%: Create write_missing_tests task, block parent task
5. If ≥80%: Mark task complete

**Example:**

```bash
# View QA tasks
npm run agent:tasks -- --assigned-to qa

# Check recent verifications
npm run agent:logs -- --agent-name qa --level info
```

### 5. Security Agent

**Role:** Security audits, compliance, vulnerability scanning

**Responsibilities:**

- Code security reviews (SQL injection, XSS, command injection)
- Dependency vulnerability scanning (`npm audit`)
- Secrets detection (hardcoded keys, credentials)
- TCPA/CAN-SPAM/GDPR compliance validation
- Track vulnerability remediation time

**Task Types:**

- `audit_code` - Security code review
- `scan_dependencies` - Check for vulnerable dependencies
- `compliance_check` - Validate TCPA/CAN-SPAM adherence
- `scan_secrets` - Detect exposed credentials

**Context Size:** 21KB (base.md + security.md)

**Example:**

```bash
# Trigger security audit
npm run agent:create -- --agent security --task audit_code --context '{"files":["src/outreach/sms.js"]}' --priority 8

# View security findings
npm run agent:logs -- --agent-name security --level error
```

### 6. Architect Agent

**Role:** Design review, refactoring, documentation freshness

**Responsibilities:**

- Design reviews for new features
- Refactoring suggestions based on complexity analysis
- Code complexity monitoring (max 150 lines, complexity 15)
- Documentation freshness checks
- Schema change validation
- Create Architecture Decision Records (ADRs)

**Task Types:**

- `design_proposal` - Create design document for significant changes
- `technical_review` - Review implementation plans
- `suggest_refactor` - Recommend refactoring
- `update_documentation` - Fix stale docs
- `review_design` - Evaluate feature designs

**Context Size:** 25KB (base.md + architect.md)

**Documentation Freshness Checks:**
On every commit, Architect verifies:

- New env vars → `.env.example` updated?
- New npm scripts → `README.md` updated?
- New modules → `CLAUDE.md` updated?
- Schema changes → `db/schema.sql` + migration?
- Features done → `docs/TODO.md` updated?

**Example:**

```bash
# Request design review
npm run agent:create -- --agent architect --task design_proposal --context '{"feature":"Dark mode toggle","requirements":["Settings UI","Persistence","Global theme"]}' --priority 6

# View pending reviews
npm run agent:tasks -- --assigned-to architect --status awaiting_po_approval
```

---

## Task Routing

The agent system uses a centralized task routing configuration to ensure tasks are always assigned to the correct agent.

### Routing Configuration

**Location:** `src/agents/utils/task-routing.js`

This module provides:

- `TASK_ROUTING` - Complete mapping of task types to agents
- `getAgentForTaskType(taskType)` - Get correct agent for a task type
- `validateTaskAssignment(taskType, assignedTo)` - Validate task is correctly routed
- `getTaskTypesForAgent(agentName)` - Get all task types an agent handles

### Complete Task Type Reference

| Task Type                       | Agent     | Description                                    |
| ------------------------------- | --------- | ---------------------------------------------- |
| **Developer Tasks**             |           |                                                |
| `fix_bug`                       | developer | Fix bugs identified by Triage                  |
| `implement_feature`             | developer | Implement new features after design approval   |
| `refactor_code`                 | developer | Refactor complex or problematic code           |
| `apply_feedback`                | developer | Address feedback from other agents             |
| `implementation_plan`           | developer | Create detailed implementation plan            |
| **QA Tasks**                    |           |                                                |
| `write_test`                    | qa        | Generate unit tests for code                   |
| `verify_fix`                    | qa        | Verify bug fix works correctly                 |
| `check_coverage`                | qa        | Check test coverage meets 80%+ requirement     |
| `run_tests`                     | qa        | Run test suite for files                       |
| **Security Tasks**              |           |                                                |
| `audit_code`                    | security  | Security code review (SQL injection, XSS, etc) |
| `scan_dependencies`             | security  | Check for vulnerable dependencies              |
| `verify_compliance`             | security  | Validate TCPA/CAN-SPAM/GDPR compliance         |
| `scan_secrets`                  | security  | Detect exposed credentials                     |
| `threat_model`                  | security  | STRIDE threat modeling for component           |
| `fix_security_issue`            | security  | Auto-fix security vulnerabilities              |
| `review_dependency_update`      | security  | Review dependency updates for security         |
| **Architect Tasks**             |           |                                                |
| `design_proposal`               | architect | Create design proposal for features            |
| `technical_review`              | architect | Review implementation plan for soundness       |
| `review_design`                 | architect | Review design against principles               |
| `suggest_refactor`              | architect | Suggest refactoring for complex code           |
| `update_documentation`          | architect | Update documentation with Claude API           |
| `check_documentation_freshness` | architect | Check for stale documentation                  |
| `check_complexity`              | architect | Check code complexity metrics                  |
| `audit_documentation`           | architect | Verify documentation matches reality           |
| `check_branch_health`           | architect | Check for stale branches                       |
| `profile_performance`           | architect | Profile pipeline performance                   |
| `review_documentation`          | architect | Review documentation accuracy                  |
| **Triage Tasks**                |           |                                                |
| `classify_error`                | triage    | Classify error and route to agent              |
| `route_task`                    | triage    | Route generic task to agent                    |
| `prioritize_tasks`              | triage    | Prioritize pending tasks                       |
| **Monitor Tasks**               |           |                                                |
| `scan_logs`                     | monitor   | Scan logs for errors (self-scheduling)         |
| `check_agent_health`            | monitor   | Monitor agent success rates                    |
| `check_process_compliance`      | monitor   | Verify workflow adherence                      |
| `detect_anomaly`                | monitor   | Detect anomalous behavior                      |
| `check_pipeline_health`         | monitor   | Check pipeline for blockages                   |
| `check_slo_compliance`          | monitor   | Check SLO compliance metrics                   |

### Auto-Delegation

When an agent receives a task type it doesn't handle, it automatically delegates to the correct agent using `BaseAgent.delegateToCorrectAgent()`:

**Example:** If `implement_feature` is mistakenly assigned to `monitor`:

1. Monitor calls `delegateToCorrectAgent(task)`
2. Creates new task assigned to `developer`
3. Completes original task with delegation note
4. Logs routing correction for analysis

This prevents "Unknown task type" errors and ensures no tasks are lost due to misrouting.

### Common Routing Errors Fixed

**Before (Errors):**

- `implement_feature` → monitor, triage, qa, security, architect ❌
- `fix_bug` → architect ❌
- `review_documentation` → unknown ❌
- `review_dependency_update` → unknown ❌

**After (Correct Routing):**

- `implement_feature` → developer ✅
- `fix_bug` → developer ✅
- `review_documentation` → architect ✅
- `review_dependency_update` → security ✅

### Testing

Run task routing tests:

```bash
node --test tests/agents/task-routing.test.js
```

This validates all task types are correctly mapped and delegation works properly.

---

## Workflows

### Standard Workflow Types

#### 1. Feature Implementation (Significant)

Used for breaking changes, database migrations, or features >4 hours effort.
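
The threshold above can be written as a small predicate. This is a minimal sketch only: the function name `isSignificantFeature` and the input fields (`breakingChange`, `needsMigration`, `estimatedHours`) are hypothetical and are not taken from the codebase.

```javascript
// Hypothetical helper; the name and input shape are illustrative assumptions.
const SIGNIFICANT_EFFORT_HOURS = 4;

function isSignificantFeature({
  breakingChange = false,
  needsMigration = false,
  estimatedHours = 0,
} = {}) {
  // Any single criterion is enough to take the significant path.
  return (
    breakingChange ||
    needsMigration ||
    estimatedHours > SIGNIFICANT_EFFORT_HOURS
  );
}
```

A request for which this returns `true` would follow the flow below:
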

```
Product Request
    ↓
Architect: design_proposal
    ↓
Status: awaiting_po_approval
    ↓
PO Reviews → Approves/Rejects
    ↓ (approved)
Developer: implementation_plan
    ↓
Status: awaiting_architect_approval
    ↓
Architect: technical_review → Approves/Rejects
    ↓ (approved)
Developer: implement_feature
    ↓
QA: verify_fix
    ↓
Security: audit_code (if needed)
```

**Example:**

```bash
npm run agent:workflow -- --workflow feature --description "Add two-factor authentication" --requirements '["SMS OTP","Email backup codes","Recovery process"]'

# View approval queue
npm run agent:approvals -- --status awaiting_po_approval

# Approve design
npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision approved
```

#### 2. Feature Implementation (Minor)

Used for small features ≤4 hours, no breaking changes or migrations.

```
Product Request
    ↓
Architect: design_proposal (auto-approved)
    ↓
Developer: implementation_plan
    ↓
Architect: technical_review
    ↓
Developer: implement_feature
    ↓
QA: verify_fix
```

**Example:**

```bash
npm run agent:workflow -- --workflow feature --description "Add logging to enrich stage" --requirements '["Log contact count","Log errors"]'
```

#### 3. Bug Fix (Architectural)

For bugs affecting multiple modules or requiring schema changes.

```
Error Detected
    ↓
Triage: classify_error → architectural
    ↓
Architect: design_proposal
    ↓
Status: awaiting_po_approval
    ↓
PO Approves
    ↓
Developer: implementation_plan
    ↓
Architect: technical_review
    ↓
Developer: fix_bug
    ↓
QA: verify_fix
```

#### 4. Bug Fix (Standard)

For isolated bugs in a single file with low complexity.

```
Error Detected
    ↓
Triage: classify_error → simple
    ↓
Developer: fix_bug
    ↓
QA: verify_fix
```

**Example:**

```bash
npm run agent:workflow -- --workflow bug-fix --error "TypeError: Cannot read property 'score' of null" --file src/scoring.js --stack "..."
```

#### 5. Refactor Workflow

For code complexity reduction or architectural improvements.

```
Complexity Detected
    ↓
Architect: design_proposal
    ↓
Developer: implementation_plan
    ↓
Architect: technical_review
    ↓
Developer: implement refactoring
    ↓
QA: verify_fix (ensure no regressions)
```

**Example:**

```bash
npm run agent:workflow -- --workflow refactor --file src/complex-module.js --reason "Cyclomatic complexity exceeds 15"
```

### Approval System

#### Product Owner Approval

**Required for:**

- Breaking changes
- Database migrations
- Features with >4 hours estimated effort
- Changes explicitly marked "significant"

**Process:**

1. Architect creates design_proposal task
2. Task status → `awaiting_po_approval`
3. Task appears in human_review_queue
4. PO reviews via CLI: `npm run agent:approvals`
5. PO approves/rejects via: `npm run agent:approve`

**Approval Schema:**

```json
{
  "decision": "approved | approved_with_conditions | rejected",
  "reviewer": "Jason",
  "timestamp": "2026-02-15T10:30:00Z",
  "notes": "Looks good, keep scope tight",
  "conditions": ["Max 2 files", "No new dependencies"]
}
```

#### Architect Approval

**Required for:**

- All implementation plans
- Refactorings
- Performance optimizations

**Review Criteria:**

- Files won't exceed 150 lines
- Test coverage ≥85%
- Documentation updated
- No circular dependencies
- Follows architectural patterns

**Process:**

1. Developer creates implementation_plan
2. Task status → `awaiting_architect_approval`
3. Architect agent reviews plan
4. Creates technical_review task
5. Approves → status back to `pending`, Developer proceeds
6. Rejects → feedback to Developer, plan revised

---

## Horizontal Scaling

The agent system supports horizontal scaling through row-level task locking, allowing multiple instances of the same agent to run concurrently without conflicts.
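
A minimal sketch of what an atomic task claim could look like against the `agent_tasks` table. It assumes a SQLite driver whose handle exposes `prepare(sql).get(...params)` (the style used by `better-sqlite3`) and SQLite 3.35+ for `RETURNING`. The function name `claimNextTask`, the driver choice, and the ordering (higher `priority` first, then oldest) are assumptions, not the project's confirmed implementation.

```javascript
// Sketch only: claim one pending task for an agent in a single statement.
// Assumes higher `priority` values run first and ties go to the oldest task.
const CLAIM_SQL = `
  UPDATE agent_tasks
     SET status = 'running'
   WHERE id = (
           SELECT id
             FROM agent_tasks
            WHERE assigned_to = ?
              AND status = 'pending'
            ORDER BY priority DESC, created_at ASC
            LIMIT 1
         )
     AND status = 'pending'
  RETURNING id, task_type, priority, context_json
`;

// `db` is any handle with a prepare().get() API (assumed, not confirmed).
function claimNextTask(db, agentName) {
  const row = db.prepare(CLAIM_SQL).get(agentName);
  return row ?? null; // null when nothing is pending for this agent
}
```

Because the status check and the update happen in one statement, a second instance running the same query cannot be handed the same row: it either claims a different task or gets `null`.
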

### How It Works

**Row-Level Locking:**

- Each agent instance atomically claims individual tasks from the database
- SQLite transactions ensure only one instance can claim any given task
- Multiple instances safely process different tasks simultaneously
- No duplicate processing, even with 5+ concurrent instances

**Configuration:**

```env
# Enable row-level locking (default: true)
AGENT_ENABLE_ROW_LOCKING=true

# Allow concurrent instances of same agent (default: false)
AGENT_ALLOW_CONCURRENT_INSTANCES=true
```

### Running Multiple Instances

**Example: 3 Developer Agents:**

```bash
# Terminal 1
npm run agent:run:single developer &

# Terminal 2
npm run agent:run:single developer &

# Terminal 3
npm run agent:run:single developer &
```

All three instances will process different tasks concurrently. Work distribution is automatic and race-condition safe.

### Performance Benefits

**Task Throughput:**

- 1 developer agent: ~5-10 tasks/hour (depending on complexity)
- 3 developer agents: ~15-30 tasks/hour (3x throughput)
- 5 developer agents: ~25-50 tasks/hour (5x throughput, diminishing returns)

**When to Scale:**

- High task queue depth (>20 pending tasks)
- Long-running tasks (>5 minutes each)
- Time-sensitive workflows (critical bug fixes)

### Safety Mechanisms

**Agent-Level Locking (Optional):**

```env
# Set to true to turn off agent-level locking for horizontal scaling
AGENT_ALLOW_CONCURRENT_INSTANCES=true
```

When `AGENT_ALLOW_CONCURRENT_INSTANCES` is `false` (the default), agent-level locking stays on and only one instance of each agent runs at a time (backwards compatible).
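
The two scaling flags and their documented defaults could be read as follows. This is a minimal sketch: `readScalingConfig`, `parseBoolean`, and the returned property names are hypothetical, and treating only the literal string `true` as enabled is an assumption.

```javascript
// Hypothetical helpers; names and parsing rules are illustrative assumptions.
function parseBoolean(value, fallback) {
  // Unset or empty variables fall back to the documented default.
  if (value === undefined || value.trim() === '') return fallback;
  return value.trim().toLowerCase() === 'true';
}

function readScalingConfig(env = process.env) {
  return {
    // Documented default: true
    rowLocking: parseBoolean(env.AGENT_ENABLE_ROW_LOCKING, true),
    // Documented default: false (one instance per agent)
    allowConcurrentInstances: parseBoolean(
      env.AGENT_ALLOW_CONCURRENT_INSTANCES,
      false
    ),
  };
}
```

With no variables set, this yields row-level locking on and concurrent instances off, which matches the single-instance behavior described above.
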
833 834 **Task States:** 835 836 - `pending` → Available for claiming 837 - `running` → Claimed by an instance (atomic transition) 838 - `completed` → Finished successfully 839 - `failed` → Error after max retries 840 841 **Edge Cases Handled:** 842 843 - Race conditions: Transaction ensures atomic claiming 844 - Crashed instances: Stale lock cleanup after 2 minutes 845 - Duplicate processing: Prevented by atomic UPDATE WHERE status='pending' 846 847 ### Monitoring Concurrent Agents 848 849 **View running instances:** 850 851 ```bash 852 # Check agent states 853 npm run agent:list 854 855 # Monitor task processing 856 watch -n 5 'npm run agent:tasks' 857 ``` 858 859 **Database query:** 860 861 ```sql 862 SELECT 863 agent_name, 864 COUNT(*) as processing_count, 865 GROUP_CONCAT(id) as task_ids 866 FROM agent_tasks 867 WHERE status = 'running' 868 GROUP BY agent_name; 869 ``` 870 871 ### Limitations 872 873 **SQLite Concurrency:** 874 875 - WAL mode recommended for high concurrency 876 - ~10 concurrent writers is safe limit 877 - Consider PostgreSQL for >10 instances 878 879 **Cost Considerations:** 880 881 - Each instance makes LLM API calls 882 - Budget enforcement: `AGENT_DAILY_BUDGET=10` (USD) 883 - Emergency shutdown if >$5/hour spend rate 884 885 ### Best Practices 886 887 1. **Start with 2-3 instances** - Verify row-level locking works correctly 888 2. **Monitor task completion** - Ensure no duplicate processing 889 3. **Check database locks** - Avoid SQLite contention 890 4. **Scale gradually** - Add instances as queue depth increases 891 5. 
**Use priority wisely** - High-priority tasks processed first 892 893 ### Testing Concurrent Locking 894 895 ```bash 896 # Run concurrent locking tests 897 npm test tests/agents/concurrent-locking.test.js 898 ``` 899 900 Tests verify: 901 902 - No duplicate task processing 903 - Correct priority ordering 904 - Work distribution across instances 905 - Agent isolation (developer vs QA) 906 - Backwards compatibility with single instance 907 908 --- 909 910 ## CLI Commands 911 912 ### View Agent Status 913 914 ```bash 915 # List all agents with current status 916 npm run agent:list 917 918 # Output: 919 # Agent: monitor, Status: idle, Last run: 2026-02-15 10:25:00 920 # Agent: developer, Status: running, Current task: 42 921 # Circuit breaker: All agents operational 922 ``` 923 924 ### Manage Tasks 925 926 ```bash 927 # View all pending tasks 928 npm run agent:tasks 929 930 # View tasks for specific agent 931 npm run agent:tasks -- --assigned-to developer 932 933 # View tasks by status 934 npm run agent:tasks -- --status pending 935 npm run agent:tasks -- --status awaiting_po_approval 936 937 # View specific task details 938 npm run agent:tasks -- --task-id 42 939 ``` 940 941 ### Create Tasks Manually 942 943 ```bash 944 # Create task for developer 945 npm run agent:create -- --agent developer --task fix_bug --context '{"error":"...","file":"src/scoring.js"}' --priority 7 946 947 # Create task for QA 948 npm run agent:create -- --agent qa --task write_test --context '{"module":"scoring","function":"calculateScore"}' --priority 5 949 ``` 950 951 ### Trigger Workflows 952 953 ```bash 954 # Bug fix workflow 955 npm run agent:workflow -- --workflow bug-fix --error "TypeError: Cannot read property 'score' of null" --stage scoring 956 957 # Feature workflow 958 npm run agent:workflow -- --workflow feature --description "Add export to CSV" --requirements '["Export button","CSV format","Download trigger"]' 959 960 # Refactor workflow 961 npm run agent:workflow -- --workflow 
refactor --file src/complex-module.js --reason "Cyclomatic complexity exceeds 15" 962 ``` 963 964 ### Manage Approvals 965 966 ```bash 967 # View all pending approvals 968 npm run agent:approvals 969 970 # Filter by approval type 971 npm run agent:approvals -- --status awaiting_po_approval 972 npm run agent:approvals -- --status awaiting_architect_approval 973 974 # Approve task 975 npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision approved 976 977 # Approve with conditions 978 npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision approved_with_conditions --notes "Keep it simple" --conditions "Max 2 files,No new dependencies" 979 980 # Reject task 981 npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision rejected --notes "Scope too large, break into smaller pieces" 982 ``` 983 984 ### View Workflow Status 985 986 ```bash 987 # View workflow tree (parent/child tasks) 988 npm run agent:workflow:status -- --workflow-id 42 989 990 # Output shows task hierarchy and status 991 ``` 992 993 ### View Logs 994 995 ```bash 996 # View all agent logs 997 npm run agent:logs 998 999 # Filter by agent 1000 npm run agent:logs -- --agent-name developer 1001 1002 # Filter by task 1003 npm run agent:logs -- --task-id 42 1004 1005 # Filter by level 1006 npm run agent:logs -- --level error 1007 npm run agent:logs -- --agent-name developer --level error 1008 ``` 1009 1010 ### View Statistics 1011 1012 ```bash 1013 # View success rates and metrics 1014 npm run agent:stats 1015 1016 # Output: 1017 # Agent: developer, Tasks: 45, Success: 42, Failure: 3, Rate: 93% 1018 # Agent: qa, Tasks: 38, Success: 38, Failure: 0, Rate: 100% 1019 # Circuit breaker: All agents operational 1020 ``` 1021 1022 ### Run Agents Manually 1023 1024 ```bash 1025 # Run all agents once 1026 npm run agent:run 1027 1028 # Run with verbose logging 1029 npm run agent:run -- --verbose 1030 1031 # Process up to N tasks 1032 npm run agent:run -- --tasks=10 1033 1034 # Run single 
agent 1035 npm run agent:run:single 1036 ``` 1037 1038 --- 1039 1040 ## Configuration 1041 1042 ### Environment Variables 1043 1044 ```bash 1045 # Enable/disable agent system 1046 AGENT_SYSTEM_ENABLED=true 1047 1048 # Circuit breaker threshold (30% failure rate triggers disable) 1049 AGENT_CIRCUIT_BREAKER_THRESHOLD=0.3 1050 1051 # Rate limit (max invocations per hour) 1052 AGENT_MAX_INVOCATIONS_PER_HOUR=60 1053 1054 # Immediate invocation (default: true) 1055 # Event-driven agent invocation eliminates 5-minute cron delays 1056 # Agents invoke each other immediately after handoffs and task creation 1057 # Speeds up workflows 10-15x (from 15-20 min to < 2 min) 1058 # See docs/IMMEDIATE-INVOCATION.md for details 1059 AGENT_IMMEDIATE_INVOCATION=true 1060 1061 # Max chain depth (default: 10) 1062 # Prevents infinite loops by limiting consecutive immediate invocations 1063 # After reaching depth, agents fall back to cron polling 1064 AGENT_MAX_CHAIN_DEPTH=10 1065 1066 # Database path (for testing) 1067 DATABASE_PATH=./db/sites.db 1068 ``` 1069 1070 ### Quality Gates 1071 1072 **Developer Agent:** 1073 1074 - **Coverage gate:** 85%+ required BEFORE commits (HARD BLOCK) 1075 - Automatic test generation attempted if coverage <85% 1076 - Escalates to Architect if auto-fix fails 1077 1078 **QA Agent:** 1079 1080 - **Coverage gate:** 80%+ required to approve tasks (HARD BLOCK) 1081 - Creates `write_missing_tests` task if coverage <80% 1082 - Blocks parent task until coverage improves 1083 1084 **Other Gates:** 1085 1086 - **Retry limit:** 3 retries per task before marking as failed 1087 - **Task TTL:** Tasks pending >1 hour escalate to human review 1088 - **Circuit breaker:** >30% failure rate disables agent 1089 1090 ### Scheduling 1091 1092 **Immediate Invocation** (Event-Driven): 1093 1094 Agents are invoked immediately when: 1095 1096 - Another agent hands off a task (`handoff()`) 1097 - A new task is created (`createTask()`) 1098 1099 This eliminates 5-minute cron delays, 
speeding up workflows **10-15x** (from 15-20 minutes to < 2 minutes).

See [IMMEDIATE-INVOCATION.md](IMMEDIATE-INVOCATION.md) for details.

**Cron Fallback** (Scheduled Polling):

Agents also run via cron job every 5 minutes as a safety net:

```sql
-- cron_jobs table entry
INSERT INTO cron_jobs (name, schedule, handler, enabled)
VALUES ('agent-runner', '*/5 * * * *', 'node src/agents/runner.js', 1);
```

Cron picks up tasks that were missed by immediate invocation (e.g., due to errors or depth limits).

**Manual control:**

- Start: Set `enabled = 1` in cron_jobs
- Stop: Set `enabled = 0` in cron_jobs
- One-time run: `npm run agent:run`

---

## Safety Features

### Circuit Breaker

**Purpose:** Prevent runaway agent failures from consuming resources

**How it works:**

1. Monitors agent success/failure ratios
2. If failure rate >30% (and ≥10 tasks completed): Trigger circuit breaker
3. Agent status → `blocked`
4. Timestamp recorded in `agent_state.metrics_json`
5. Manual reset required

**When triggered:**

- Agent logged to `human_review_queue`
- All tasks for that agent paused
- Root cause investigation required

**Reset:**

```sql
UPDATE agent_state
SET status = 'idle',
    metrics_json = json_remove(metrics_json, '$.circuit_breaker_triggered_at')
WHERE agent_name = 'developer';
```

### Escalation to Human Review

Tasks automatically escalate to `human_review_queue` for:

- Database schema changes
- Breaking API changes
- Security-sensitive changes (auth, secrets, compliance)
- Circuit breaker triggers
- Stale tasks (pending >1 hour)
- Failed tasks after 3 retries

**Review queue:**

```bash
# View human review items
npm run agent:approvals

# Approve/reject from queue
npm run agent:approve -- --task-id <id> --reviewer "Name" --decision <approved|rejected>
```

### Audit Trail

Complete tracking of all agent actions:

**agent_logs table:**

- Every task execution logged with level (info, warning, error)
- Metadata includes context, decisions, file paths

**agent_messages table:**

- All inter-agent communication recorded
- Message types: handoff, question, answer, notification

**agent_tasks table:**

- Task status changes tracked
- Retry attempts logged
- Result stored in result_json

**agent_state table:**

- Agent status changes
- Metrics tracked (success/failure rates)
- Last run timestamps

### Rollback Protection

**Before making changes:**

1. Developer agent checks coverage
2. Architect reviews implementation plan
3. QA verifies changes don't break tests

**If something breaks:**

1. Monitor detects errors in logs
2. Triage classifies and routes
3. Developer creates fix
4. Workflow repeats with proper gates

**Manual rollback:**

```bash
# View recent changes
git log --oneline -5

# Rollback if needed
git revert <commit-hash>

# Trigger QA verification
npm run agent:create -- --agent qa --task verify_fix --context '{"commit":"..."}' --priority 10
```

---

## Cost Management

### Token Usage Reduction

- **Monolithic approach:** 100-150KB per invocation (full CLAUDE.md)
- **Multi-agent approach:** 20-25KB per invocation (base + role context)
- **Reduction:** 75-85%

### Breakdown by Agent

| Agent     | Context Size | Tokens/Invocation | Reduction |
| --------- | ------------ | ----------------- | --------- |
| Monitor   | 20KB         | ~5,000            | 80%       |
| Triage    | 23.5KB       | ~6,000            | 76%       |
| Developer | 21.3KB       | ~5,300            | 79%       |
| QA        | 23KB         | ~5,800            | 77%       |
| Security  | 21KB         | ~5,200            | 79%       |
| Architect | 25KB         | ~6,200            | 75%       |

### Rate Limiting

**Environment variable:**

```bash
AGENT_MAX_INVOCATIONS_PER_HOUR=60
```

**How it works:**

- Tracks invocations per hour in `agent_state.metrics_json`
- If limit exceeded: Agent status → `blocked`
- Resets every hour

**Monitoring:**

```bash
# Check invocation counts
npm run agent:stats

# View recent logs
npm run agent:logs -- --agent-name developer
```

### Budget Controls

**Prevent cost overruns:**

1. **Set rate limits:** `AGENT_MAX_INVOCATIONS_PER_HOUR=60`
2. **Monitor stats:** Run `npm run agent:stats` daily
3. **Review logs:** Check for unnecessary task creation
4. **Optimize context:** Keep context files lean and focused
5. **Use Haiku for simple tasks:** `AGENT_USE_HAIKU_FOR_SIMPLE_TASKS=true` (50-70% cost reduction)
6. **Use circuit breakers:** Prevent runaway failures

### Smart Model Selection (Haiku vs Sonnet)

**Cost optimization:** The agent system automatically selects the appropriate model based on task complexity.

**Cost comparison:**

| Model             | Input             | Output            | Use Case                              |
| ----------------- | ----------------- | ----------------- | ------------------------------------- |
| Claude 3.5 Haiku  | $0.80/M           | $4.00/M           | Simple pattern-based tasks            |
| Claude 3.5 Sonnet | $3.00/M           | $15.00/M          | Complex reasoning & code generation   |
| **Cost Savings**  | **3.75x cheaper** | **3.75x cheaper** | **50-70% reduction for simple tasks** |

**Haiku tasks (simple/pattern-based):**

- **Triage:** Error classification via pattern matching
- **Monitor:** Log scanning and anomaly detection
- **Security:** Regex-based security checks (SQL injection, secrets, command injection patterns)
- **QA:** Test file discovery and simple test generation

**Sonnet tasks (complex reasoning):**

- **Developer:** Bug fixing and code generation
- **Architect:** Design reviews and architectural decisions
- **Security:** Advanced threat modeling (STRIDE analysis)
- **QA:** Coverage analysis and complex integration tests

**Configuration:**

```bash
# Enable Haiku optimization (default: true for 50-70% cost reduction)
AGENT_USE_HAIKU_FOR_SIMPLE_TASKS=true
```

**Override model selection:**

```javascript
// Force Haiku
const result = await classifyIssue(agentName, taskId, errorMessage, {
  model: 'claude-3-5-haiku-20241022',
});

// Force Sonnet
const result = await analyzeCode(agentName, taskId, filePath, prompt, {
  model: 'claude-3-5-sonnet-20241022',
  complexity: 'complex',
});
```

**Track cost savings:**

```bash
npm run agent:stats
```

Output includes model breakdown:

```json
{
  "modelBreakdown": {
    "haiku": {
      "calls": 150,
      "cost": 0.45,
      "avgCost": 0.003
    },
    "sonnet": {
      "calls": 50,
      "cost": 1.2,
      "avgCost": 0.024
    },
    "savings": "27.3"
  }
}
```

The `savings` value is the percentage of total cost that comes from Haiku calls (here, 0.45 of 1.65, or about 27.3%).

**Expected savings:**

- Monitor/Triage agents: 60-70% cost reduction (mostly Haiku)
- Developer/Architect: 10-20% cost reduction (mostly Sonnet)
- Security: 30-40% cost reduction (mix of simple checks and complex modeling)
- QA: 40-50% cost reduction (test generation uses Haiku, coverage analysis uses Sonnet)

**Cost estimation:**

- Average task: ~6,000 input tokens + ~2,000 output tokens = ~8,000 tokens
- At 60 invocations/hour: ~480,000 tokens/hour (~360,000 input, ~120,000 output)
- At a flat $3/M tokens (the Sonnet input rate): ~$1.44/hour, or ~$35/day over 24 hours
- Pricing output tokens at Sonnet's $15/M rate instead: ~$1.08 + ~$1.80 = ~$2.88/hour, or ~$69/day

**Cost optimization tips:**

- Reduce task creation frequency if logs are clean
- Increase Monitor scan interval from 5 to 10 minutes
- Disable agents not currently needed
- Use smaller models for simple tasks (Haiku for classification)

---

## Troubleshooting

### Agent Not Processing Tasks

**Symptoms:**

- Tasks stuck in `pending` status
- Agent status shows `blocked`
- No recent logs for agent

**Diagnosis:**

```bash
# Check agent status
npm run agent:list

# Check circuit breaker
npm run agent:stats

# View error logs
npm run agent:logs -- --agent-name developer --level error
```

**Solutions:**

1. **Circuit breaker triggered:**

   ```sql
   -- Check metrics
   SELECT metrics_json FROM agent_state WHERE agent_name = 'developer';

   -- Reset if safe
   UPDATE agent_state
   SET status = 'idle',
       metrics_json = json_remove(metrics_json, '$.circuit_breaker_triggered_at')
   WHERE agent_name = 'developer';
   ```

2. **Rate limit exceeded:**

   ```bash
   # Wait for hourly reset, or increase limit
   AGENT_MAX_INVOCATIONS_PER_HOUR=120
   ```

3. **Agent disabled:**

   ```sql
   -- Re-enable agent
   UPDATE agent_state SET status = 'idle' WHERE agent_name = 'developer';
   ```

### Tasks Stuck in Pending

**Symptoms:**

- Tasks created but never start
- Task age >1 hour

**Diagnosis:**

```bash
# View pending tasks
npm run agent:tasks -- --status pending

# Check if agents are running
npm run agent:list

# Check task dependencies
npm run agent:workflow:status -- --workflow-id 42
```

**Solutions:**

1. **Parent task incomplete:**
   - Tasks with `parent_task_id` won't start until parent completes
   - Check parent status: `npm run agent:tasks -- --task-id <parent_id>`
   - Complete or cancel parent task

2. **Agent not running:**
   - Check cron job enabled: `SELECT * FROM cron_jobs WHERE name = 'agent-runner';`
   - Enable: `UPDATE cron_jobs SET enabled = 1 WHERE name = 'agent-runner';`
   - Manual run: `npm run agent:run`

3. **Task priority too low:**
   - Increase priority: `UPDATE agent_tasks SET priority = 10 WHERE id = 42;`

### High Token Costs

**Symptoms:**

- Higher than expected API bills
- Many agent invocations

**Diagnosis:**

```sql
-- Check invocation counts
SELECT agent_name, COUNT(*) as invocations
FROM agent_logs
WHERE created_at > datetime('now', '-1 hour')
GROUP BY agent_name;

-- Check task creation rate
SELECT task_type, COUNT(*) as count
FROM agent_tasks
WHERE created_at > datetime('now', '-24 hours')
GROUP BY task_type;
```

**Solutions:**

1. **Reduce invocation frequency:**

   ```bash
   # Lower rate limit
   AGENT_MAX_INVOCATIONS_PER_HOUR=30

   # Increase Monitor scan interval
   # Edit cron_jobs: '*/5 * * * *' → '*/10 * * * *'
   ```

2. **Optimize context:**
   - Review context files for unnecessary content
   - Remove duplicate information
   - Keep context files lean

3. **Disable unnecessary agents:**

   ```sql
   -- Temporarily disable Security agent
   UPDATE agent_state SET status = 'disabled' WHERE agent_name = 'security';
   ```

### Circuit Breaker Triggered

**Symptoms:**

- Agent status = `blocked`
- `circuit_breaker_triggered_at` in metrics_json

**Diagnosis:**

```bash
# View error logs
npm run agent:logs -- --agent-name developer --level error

# Check failure rate
npm run agent:stats
```

**Solutions:**

1. **Identify root cause:**
   - Review error logs for patterns
   - Check recent code changes
   - Verify external dependencies (DB, APIs)

2. **Fix underlying issue:**
   - If code bug: Fix and test
   - If external issue: Wait for resolution
   - If config issue: Update configuration

3. **Reset circuit breaker:**

   ```sql
   -- Only after fixing root cause!
   UPDATE agent_state
   SET status = 'idle',
       metrics_json = json_remove(metrics_json, '$.circuit_breaker_triggered_at')
   WHERE agent_name = 'developer';
   ```

### Tasks Failing Repeatedly

**Symptoms:**

- Task retry_count = 3
- Task status = `failed`
- Same error in multiple tasks

**Diagnosis:**

```bash
# View failed tasks (SQL query; assumes the sqlite3 CLI and the default DATABASE_PATH)
sqlite3 ./db/sites.db "SELECT * FROM agent_tasks WHERE status = 'failed' ORDER BY created_at DESC LIMIT 10;"

# Check error patterns
npm run agent:logs -- --level error
```

**Solutions:**

1. **Code issue:**
   - Manually fix the bug
   - Reset task: `UPDATE agent_tasks SET retry_count = 0, status = 'pending' WHERE id = 42;`

2. **Missing dependencies:**
   - Install required packages: `npm install`
   - Update environment variables

3. **Task too complex:**
   - Break into smaller subtasks
   - Provide more context in context_json

---

## Best Practices

### When to Use Agents

**✅ Use agents for:**

- Automated bug fixes from error logs
- Test generation for new features
- Security audits on commits
- Documentation freshness checks
- Refactoring suggestions
- Routine maintenance tasks

**❌ Don't use agents for:**

- Quick one-off tasks (just do it manually)
- Tasks requiring complex user input
- Real-time user interactions
- Tasks with high uncertainty (needs human judgment)
- Exploratory work without clear goals

### Task Design

**Be specific:**

```json
{
  "error": "TypeError: Cannot read property 'score' of null",
  "file": "src/scoring.js",
  "line": 42,
  "stack": "..."
}
```

**Include context:**

```json
{
  "files_changed": ["src/scoring.js", "src/utils/error-handler.js"],
  "related_issues": ["Issue #123"],
  "previous_attempts": ["Tried null check, still failing"]
}
```

**Set appropriate priority:**

- 10: Critical (system down, security breach)
- 7-9: High (blocking issue, major bug)
- 4-6: Medium (normal bugs, features)
- 1-3: Low (nice-to-haves, refactoring)

**Link parent tasks:**

```javascript
await createTask({
  task_type: 'verify_fix',
  assigned_to: 'qa',
  parent_task_id: 123, // Links to fix_bug task
  priority: 5,
});
```

### Message Design

**Use handoff for task completion:**

```javascript
await agent.sendMessage(taskId, 'qa', 'handoff', 'Bug fix complete, ready for verification', {
  commit: 'abc123',
  files_changed: ['src/scoring.js'],
});
```

**Use questions for clarification:**

```javascript
await agent.askQuestion(
  taskId,
  'developer',
  'Should this handle mobile and desktop screenshots differently?'
);
```

**Use notifications for FYI:**

```javascript
await agent.sendMessage(
  taskId,
  'architect',
  'notification',
  'Coverage gate blocked commit due to <85% coverage',
  { current_coverage: 78, required: 85 }
);
```

### Agent Development

**Keep agents focused:**

- Single responsibility principle
- One agent = one clear role
- Don't create "do everything" agents

**Log liberally:**

```javascript
await this.log(taskId, 'info', 'Starting bug fix analysis');
await this.log(taskId, 'info', 'Identified affected files', { files: [...]
});
await this.log(taskId, 'info', 'Generated fix, checking coverage');
```

**Fail gracefully:**

```javascript
try {
  const result = await this.analyzeBug(task);
  return result;
} catch (error) {
  await this.log(task.id, 'error', 'Bug analysis failed', { error: error.message });
  await this.failTask(task.id, { reason: 'Analysis failed', error: error.message });
  return null; // No result available; the task has been marked failed
}
```

**Validate inputs:**

```javascript
async processTask(task) {
  // Parse defensively: context_json may be missing or malformed
  let context;
  try {
    context = JSON.parse(task.context_json);
  } catch (error) {
    await this.failTask(task.id, { reason: 'Invalid context_json', error: error.message });
    return;
  }

  // Validate required fields
  if (!context || !context.error || !context.file) {
    await this.failTask(task.id, { reason: 'Missing required context fields' });
    return;
  }

  // Continue processing...
}
```

**Test thoroughly:**

- Unit tests for agent logic
- Integration tests for workflows
- Test error handling paths
- Verify circuit breaker behavior

### Monitoring and Maintenance

**Daily checks:**

```bash
# Check agent health
npm run agent:stats

# Review errors
npm run agent:logs -- --level error

# Check approval queue
npm run agent:approvals
```

**Weekly reviews:**

- Review circuit breaker triggers (if any)
- Analyze token usage trends
- Check task completion rates
- Review escalated items

**Monthly optimization:**

- Analyze agent effectiveness
- Optimize context files
- Update agent logic based on patterns
- Review and update approval thresholds

---

## Additional Resources

- **Agent System Architecture:** `/home/jason/code/333Method/docs/06-automation/agent-system.md`
- **Workflow System:** `/home/jason/code/333Method/docs/06-automation/agent-workflow.md`
- **Base Agent Code:** `/home/jason/code/333Method/src/agents/base-agent.js`
- **Agent Implementations:** `/home/jason/code/333Method/src/agents/`
- **Context Files:** `/home/jason/code/333Method/src/agents/contexts/`
- **CLI Manager:** `/home/jason/code/333Method/src/cli/agent-manager.js`
- **Database Schema:** `/home/jason/code/333Method/db/schema.sql`
- **Migrations:** `/home/jason/code/333Method/db/migrations/041-create-agent-system.sql`

---

**Last Updated:** 2026-02-15
**Version:** 1.0
**Status:** Production-ready