   1  # Multi-Agent System Guide
   2  
   3  ## Table of Contents
   4  
- [Overview](#overview)
- [Architecture](#architecture)
- [Getting Started](#getting-started)
- [Agents](#agents)
- [Task Routing](#task-routing)
- [Workflows](#workflows)
- [Horizontal Scaling](#horizontal-scaling)
- [CLI Commands](#cli-commands)
- [Configuration](#configuration)
- [Safety Features](#safety-features)
- [Cost Management](#cost-management)
- [Troubleshooting](#troubleshooting)
- [Best Practices](#best-practices)
  16  
  17  ---
  18  
  19  ## Overview
  20  
  21  The 333 Method uses a database-driven multi-agent system where specialized AI agents collaborate autonomously to handle development, testing, security, and architecture tasks.
  22  
  23  ### Benefits
  24  
  25  - **Token efficiency**: 75-85% reduction vs monolithic approach (20-25KB per invocation vs 100-150KB)
  26  - **Specialization**: Each agent has focused responsibilities and optimized context
  27  - **Peer review**: Built-in workflows ensure quality through agent collaboration
  28  - **Autonomy**: Agents work continuously via cron scheduling
  29  - **Audit trail**: Complete tracking of all agent actions and decisions
  30  
  31  ### How It Works
  32  
  33  1. **Monitor Agent** scans logs every 5 minutes and creates tasks for detected issues
  34  2. **Triage Agent** classifies errors and routes tasks to appropriate agents
  35  3. **Developer Agent** fixes bugs and implements features
  36  4. **QA Agent** verifies fixes and enforces test coverage gates
  37  5. **Security Agent** performs security reviews and compliance checks
  38  6. **Architect Agent** reviews designs and maintains documentation freshness
  39  
  40  Agents communicate through a database-driven message queue, creating a collaborative workflow where each agent builds on others' work.
  41  
  42  ---
  43  
  44  ## Architecture
  45  
  46  ### Core Components
  47  
  48  #### 1. Database Tables (Migration 041, 051)
  49  
  50  **agent_tasks** - Task queue with priority and status tracking
  51  
  52  ```sql
  53  CREATE TABLE agent_tasks (
  54    id INTEGER PRIMARY KEY AUTOINCREMENT,
  55    task_type TEXT NOT NULL,
  56    assigned_to TEXT NOT NULL,
  57    status TEXT NOT NULL,
  58    priority INTEGER DEFAULT 5,
  59    parent_task_id INTEGER,
  60    context_json TEXT,
  61    result_json TEXT,
  62    retry_count INTEGER DEFAULT 0,
  63    reviewed_by TEXT,
  64    approval_json TEXT,
  65    created_at TEXT DEFAULT CURRENT_TIMESTAMP
  66  );
  67  ```
  68  
  69  **agent_messages** - Inter-agent communication
  70  
  71  ```sql
  72  CREATE TABLE agent_messages (
  73    id INTEGER PRIMARY KEY AUTOINCREMENT,
  74    task_id INTEGER NOT NULL,
  75    from_agent TEXT NOT NULL,
  76    to_agent TEXT NOT NULL,
  77    message_type TEXT NOT NULL,
  78    message_text TEXT,
  79    metadata_json TEXT,
  80    created_at TEXT DEFAULT CURRENT_TIMESTAMP
  81  );
  82  ```
  83  
  84  **agent_logs** - Execution audit trail
  85  
  86  ```sql
  87  CREATE TABLE agent_logs (
  88    id INTEGER PRIMARY KEY AUTOINCREMENT,
  89    task_id INTEGER,
  90    agent_name TEXT NOT NULL,
  91    level TEXT NOT NULL,
  92    message TEXT NOT NULL,
  93    metadata_json TEXT,
  94    created_at TEXT DEFAULT CURRENT_TIMESTAMP
  95  );
  96  ```
  97  
  98  **agent_state** - Agent status and metrics
  99  
 100  ```sql
 101  CREATE TABLE agent_state (
 102    agent_name TEXT PRIMARY KEY,
 103    status TEXT NOT NULL,
 104    current_task_id INTEGER,
 105    last_run_at TEXT,
 106    metrics_json TEXT
 107  );
 108  ```
 109  
 110  **agent_outcomes** - Task outcomes for learning (Migration 052)
 111  
 112  ```sql
 113  CREATE TABLE agent_outcomes (
 114    id INTEGER PRIMARY KEY AUTOINCREMENT,
 115    task_id INTEGER NOT NULL REFERENCES agent_tasks(id) ON DELETE CASCADE,
 116    agent_name TEXT NOT NULL,
 117    task_type TEXT NOT NULL,
 118    outcome TEXT NOT NULL CHECK(outcome IN ('success', 'failure')),
 119    context_json TEXT,  -- Task-specific context (error_type, file_path, etc.)
 120    result_json TEXT,   -- Task result details (what worked, what didn't)
 121    duration_ms INTEGER,
 122    created_at DATETIME DEFAULT CURRENT_TIMESTAMP
 123  );
 124  ```
 125  
 126  This table enables **task history and learning** - agents learn from past successes and failures to improve future performance. See [docs/agents/task-history.md](./agents/task-history.md) for details.
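
As a rough illustration of how recorded outcomes could inform later work, the sketch below aggregates `agent_outcomes` rows into a per-task-type success rate. `summarizeOutcomes` and the in-memory row array are assumptions made for this example; the project's own history logic (see `context-builder.js`) may work differently.

```javascript
// Hypothetical helper: aggregate agent_outcomes rows by task_type.
// Rows are plain objects using the column names from the table above.
function summarizeOutcomes(rows) {
  const summary = {};
  for (const row of rows) {
    const entry = (summary[row.task_type] ??= { success: 0, failure: 0 });
    entry[row.outcome] += 1; // outcome is constrained to 'success' | 'failure'
  }
  for (const entry of Object.values(summary)) {
    const total = entry.success + entry.failure;
    entry.successRate = total === 0 ? 0 : entry.success / total;
  }
  return summary;
}

// Made-up rows, only to exercise the sketch
const sampleOutcomes = [
  { task_type: 'fix_bug', outcome: 'success' },
  { task_type: 'fix_bug', outcome: 'failure' },
  { task_type: 'fix_bug', outcome: 'success' },
  { task_type: 'write_test', outcome: 'success' },
];
console.log(summarizeOutcomes(sampleOutcomes));
```

An agent could consult such a summary before choosing an approach, for example preferring strategies that succeeded for the same `task_type`.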
 127  
 128  #### 2. Context Files
 129  
 130  Each agent loads a base context (~15KB) plus role-specific context:
 131  
 132  | Agent     | Context Files          | Total Size |
 133  | --------- | ---------------------- | ---------- |
 134  | Monitor   | base.md + monitor.md   | 20KB       |
 135  | Triage    | base.md + triage.md    | 23.5KB     |
 136  | Developer | base.md + developer.md | 21.3KB     |
 137  | QA        | base.md + qa.md        | 23KB       |
 138  | Security  | base.md + security.md  | 21KB       |
 139  | Architect | base.md + architect.md | 25KB       |
 140  
**Location:** `src/agents/contexts/` (relative to the repository root)
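
The base-plus-role layout in the table can be pictured with the sketch below. It is not the code in `context-loader.js`: `contextFilesFor`, `loadContext`, the injected `readFile` callback, and merging by plain concatenation are all assumptions.

```javascript
// Hypothetical sketch of resolving and merging an agent's context files.
const CONTEXT_DIR = 'src/agents/contexts'; // assumed relative to the repo root
const AGENT_NAMES = ['monitor', 'triage', 'developer', 'qa', 'security', 'architect'];

function contextFilesFor(agentName) {
  if (!AGENT_NAMES.includes(agentName)) {
    throw new Error(`Unknown agent: ${agentName}`);
  }
  // Every agent gets base.md first, then its role-specific file.
  return ['base.md', `${agentName}.md`].map((file) => `${CONTEXT_DIR}/${file}`);
}

// readFile is injected so the sketch stays independent of the file system.
function loadContext(agentName, readFile) {
  return contextFilesFor(agentName).map(readFile).join('\n\n');
}

console.log(contextFilesFor('qa'));
```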
 142  
 143  #### 3. Agent Framework
 144  
 145  **BaseAgent class** (`src/agents/base-agent.js`)
 146  
 147  - Task polling and execution
 148  - Message sending/receiving
 149  - Logging and error handling
 150  - Circuit breaker integration
 151  
 152  **Utility modules:**
 153  
 154  - `context-loader.js` - Merges context files
 155  - `context-builder.js` - Enriches context with task history for learning
 156  - `task-manager.js` - CRUD operations for tasks
 157  - `message-manager.js` - Inter-agent messaging
 158  
 159  ### Workflow States
 160  
 161  Tasks progress through these states:
 162  
```
pending → running → completed
             ↓
         awaiting_po_approval → approved → pending
             ↓
         awaiting_architect_approval → approved → pending
             ↓
            failed
             ↓
           blocked
```
 176  
 177  - `pending` - Ready to work on
 178  - `running` - Currently being processed by an agent
 179  - `awaiting_po_approval` - Design proposal waiting for Product Owner sign-off
 180  - `awaiting_architect_approval` - Implementation plan waiting for technical review
 181  - `completed` - Successfully finished
 182  - `failed` - Failed after 3 retry attempts
 183  - `blocked` - Blocked on external dependency or human action
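
The state list can be read as a small transition table. The sketch below is one possible encoding, not the project's implementation. `TRANSITIONS` and `canTransition` are hypothetical names, the `approved` step is folded into the return to `pending`, and the `running → pending` (retry) and `blocked → pending` (unblocked) edges are assumptions inferred from the descriptions.

```javascript
// Hypothetical transition table derived from the state descriptions above.
const TRANSITIONS = {
  pending: ['running'],
  running: [
    'completed',
    'awaiting_po_approval',
    'awaiting_architect_approval',
    'failed',
    'blocked',
    'pending', // assumption: a retry puts the task back in the queue
  ],
  awaiting_po_approval: ['pending'], // after approval
  awaiting_architect_approval: ['pending'], // after approval
  blocked: ['pending'], // assumption: unblocked tasks are re-queued
  completed: [],
  failed: [],
};

function canTransition(from, to) {
  const known = Object.prototype.hasOwnProperty.call(TRANSITIONS, from);
  return known && TRANSITIONS[from].includes(to);
}

console.log(canTransition('pending', 'running')); // true
console.log(canTransition('completed', 'pending')); // false
```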
 184  
 185  ---
 186  
 187  ## Getting Started
 188  
 189  ### Prerequisites
 190  
1. Database initialized with agent tables (migrations 041 and 051, plus 052 for `agent_outcomes`)
2. Environment variable set: `AGENT_SYSTEM_ENABLED=true`
3. Cron system enabled to run agents every 5 minutes
 194  
 195  ### Quick Start
 196  
 197  #### 1. Enable the Agent System
 198  
 199  ```bash
 200  # Add to .env
 201  echo "AGENT_SYSTEM_ENABLED=true" >> .env
 202  ```
 203  
 204  #### 2. Bootstrap the Monitor Agent
 205  
 206  The Monitor agent needs an initial task to start its self-scheduling loop:
 207  
 208  ```bash
 209  npm run agent:create -- --agent monitor --task scan_logs --context '{"incremental":true}' --priority 5
 210  ```
 211  
 212  #### 3. Verify Agents Are Running
 213  
 214  ```bash
 215  # Check agent status
 216  npm run agent:list
 217  
 218  # View pending tasks
 219  npm run agent:tasks
 220  
 221  # View recent logs
 222  npm run agent:logs -- --level info
 223  ```
 224  
 225  #### 4. Trigger a Test Workflow
 226  
 227  ```bash
 228  # Test bug fix workflow
 229  npm run agent:workflow -- --workflow bug-fix --error "Test error for verification" --stage scoring
 230  
 231  # Check workflow status
 232  npm run agent:tasks
 233  ```
 234  
 235  ---
 236  
 237  ## Agents
 238  
 239  ### 1. Monitor Agent
 240  
 241  **Role:** System immune system - proactive detection of issues
 242  
 243  **Responsibilities:**
 244  
 245  - Scan log files for ERROR/FATAL patterns every 5 minutes
 246  - Detect looping errors (same error >3x in 1 hour)
 247  - Monitor stale tasks (pending >1 hour)
 248  - Verify process compliance (expected stage transitions)
 249  - Track agent health (success/failure ratios)
 250  - Check documentation drift daily
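
The looping-error rule (same error more than 3 times in 1 hour) could be checked along these lines. This is a minimal sketch: `findLoopingErrors` and the `{ message, timestamp }` entry shape are assumptions, and it compares messages by exact string match, which the real Monitor may not do.

```javascript
const LOOP_THRESHOLD = 3; // "same error >3x"
const LOOP_WINDOW_MS = 60 * 60 * 1000; // "in 1 hour"

// entries: [{ message: string, timestamp: number (ms since epoch) }]
function findLoopingErrors(entries, now = Date.now()) {
  const counts = new Map();
  for (const { message, timestamp } of entries) {
    const age = now - timestamp;
    if (age < 0 || age > LOOP_WINDOW_MS) continue; // outside the window
    counts.set(message, (counts.get(message) ?? 0) + 1);
  }
  // Keep only messages seen more than LOOP_THRESHOLD times.
  return [...counts]
    .filter(([, count]) => count > LOOP_THRESHOLD)
    .map(([message]) => message);
}
```

Each returned message would then become a task for Triage rather than being reported once per occurrence.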
 251  
 252  **Task Types:**
 253  
 254  - `scan_logs` - Incremental log scanning (self-scheduling)
 255  - `check_agent_health` - Monitor agent success rates
 256  - `check_process_compliance` - Verify workflow adherence
 257  - `check_doc_freshness` - Detect stale documentation
 258  
 259  **Context Size:** 20KB (base.md + monitor.md)
 260  
 261  **Self-Scheduling:** Creates new `scan_logs` task after each completion
 262  
 263  **Example:**
 264  
 265  ```bash
 266  # View Monitor status
 267  npm run agent:list | grep monitor
 268  
 269  # View Monitor logs
 270  npm run agent:logs -- --agent-name monitor
 271  ```
 272  
 273  ### 2. Triage Agent
 274  
 275  **Role:** Error classifier and task router
 276  
 277  **Responsibilities:**
 278  
 279  - Classify errors by type (null_pointer, network, database_constraint, api_error, security, configuration)
 280  - Determine severity (critical, high, medium, low)
 281  - Calculate priority (1-10 scale based on severity + impact)
 282  - Route tasks to appropriate agents
 283  - Suggest initial fix approaches
 284  
 285  **Task Types:**
 286  
 287  - `classify_error` - Analyze error and create appropriate task
 288  
 289  **Context Size:** 23.5KB (base.md + triage.md)
 290  
 291  **Routing Logic:**
 292  
 293  - Security errors → Security Agent (priority 10)
 294  - Database/network/API errors → Developer Agent
 295  - Complex architectural issues → Architect Agent
 296  - Configuration errors → Developer Agent
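
A sketch of the routing rules as a function follows. `routeClassifiedError`, the `architectural` flag, and the returned object shape are hypothetical. Sending `null_pointer` errors to the Developer Agent and defaulting to priority 5 are assumptions, since the rules above do not cover them.

```javascript
// Error types assumed to go to the Developer Agent.
const DEVELOPER_ERROR_TYPES = new Set([
  'null_pointer', // assumption: not listed in the routing rules above
  'network',
  'database_constraint',
  'api_error',
  'configuration',
]);

function routeClassifiedError(errorType, { architectural = false, priority = 5 } = {}) {
  if (errorType === 'security') {
    return { assigned_to: 'security', priority: 10 }; // always top priority
  }
  if (architectural) {
    return { assigned_to: 'architect', priority };
  }
  if (DEVELOPER_ERROR_TYPES.has(errorType)) {
    return { assigned_to: 'developer', priority };
  }
  throw new Error(`Unclassified error type: ${errorType}`);
}

console.log(routeClassifiedError('network', { priority: 7 }));
```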
 297  
 298  **Example:**
 299  
 300  ```bash
 301  # Manually trigger triage
 302  npm run agent:create -- --agent triage --task classify_error --context '{"error":"TypeError: Cannot read property score of null","file":"src/scoring.js"}' --priority 7
 303  ```
 304  
 305  ### 3. Developer Agent
 306  
 307  **Role:** Bug fixes and feature implementation
 308  
 309  **Responsibilities:**
 310  
 311  - Analyze error messages and stack traces
 312  - Extract affected file paths
 313  - Generate bug fixes
 314  - Implement new features
 315  - **CRITICAL:** Enforce 85%+ code coverage before commits
 316  - Create git commits (only if coverage gate passes)
 317  - Hand off to QA for verification
 318  
 319  **Task Types:**
 320  
 321  - `fix_bug` - Analyze and fix bugs
 322  - `implement_feature` - Build new features
 323  - `implementation_plan` - Create detailed implementation plan
 324  
 325  **Context Size:** 21.3KB (base.md + developer.md)
 326  
 327  **Coverage Gate:**
 328  Developer enforces 85%+ coverage BEFORE creating commits:
 329  
 330  1. Make code changes
 331  2. Run `checkCoverageBeforeCommit(files, taskId)`
 332  3. If coverage <85%: Attempt automatic test generation
 333  4. If auto-fix fails: Escalate to Architect for guidance
 334  5. Only commit if coverage ≥85%
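
The five steps can be sketched as a decision function. This is not the project's `checkCoverageBeforeCommit`: `coverageGateDecision` and its two injected callbacks are hypothetical, and the sketch is synchronous for brevity where real coverage runs would be asynchronous.

```javascript
const DEV_COVERAGE_THRESHOLD = 85; // percent

// measureCoverage() -> number (percent); generateTests() -> void.
// Both are stand-ins supplied by the caller.
function coverageGateDecision({ measureCoverage, generateTests }) {
  let coverage = measureCoverage(); // step 2
  if (coverage >= DEV_COVERAGE_THRESHOLD) {
    return { action: 'commit', coverage }; // step 5
  }
  generateTests(); // step 3: attempt automatic test generation
  coverage = measureCoverage();
  if (coverage >= DEV_COVERAGE_THRESHOLD) {
    return { action: 'commit', coverage };
  }
  return { action: 'escalate_to_architect', coverage }; // step 4
}
```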
 335  
 336  **Workflow Example:**
 337  
 338  ```
 339  1. Receive fix_bug task from Triage
 340  2. Analyze error and identify affected files
 341  3. Generate fix
 342  4. Run coverage check
 343  5. If coverage passes: Create commit
 344  6. Create verify_fix task for QA
 345  7. Send handoff message to QA
 346  ```
 347  
 348  **Example:**
 349  
 350  ```bash
 351  # View Developer tasks
 352  npm run agent:tasks -- --assigned-to developer
 353  
 354  # Trigger bug fix
 355  npm run agent:workflow -- --workflow bug-fix --error "..." --file src/scoring.js
 356  ```
 357  
 358  ### 4. QA Agent
 359  
 360  **Role:** Test generation, verification, coverage enforcement
 361  
 362  **Responsibilities:**
 363  
 364  - Generate unit tests for new features
 365  - Verify bug fixes work correctly
 366  - Enforce 80%+ coverage gate (HARD BLOCK on task completion)
 367  - Run test suite and parse coverage reports
 368  - Create feedback for developers on failures
 369  - Tag regression tests
 370  
 371  **Task Types:**
 372  
 373  - `write_test` - Generate unit test
 374  - `verify_fix` - Verify bug fix works
 375  - `check_coverage` - Ensure 80%+ coverage
 376  - `write_missing_tests` - Fill coverage gaps
 377  
 378  **Context Size:** 23KB (base.md + qa.md)
 379  
 380  **Coverage Gate:**
 381  QA enforces 80%+ coverage AFTER commits as a second safety layer:
 382  
 383  1. Receive verify_fix task
 384  2. Run tests for changed files
 385  3. Check coverage with c8
 386  4. If <80%: Create write_missing_tests task, block parent task
 387  5. If ≥80%: Mark task complete
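
The outcome of steps 4 and 5 might look like the following sketch. `qaVerifyOutcome` and its return shape are hypothetical, and using the `blocked` status for the parent task is an assumption based on "block parent task".

```javascript
const QA_COVERAGE_THRESHOLD = 80; // percent

// task: an agent_tasks row; coverage: measured percent for the changed files.
function qaVerifyOutcome(task, coverage) {
  if (coverage >= QA_COVERAGE_THRESHOLD) {
    return { parentStatus: 'completed', newTasks: [] };
  }
  return {
    parentStatus: 'blocked', // assumption: parent waits on the follow-up task
    newTasks: [
      {
        task_type: 'write_missing_tests',
        assigned_to: 'qa',
        parent_task_id: task.id,
        status: 'pending',
      },
    ],
  };
}

console.log(qaVerifyOutcome({ id: 42 }, 72));
```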
 388  
 389  **Example:**
 390  
 391  ```bash
 392  # View QA tasks
 393  npm run agent:tasks -- --assigned-to qa
 394  
 395  # Check recent verifications
 396  npm run agent:logs -- --agent-name qa --level info
 397  ```
 398  
 399  ### 5. Security Agent
 400  
 401  **Role:** Security audits, compliance, vulnerability scanning
 402  
 403  **Responsibilities:**
 404  
 405  - Code security reviews (SQL injection, XSS, command injection)
 406  - Dependency vulnerability scanning (`npm audit`)
 407  - Secrets detection (hardcoded keys, credentials)
 408  - TCPA/CAN-SPAM/GDPR compliance validation
 409  - Track vulnerability remediation time
 410  
 411  **Task Types:**
 412  
- `audit_code` - Security code review
- `scan_dependencies` - Check for vulnerable dependencies
- `compliance_check` - Validate TCPA/CAN-SPAM/GDPR adherence
- `scan_secrets` - Detect exposed credentials
 417  
 418  **Context Size:** 21KB (base.md + security.md)
 419  
 420  **Example:**
 421  
 422  ```bash
 423  # Trigger security audit
 424  npm run agent:create -- --agent security --task audit_code --context '{"files":["src/outreach/sms.js"]}' --priority 8
 425  
 426  # View security findings
 427  npm run agent:logs -- --agent-name security --level error
 428  ```
 429  
 430  ### 6. Architect Agent
 431  
 432  **Role:** Design review, refactoring, documentation freshness
 433  
 434  **Responsibilities:**
 435  
- Design reviews for new features
- Refactoring suggestions based on complexity analysis
- Code complexity monitoring (max 150 lines per file, cyclomatic complexity up to 15)
- Documentation freshness checks
- Schema change validation
- Create Architecture Decision Records (ADRs)
 442  
 443  **Task Types:**
 444  
 445  - `design_proposal` - Create design document for significant changes
 446  - `technical_review` - Review implementation plans
 447  - `suggest_refactor` - Recommend refactoring
 448  - `update_documentation` - Fix stale docs
 449  - `review_design` - Evaluate feature designs
 450  
 451  **Context Size:** 25KB (base.md + architect.md)
 452  
 453  **Documentation Freshness Checks:**
 454  On every commit, Architect verifies:
 455  
 456  - New env vars → `.env.example` updated?
 457  - New npm scripts → `README.md` updated?
 458  - New modules → `CLAUDE.md` updated?
 459  - Schema changes → `db/schema.sql` + migration?
 460  - Features done → `docs/TODO.md` updated?
 461  
 462  **Example:**
 463  
 464  ```bash
 465  # Request design review
 466  npm run agent:create -- --agent architect --task design_proposal --context '{"feature":"Dark mode toggle","requirements":["Settings UI","Persistence","Global theme"]}' --priority 6
 467  
 468  # View pending reviews
 469  npm run agent:tasks -- --assigned-to architect --status awaiting_po_approval
 470  ```
 471  
 472  ---
 473  
 474  ## Task Routing
 475  
 476  The agent system uses a centralized task routing configuration to ensure tasks are always assigned to the correct agent.
 477  
 478  ### Routing Configuration
 479  
 480  **Location:** `src/agents/utils/task-routing.js`
 481  
 482  This module provides:
 483  
 484  - `TASK_ROUTING` - Complete mapping of task types to agents
 485  - `getAgentForTaskType(taskType)` - Get correct agent for a task type
 486  - `validateTaskAssignment(taskType, assignedTo)` - Validate task is correctly routed
 487  - `getTaskTypesForAgent(agentName)` - Get all task types an agent handles
 488  
 489  ### Complete Task Type Reference
 490  
 491  | Task Type                       | Agent     | Description                                    |
 492  | ------------------------------- | --------- | ---------------------------------------------- |
 493  | **Developer Tasks**             |           |                                                |
 494  | `fix_bug`                       | developer | Fix bugs identified by Triage                  |
 495  | `implement_feature`             | developer | Implement new features after design approval   |
 496  | `refactor_code`                 | developer | Refactor complex or problematic code           |
 497  | `apply_feedback`                | developer | Address feedback from other agents             |
 498  | `implementation_plan`           | developer | Create detailed implementation plan            |
 499  | **QA Tasks**                    |           |                                                |
 500  | `write_test`                    | qa        | Generate unit tests for code                   |
 501  | `verify_fix`                    | qa        | Verify bug fix works correctly                 |
 502  | `check_coverage`                | qa        | Check test coverage meets 80%+ requirement     |
 503  | `run_tests`                     | qa        | Run test suite for files                       |
 504  | **Security Tasks**              |           |                                                |
 505  | `audit_code`                    | security  | Security code review (SQL injection, XSS, etc) |
 506  | `scan_dependencies`             | security  | Check for vulnerable dependencies              |
 507  | `verify_compliance`             | security  | Validate TCPA/CAN-SPAM/GDPR compliance         |
 508  | `scan_secrets`                  | security  | Detect exposed credentials                     |
 509  | `threat_model`                  | security  | STRIDE threat modeling for component           |
 510  | `fix_security_issue`            | security  | Auto-fix security vulnerabilities              |
 511  | `review_dependency_update`      | security  | Review dependency updates for security         |
 512  | **Architect Tasks**             |           |                                                |
 513  | `design_proposal`               | architect | Create design proposal for features            |
 514  | `technical_review`              | architect | Review implementation plan for soundness       |
 515  | `review_design`                 | architect | Review design against principles               |
 516  | `suggest_refactor`              | architect | Suggest refactoring for complex code           |
 517  | `update_documentation`          | architect | Update documentation with Claude API           |
 518  | `check_documentation_freshness` | architect | Check for stale documentation                  |
 519  | `check_complexity`              | architect | Check code complexity metrics                  |
 520  | `audit_documentation`           | architect | Verify documentation matches reality           |
 521  | `check_branch_health`           | architect | Check for stale branches                       |
 522  | `profile_performance`           | architect | Profile pipeline performance                   |
 523  | `review_documentation`          | architect | Review documentation accuracy                  |
 524  | **Triage Tasks**                |           |                                                |
 525  | `classify_error`                | triage    | Classify error and route to agent              |
 526  | `route_task`                    | triage    | Route generic task to agent                    |
 527  | `prioritize_tasks`              | triage    | Prioritize pending tasks                       |
 528  | **Monitor Tasks**               |           |                                                |
 529  | `scan_logs`                     | monitor   | Scan logs for errors (self-scheduling)         |
 530  | `check_agent_health`            | monitor   | Monitor agent success rates                    |
 531  | `check_process_compliance`      | monitor   | Verify workflow adherence                      |
 532  | `detect_anomaly`                | monitor   | Detect anomalous behavior                      |
 533  | `check_pipeline_health`         | monitor   | Check pipeline for blockages                   |
 534  | `check_slo_compliance`          | monitor   | Check SLO compliance metrics                   |
 535  
 536  ### Auto-Delegation
 537  
 538  When an agent receives a task type it doesn't handle, it automatically delegates to the correct agent using `BaseAgent.delegateToCorrectAgent()`:
 539  
 540  **Example:** If `implement_feature` is mistakenly assigned to `monitor`:
 541  
 542  1. Monitor calls `delegateToCorrectAgent(task)`
 543  2. Creates new task assigned to `developer`
 544  3. Completes original task with delegation note
 545  4. Logs routing correction for analysis
 546  
 547  This prevents "Unknown task type" errors and ensures no tasks are lost due to misrouting.
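
To make the lookup and delegation steps concrete, here is a sketch using a subset of the table. The function names mirror those listed for `task-routing.js`, but the bodies and return shapes are guesses. `delegateSketch` is a hypothetical stand-in for `BaseAgent.delegateToCorrectAgent()` that returns records instead of writing to the database.

```javascript
// Subset of the task type reference; the full map lives in task-routing.js.
const TASK_ROUTING = Object.freeze({
  fix_bug: 'developer',
  implement_feature: 'developer',
  verify_fix: 'qa',
  audit_code: 'security',
  design_proposal: 'architect',
  classify_error: 'triage',
  scan_logs: 'monitor',
});

function getAgentForTaskType(taskType) {
  const known = Object.prototype.hasOwnProperty.call(TASK_ROUTING, taskType);
  return known ? TASK_ROUTING[taskType] : null;
}

function validateTaskAssignment(taskType, assignedTo) {
  const expected = getAgentForTaskType(taskType);
  return { valid: expected !== null && expected === assignedTo, expected };
}

// Returns null when no delegation is needed or the type is unknown.
function delegateSketch(task) {
  const { valid, expected } = validateTaskAssignment(task.task_type, task.assigned_to);
  if (valid || expected === null) return null;
  return {
    newTask: {
      task_type: task.task_type,
      assigned_to: expected,
      status: 'pending',
      priority: task.priority,
      parent_task_id: task.id,
    },
    original: {
      ...task,
      status: 'completed',
      result_json: JSON.stringify({ delegated_to: expected }),
    },
  };
}

console.log(delegateSketch({ id: 7, task_type: 'implement_feature', assigned_to: 'monitor', priority: 5 }));
```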
 548  
 549  ### Common Routing Errors Fixed
 550  
 551  **Before (Errors):**
 552  
 553  - `implement_feature` → monitor, triage, qa, security, architect ❌
 554  - `fix_bug` → architect ❌
 555  - `review_documentation` → unknown ❌
 556  - `review_dependency_update` → unknown ❌
 557  
 558  **After (Correct Routing):**
 559  
 560  - `implement_feature` → developer ✅
 561  - `fix_bug` → developer ✅
 562  - `review_documentation` → architect ✅
 563  - `review_dependency_update` → security ✅
 564  
 565  ### Testing
 566  
 567  Run task routing tests:
 568  
 569  ```bash
 570  node --test tests/agents/task-routing.test.js
 571  ```
 572  
 573  This validates all task types are correctly mapped and delegation works properly.
 574  
 575  ---
 576  
 577  ## Workflows
 578  
 579  ### Standard Workflow Types
 580  
 581  #### 1. Feature Implementation (Significant)
 582  
 583  Used for breaking changes, database migrations, or features >4 hours effort.
 584  
```
Product Request
   ↓
Architect: design_proposal
   ↓
Status: awaiting_po_approval
   ↓
PO Reviews → Approves/Rejects
   ↓ (approved)
Developer: implementation_plan
   ↓
Status: awaiting_architect_approval
   ↓
Architect: technical_review → Approves/Rejects
   ↓ (approved)
Developer: implement_feature
   ↓
QA: verify_fix
   ↓
Security: audit_code (if needed)
```
 606  
 607  **Example:**
 608  
 609  ```bash
 610  npm run agent:workflow -- --workflow feature --description "Add two-factor authentication" --requirements '["SMS OTP","Email backup codes","Recovery process"]'
 611  
 612  # View approval queue
 613  npm run agent:approvals -- --status awaiting_po_approval
 614  
 615  # Approve design
 616  npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision approved
 617  ```
 618  
 619  #### 2. Feature Implementation (Minor)
 620  
 621  Used for small features ≤4 hours, no breaking changes or migrations.
 622  
```
Product Request
   ↓
Architect: design_proposal (auto-approved)
   ↓
Developer: implementation_plan
   ↓
Architect: technical_review
   ↓
Developer: implement_feature
   ↓
QA: verify_fix
```
 636  
 637  **Example:**
 638  
 639  ```bash
 640  npm run agent:workflow -- --workflow feature --description "Add logging to enrich stage" --requirements '["Log contact count","Log errors"]'
 641  ```
 642  
 643  #### 3. Bug Fix (Architectural)
 644  
 645  For bugs affecting multiple modules or requiring schema changes.
 646  
```
Error Detected
   ↓
Triage: classify_error → architectural
   ↓
Architect: design_proposal
   ↓
Status: awaiting_po_approval
   ↓
PO Approves
   ↓
Developer: implementation_plan
   ↓
Architect: technical_review
   ↓
Developer: fix_bug
   ↓
QA: verify_fix
```
 666  
 667  #### 4. Bug Fix (Standard)
 668  
 669  For isolated bugs in a single file with low complexity.
 670  
```
Error Detected
   ↓
Triage: classify_error → simple
   ↓
Developer: fix_bug
   ↓
QA: verify_fix
```
 680  
 681  **Example:**
 682  
 683  ```bash
 684  npm run agent:workflow -- --workflow bug-fix --error "TypeError: Cannot read property 'score' of null" --file src/scoring.js --stack "..."
 685  ```
 686  
 687  #### 5. Refactor Workflow
 688  
 689  For code complexity reduction or architectural improvements.
 690  
```
Complexity Detected
   ↓
Architect: design_proposal
   ↓
Developer: implementation_plan
   ↓
Architect: technical_review
   ↓
Developer: implement refactoring
   ↓
QA: verify_fix (ensure no regressions)
```
 704  
 705  **Example:**
 706  
 707  ```bash
 708  npm run agent:workflow -- --workflow refactor --file src/complex-module.js --reason "Cyclomatic complexity exceeds 15"
 709  ```
 710  
 711  ### Approval System
 712  
 713  #### Product Owner Approval
 714  
 715  **Required for:**
 716  
 717  - Breaking changes
 718  - Database migrations
 719  - Features with >4 hours estimated effort
 720  - Changes explicitly marked "significant"
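
These criteria reduce to a simple predicate. The sketch below assumes a change descriptor with hypothetical field names (`breaking`, `hasMigration`, `estimatedHours`, `markedSignificant`); how the system actually records these attributes is not specified here.

```javascript
const SIGNIFICANT_EFFORT_HOURS = 4;

// Returns true when a change must wait for Product Owner sign-off.
function requiresPoApproval({
  breaking = false,
  hasMigration = false,
  estimatedHours = 0,
  markedSignificant = false,
} = {}) {
  return (
    breaking ||
    hasMigration ||
    estimatedHours > SIGNIFICANT_EFFORT_HOURS ||
    markedSignificant
  );
}

console.log(requiresPoApproval({ estimatedHours: 6 })); // true
console.log(requiresPoApproval({ estimatedHours: 4 })); // false
```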
 721  
 722  **Process:**
 723  
 724  1. Architect creates design_proposal task
 725  2. Task status → `awaiting_po_approval`
 726  3. Task appears in human_review_queue
 727  4. PO reviews via CLI: `npm run agent:approvals`
 728  5. PO approves/rejects via: `npm run agent:approve`
 729  
 730  **Approval Schema:**
 731  
 732  ```json
 733  {
 734    "decision": "approved | approved_with_conditions | rejected",
 735    "reviewer": "Jason",
 736    "timestamp": "2026-02-15T10:30:00Z",
 737    "notes": "Looks good, keep scope tight",
 738    "conditions": ["Max 2 files", "No new dependencies"]
 739  }
 740  ```
 741  
 742  #### Architect Approval
 743  
 744  **Required for:**
 745  
 746  - All implementation plans
 747  - Refactorings
 748  - Performance optimizations
 749  
 750  **Review Criteria:**
 751  
 752  - Files won't exceed 150 lines
 753  - Test coverage ≥85%
 754  - Documentation updated
 755  - No circular dependencies
 756  - Follows architectural patterns
 757  
 758  **Process:**
 759  
 760  1. Developer creates implementation_plan
 761  2. Task status → `awaiting_architect_approval`
 762  3. Architect agent reviews plan
 763  4. Creates technical_review task
 764  5. Approves → status back to `pending`, Developer proceeds
 765  6. Rejects → feedback to Developer, plan revised
 766  
 767  ---
 768  
 769  ## Horizontal Scaling
 770  
 771  The agent system supports horizontal scaling through row-level task locking, allowing multiple instances of the same agent to run concurrently without conflicts.
 772  
 773  ### How It Works
 774  
 775  **Row-Level Locking:**
 776  
 777  - Each agent instance atomically claims individual tasks from the database
 778  - SQLite transactions ensure only one instance can claim any given task
 779  - Multiple instances safely process different tasks simultaneously
 780  - No duplicate processing, even with 5+ concurrent instances
 781  
 782  **Configuration:**
 783  
 784  ```env
 785  # Enable row-level locking (default: true)
 786  AGENT_ENABLE_ROW_LOCKING=true
 787  
 788  # Allow concurrent instances of same agent (default: false)
 789  AGENT_ALLOW_CONCURRENT_INSTANCES=true
 790  ```
 791  
 792  ### Running Multiple Instances
 793  
 794  **Example: 3 Developer Agents:**
 795  
 796  ```bash
 797  # Terminal 1
 798  npm run agent:run:single developer &
 799  
 800  # Terminal 2
 801  npm run agent:run:single developer &
 802  
 803  # Terminal 3
 804  npm run agent:run:single developer &
 805  ```
 806  
 807  All three instances will process different tasks concurrently. Work distribution is automatic and race-condition safe.
 808  
 809  ### Performance Benefits
 810  
 811  **Task Throughput:**
 812  
- 1 developer agent: ~5-10 tasks/hour (depending on complexity)
- 3 developer agents: ~15-30 tasks/hour (up to 3x throughput)
- 5 developer agents: ~25-50 tasks/hour (up to 5x throughput; expect diminishing returns as SQLite write contention grows)
 816  
 817  **When to Scale:**
 818  
 819  - High task queue depth (>20 pending tasks)
 820  - Long-running tasks (>5 minutes each)
 821  - Time-sensitive workflows (critical bug fixes)
 822  
 823  ### Safety Mechanisms
 824  
 825  **Agent-Level Locking (Optional):**
 826  
```env
# Set to true to turn off agent-level locking and allow horizontal scaling
AGENT_ALLOW_CONCURRENT_INSTANCES=true
```

When `AGENT_ALLOW_CONCURRENT_INSTANCES` is `false` (the default), agent-level locking stays in effect and only one instance of each agent runs at a time (backwards compatible).
 833  
 834  **Task States:**
 835  
 836  - `pending` → Available for claiming
 837  - `running` → Claimed by an instance (atomic transition)
 838  - `completed` → Finished successfully
 839  - `failed` → Error after max retries
 840  
 841  **Edge Cases Handled:**
 842  
 843  - Race conditions: Transaction ensures atomic claiming
 844  - Crashed instances: Stale lock cleanup after 2 minutes
 845  - Duplicate processing: Prevented by atomic UPDATE WHERE status='pending'
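
The claiming pattern can be illustrated with an in-memory stand-in for the table. This sketch is not the project's code: `claimNextTask` is a hypothetical name, and treating a higher `priority` number as more urgent is an assumption. In the database, the status check and the update would be a single statement such as `UPDATE agent_tasks SET status = 'running' WHERE id = ? AND status = 'pending'`, with a claim counting only if one row changed.

```javascript
// tasks: array of agent_tasks-like rows, mutated in place.
function claimNextTask(tasks, agentName) {
  const candidate = tasks
    .filter((t) => t.assigned_to === agentName && t.status === 'pending')
    // Assumption: higher priority first, then oldest id first.
    .sort((a, b) => b.priority - a.priority || a.id - b.id)[0];
  if (!candidate) return null;

  // Compare-and-set: claim only if the row is still pending. JavaScript is
  // single-threaded, so this always holds here; against a real database the
  // WHERE status = 'pending' clause performs this check atomically.
  if (candidate.status !== 'pending') return null;
  candidate.status = 'running';
  return candidate;
}

const queue = [
  { id: 1, assigned_to: 'developer', status: 'pending', priority: 5 },
  { id: 2, assigned_to: 'developer', status: 'pending', priority: 9 },
  { id: 3, assigned_to: 'qa', status: 'pending', priority: 9 },
];
console.log(claimNextTask(queue, 'developer').id); // 2
console.log(claimNextTask(queue, 'developer').id); // 1
console.log(claimNextTask(queue, 'developer')); // null
```

Two instances calling this against the same queue never receive the same task, because a claimed row no longer matches the `pending` filter.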
 846  
 847  ### Monitoring Concurrent Agents
 848  
 849  **View running instances:**
 850  
 851  ```bash
 852  # Check agent states
 853  npm run agent:list
 854  
 855  # Monitor task processing
 856  watch -n 5 'npm run agent:tasks'
 857  ```
 858  
 859  **Database query:**
 860  
 861  ```sql
 862  SELECT
 863    agent_name,
 864    COUNT(*) as processing_count,
 865    GROUP_CONCAT(id) as task_ids
 866  FROM agent_tasks
 867  WHERE status = 'running'
 868  GROUP BY agent_name;
 869  ```
 870  
 871  ### Limitations
 872  
 873  **SQLite Concurrency:**
 874  
- WAL mode recommended for high concurrency
- SQLite allows only one writer at a time, even in WAL mode, so writes from concurrent instances queue; ~10 instances is the safe limit
- Consider PostgreSQL for >10 instances
 878  
 879  **Cost Considerations:**
 880  
 881  - Each instance makes LLM API calls
 882  - Budget enforcement: `AGENT_DAILY_BUDGET=10` (USD)
 883  - Emergency shutdown if >$5/hour spend rate
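
The two spending limits could be evaluated as in the sketch below. `budgetStatus`, the `{ costUsd, timestamp }` record shape, and the returned status strings are hypothetical. Treating "daily" as a rolling 24-hour window is also an assumption, since the reset rule is not stated.

```javascript
const DAILY_BUDGET_USD = 10; // mirrors AGENT_DAILY_BUDGET=10
const MAX_HOURLY_SPEND_USD = 5; // emergency shutdown threshold
const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// calls: [{ costUsd: number, timestamp: number (ms since epoch) }]
function budgetStatus(calls, now = Date.now()) {
  let lastHour = 0;
  let lastDay = 0;
  for (const { costUsd, timestamp } of calls) {
    const age = now - timestamp;
    if (age < 0 || age >= DAY_MS) continue; // outside the rolling day
    lastDay += costUsd;
    if (age < HOUR_MS) lastHour += costUsd;
  }
  if (lastHour > MAX_HOURLY_SPEND_USD) return 'emergency_shutdown';
  if (lastDay >= DAILY_BUDGET_USD) return 'budget_exhausted';
  return 'ok';
}
```

With several instances running, every instance would need to read from the same spend log for the limits to hold across the whole system.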
 884  
 885  ### Best Practices
 886  
 887  1. **Start with 2-3 instances** - Verify row-level locking works correctly
 888  2. **Monitor task completion** - Ensure no duplicate processing
 889  3. **Check database locks** - Avoid SQLite contention
 890  4. **Scale gradually** - Add instances as queue depth increases
 891  5. **Use priority wisely** - High-priority tasks processed first
 892  
 893  ### Testing Concurrent Locking
 894  
 895  ```bash
 896  # Run concurrent locking tests
 897  npm test tests/agents/concurrent-locking.test.js
 898  ```
 899  
 900  Tests verify:
 901  
 902  - No duplicate task processing
 903  - Correct priority ordering
 904  - Work distribution across instances
 905  - Agent isolation (developer vs QA)
 906  - Backwards compatibility with single instance
 907  
 908  ---
 909  
 910  ## CLI Commands
 911  
 912  ### View Agent Status
 913  
 914  ```bash
 915  # List all agents with current status
 916  npm run agent:list
 917  
 918  # Output:
 919  # Agent: monitor, Status: idle, Last run: 2026-02-15 10:25:00
 920  # Agent: developer, Status: running, Current task: 42
 921  # Circuit breaker: All agents operational
 922  ```
 923  
 924  ### Manage Tasks
 925  
 926  ```bash
 927  # View all pending tasks
 928  npm run agent:tasks
 929  
 930  # View tasks for specific agent
 931  npm run agent:tasks -- --assigned-to developer
 932  
 933  # View tasks by status
 934  npm run agent:tasks -- --status pending
 935  npm run agent:tasks -- --status awaiting_po_approval
 936  
 937  # View specific task details
 938  npm run agent:tasks -- --task-id 42
 939  ```

### Create Tasks Manually

```bash
# Create task for developer
npm run agent:create -- --agent developer --task fix_bug --context '{"error":"...","file":"src/scoring.js"}' --priority 7

# Create task for QA
npm run agent:create -- --agent qa --task write_test --context '{"module":"scoring","function":"calculateScore"}' --priority 5
```

### Trigger Workflows

```bash
# Bug fix workflow
npm run agent:workflow -- --workflow bug-fix --error "TypeError: Cannot read property 'score' of null" --stage scoring

# Feature workflow
npm run agent:workflow -- --workflow feature --description "Add export to CSV" --requirements '["Export button","CSV format","Download trigger"]'

# Refactor workflow
npm run agent:workflow -- --workflow refactor --file src/complex-module.js --reason "Cyclomatic complexity exceeds 15"
```

### Manage Approvals

```bash
# View all pending approvals
npm run agent:approvals

# Filter by approval type
npm run agent:approvals -- --status awaiting_po_approval
npm run agent:approvals -- --status awaiting_architect_approval

# Approve task
npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision approved

# Approve with conditions
npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision approved_with_conditions --notes "Keep it simple" --conditions "Max 2 files,No new dependencies"

# Reject task
npm run agent:approve -- --task-id 42 --reviewer "Jason" --decision rejected --notes "Scope too large, break into smaller pieces"
```

### View Workflow Status

```bash
# View workflow tree (parent/child tasks)
npm run agent:workflow:status -- --workflow-id 42

# Output shows task hierarchy and status
```

### View Logs

```bash
# View all agent logs
npm run agent:logs

# Filter by agent
npm run agent:logs -- --agent-name developer

# Filter by task
npm run agent:logs -- --task-id 42

# Filter by level
npm run agent:logs -- --level error
npm run agent:logs -- --agent-name developer --level error
```

### View Statistics

```bash
# View success rates and metrics
npm run agent:stats

# Output:
# Agent: developer, Tasks: 45, Success: 42, Failure: 3, Rate: 93%
# Agent: qa, Tasks: 38, Success: 38, Failure: 0, Rate: 100%
# Circuit breaker: All agents operational
```

### Run Agents Manually

```bash
# Run all agents once
npm run agent:run

# Run with verbose logging
npm run agent:run -- --verbose

# Process up to N tasks
npm run agent:run -- --tasks=10

# Run single agent
npm run agent:run:single
```

---

## Configuration

### Environment Variables

```bash
# Enable/disable agent system
AGENT_SYSTEM_ENABLED=true

# Circuit breaker threshold (30% failure rate triggers disable)
AGENT_CIRCUIT_BREAKER_THRESHOLD=0.3

# Rate limit (max invocations per hour)
AGENT_MAX_INVOCATIONS_PER_HOUR=60

# Immediate invocation (default: true)
# Event-driven agent invocation eliminates 5-minute cron delays
# Agents invoke each other immediately after handoffs and task creation
# Speeds up workflows 10-15x (from 15-20 min to < 2 min)
# See docs/IMMEDIATE-INVOCATION.md for details
AGENT_IMMEDIATE_INVOCATION=true

# Max chain depth (default: 10)
# Prevents infinite loops by limiting consecutive immediate invocations
# After reaching depth, agents fall back to cron polling
AGENT_MAX_CHAIN_DEPTH=10

# Database path (for testing)
DATABASE_PATH=./db/sites.db
```

### Quality Gates

**Developer Agent:**

- **Coverage gate:** 85%+ required BEFORE commits (HARD BLOCK)
- Automatic test generation attempted if coverage <85%
- Escalates to Architect if auto-fix fails

**QA Agent:**

- **Coverage gate:** 80%+ required to approve tasks (HARD BLOCK)
- Creates `write_missing_tests` task if coverage <80%
- Blocks parent task until coverage improves

**Other Gates:**

- **Retry limit:** 3 retries per task before marking as failed
- **Task TTL:** Tasks pending >1 hour escalate to human review
- **Circuit breaker:** >30% failure rate disables agent
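
As a rough illustration of the two coverage gates, the sketch below compares a measured coverage percentage against the per-role minimum. The `COVERAGE_GATES` map and the `checkCoverageGate` helper are hypothetical names, not the agents' actual code; only the 85% and 80% thresholds come from this guide.

```javascript
// Minimum coverage per role, taken from the gates listed above.
const COVERAGE_GATES = { developer: 85, qa: 80 };

// Returns { passed, required, shortfall } for a role and a measured coverage %.
function checkCoverageGate(role, coveragePercent) {
  const required = COVERAGE_GATES[role];
  if (required === undefined) {
    throw new Error(`No coverage gate defined for role: ${role}`);
  }
  const passed = coveragePercent >= required;
  return { passed, required, shortfall: passed ? 0 : required - coveragePercent };
}

console.log(checkCoverageGate('developer', 78)); // { passed: false, required: 85, shortfall: 7 }
console.log(checkCoverageGate('qa', 82)); // { passed: true, required: 80, shortfall: 0 }
```

A failed developer gate would block the commit, while a failed QA gate would lead to a `write_missing_tests` task as described above.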

### Scheduling

**Immediate Invocation** (Event-Driven):

Agents are invoked immediately when:

- Another agent hands off a task (`handoff()`)
- A new task is created (`createTask()`)

This eliminates 5-minute cron delays, speeding up workflows **10-15x** (from 15-20 minutes to < 2 minutes).

See [IMMEDIATE-INVOCATION.md](IMMEDIATE-INVOCATION.md) for details.

**Cron Fallback** (Scheduled Polling):

Agents also run via cron job every 5 minutes as a safety net:

```sql
-- cron_jobs table entry
INSERT INTO cron_jobs (name, schedule, handler, enabled)
VALUES ('agent-runner', '*/5 * * * *', 'node src/agents/runner.js', 1);
```

Cron picks up tasks that were missed by immediate invocation (e.g., due to errors or depth limits).
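
The choice between immediate invocation and the cron fallback can be sketched as a small decision function. This is a minimal sketch under assumptions: `nextInvocationMode` and the `chainDepth` counter (starting at 0) are hypothetical, and only the two environment variable names and their defaults come from this guide.

```javascript
// Decide how the next agent should be started after a handoff or task creation.
// `env` mirrors the variables documented above; `chainDepth` is a hypothetical
// count of consecutive immediate invocations in the current chain.
function nextInvocationMode(env, chainDepth) {
  const immediate = (env.AGENT_IMMEDIATE_INVOCATION ?? 'true') === 'true';
  const maxDepth = Number.parseInt(env.AGENT_MAX_CHAIN_DEPTH ?? '10', 10);
  if (!immediate) return 'cron';
  // Once the chain reaches the depth limit, leave the task for cron polling.
  return chainDepth < maxDepth ? 'immediate' : 'cron';
}

console.log(nextInvocationMode({}, 0)); // immediate
console.log(nextInvocationMode({}, 10)); // cron
console.log(nextInvocationMode({ AGENT_IMMEDIATE_INVOCATION: 'false' }, 0)); // cron
```

Leaving deep chains to cron keeps a handoff loop between two agents from running unbounded.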

**Manual control:**

- Start: Set `enabled = 1` in cron_jobs
- Stop: Set `enabled = 0` in cron_jobs
- One-time run: `npm run agent:run`

---

## Safety Features

### Circuit Breaker

**Purpose:** Prevent runaway agent failures from consuming resources

**How it works:**

1. Monitors agent success/failure ratios
2. If failure rate >30% (and ≥10 tasks completed): Trigger circuit breaker
3. Agent status → `blocked`
4. Timestamp recorded in `agent_state.metrics_json`
5. Manual reset required
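
The trigger condition in step 2 can be written out as a short check. This is a sketch, not the project's implementation: `shouldTripCircuitBreaker` is a hypothetical helper, and it assumes "tasks completed" means successes plus failures.

```javascript
// True when the failure rate exceeds the threshold and enough tasks have
// completed for the rate to be meaningful.
function shouldTripCircuitBreaker(successCount, failureCount, threshold = 0.3, minTasks = 10) {
  const completed = successCount + failureCount;
  if (completed < minTasks) return false;
  return failureCount / completed > threshold;
}

console.log(shouldTripCircuitBreaker(42, 3)); // false (3 of 45 failed, about 6.7%)
console.log(shouldTripCircuitBreaker(6, 4)); // true (4 of 10 failed, 40%)
console.log(shouldTripCircuitBreaker(2, 3)); // false (only 5 tasks completed)
```

The minimum-task guard keeps a single early failure from blocking an agent that has barely run.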

**When triggered:**

- Agent logged to `human_review_queue`
- All tasks for that agent paused
- Root cause investigation required

**Reset:**

```sql
UPDATE agent_state
SET status = 'idle',
    metrics_json = json_remove(metrics_json, '$.circuit_breaker_triggered_at')
WHERE agent_name = 'developer';
```

### Escalation to Human Review

Tasks automatically escalate to `human_review_queue` for:

- Database schema changes
- Breaking API changes
- Security-sensitive changes (auth, secrets, compliance)
- Circuit breaker triggers
- Stale tasks (pending >1 hour)
- Failed tasks after 3 retries
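
The two time- and retry-based criteria in this list lend themselves to a small check. The sketch below is illustrative only: `escalationReason`, the `created_at_ms` field, and the returned reason strings are hypothetical, while `status` and `retry_count` follow the column names used elsewhere in this guide.

```javascript
const MAX_RETRIES = 3;
const STALE_AFTER_MS = 60 * 60 * 1000; // pending for more than 1 hour

// Returns why a task should go to human review, or null if it should not.
function escalationReason(task, nowMs) {
  if (task.status === 'failed' && task.retry_count >= MAX_RETRIES) {
    return 'retries_exhausted';
  }
  if (task.status === 'pending' && nowMs - task.created_at_ms > STALE_AFTER_MS) {
    return 'stale';
  }
  return null;
}

const now = 10 * 60 * 60 * 1000; // arbitrary reference time in ms
console.log(escalationReason({ status: 'failed', retry_count: 3, created_at_ms: now }, now)); // retries_exhausted
console.log(escalationReason({ status: 'pending', retry_count: 0, created_at_ms: now - 2 * 60 * 60 * 1000 }, now)); // stale
console.log(escalationReason({ status: 'pending', retry_count: 0, created_at_ms: now - 5 * 60 * 1000 }, now)); // null
```

Change-type criteria such as schema or API changes need information about the change itself and are not covered by this sketch.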

**Review queue:**

```bash
# View human review items
npm run agent:approvals

# Approve/reject from queue
npm run agent:approve -- --task-id <id> --reviewer "Name" --decision approved|rejected
```

### Audit Trail

Complete tracking of all agent actions:

**agent_logs table:**

- Every task execution logged with level (info, warning, error)
- Metadata includes context, decisions, file paths

**agent_messages table:**

- All inter-agent communication recorded
- Message types: handoff, question, answer, notification

**agent_tasks table:**

- Task status changes tracked
- Retry attempts logged
- Result stored in result_json

**agent_state table:**

- Agent status changes
- Metrics tracked (success/failure rates)
- Last run timestamps

### Rollback Protection

**Before making changes:**

1. Developer agent checks coverage
2. Architect reviews implementation plan
3. QA verifies changes don't break tests

**If something breaks:**

1. Monitor detects errors in logs
2. Triage classifies and routes
3. Developer creates fix
4. Workflow repeats with proper gates

**Manual rollback:**

```bash
# View recent changes
git log --oneline -5

# Rollback if needed
git revert <commit-hash>

# Trigger QA verification
npm run agent:create -- --agent qa --task verify_fix --context '{"commit":"..."}' --priority 10
```

---

## Cost Management

### Token Usage Reduction

- **Monolithic approach:** 100-150KB per invocation (full CLAUDE.md)
- **Multi-agent approach:** 20-25KB per invocation (base + role context)
- **Reduction:** 75-85%

### Breakdown by Agent

| Agent     | Context Size | Tokens/Invocation | Reduction |
| --------- | ------------ | ----------------- | --------- |
| Monitor   | 20KB         | ~5,000            | 80%       |
| Triage    | 23.5KB       | ~6,000            | 76%       |
| Developer | 21.3KB       | ~5,300            | 79%       |
| QA        | 23KB         | ~5,800            | 77%       |
| Security  | 21KB         | ~5,200            | 79%       |
| Architect | 25KB         | ~6,200            | 75%       |

### Rate Limiting

**Environment variable:**

```bash
AGENT_MAX_INVOCATIONS_PER_HOUR=60
```

**How it works:**

- Tracks invocations per hour in `agent_state.metrics_json`
- If limit exceeded: Agent status → `blocked`
- Resets every hour
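
A fixed hourly window is one simple way to get this behavior. The sketch below assumes that design; the guide does not say whether the real counter uses a fixed or rolling window, and `recordInvocation` plus the `windowStart` and `invocations` fields are hypothetical stand-ins for whatever is stored in `metrics_json`.

```javascript
const HOUR_MS = 3_600_000;

// Count one invocation against the current hourly window.
// `metrics` is a plain object standing in for an agent's stored metrics.
function recordInvocation(metrics, nowMs, maxPerHour = 60) {
  const hourStart = Math.floor(nowMs / HOUR_MS) * HOUR_MS;
  if (metrics.windowStart !== hourStart) {
    // A new hour began: reset the counter.
    metrics.windowStart = hourStart;
    metrics.invocations = 0;
  }
  if (metrics.invocations >= maxPerHour) {
    return { allowed: false, remaining: 0 };
  }
  metrics.invocations += 1;
  return { allowed: true, remaining: maxPerHour - metrics.invocations };
}

const metrics = {};
console.log(recordInvocation(metrics, 0, 2).allowed); // true
console.log(recordInvocation(metrics, 1000, 2).allowed); // true
console.log(recordInvocation(metrics, 2000, 2).allowed); // false
console.log(recordInvocation(metrics, HOUR_MS, 2).allowed); // true
```

When `allowed` is false, the caller would set the agent status to `blocked` until the next window starts.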

**Monitoring:**

```bash
# Check invocation counts
npm run agent:stats

# View recent logs
npm run agent:logs -- --agent-name developer
```

### Budget Controls

**Prevent cost overruns:**

1. **Set rate limits:** `AGENT_MAX_INVOCATIONS_PER_HOUR=60`
2. **Monitor stats:** `npm run agent:stats` daily
3. **Review logs:** Check for unnecessary task creation
4. **Optimize context:** Keep context files lean and focused
5. **Use Haiku for simple tasks:** `AGENT_USE_HAIKU_FOR_SIMPLE_TASKS=true` (50-70% cost reduction)

### Smart Model Selection (Haiku vs Sonnet)

**Cost optimization:** The agent system automatically selects the appropriate model based on task complexity.

**Cost comparison:**

| Model             | Input             | Output            | Use Case                              |
| ----------------- | ----------------- | ----------------- | ------------------------------------- |
| Claude 3.5 Haiku  | $0.80/M           | $4.00/M           | Simple pattern-based tasks            |
| Claude 3.5 Sonnet | $3.00/M           | $15.00/M          | Complex reasoning & code generation   |
| **Cost Savings**  | **3.75x cheaper** | **3.75x cheaper** | **50-70% reduction for simple tasks** |

**Haiku tasks (simple/pattern-based):**

- **Triage:** Error classification via pattern matching
- **Monitor:** Log scanning and anomaly detection
- **Security:** Regex-based security checks (SQL injection, secrets, command injection patterns)
- **QA:** Test file discovery and simple test generation

**Sonnet tasks (complex reasoning):**

- **Developer:** Bug fixing and code generation
- **Architect:** Design reviews and architectural decisions
- **Security:** Advanced threat modeling (STRIDE analysis)
- **QA:** Coverage analysis and complex integration tests
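
The split above could be expressed as a lookup with an explicit override. This is a sketch, not the system's selection logic: `selectModel` and the task-kind labels in `SIMPLE_TASKS` are hypothetical, and only the two model IDs are taken from this guide.

```javascript
const HAIKU = 'claude-3-5-haiku-20241022';
const SONNET = 'claude-3-5-sonnet-20241022';

// Hypothetical task-kind labels for the simple, pattern-based work listed above.
const SIMPLE_TASKS = new Set([
  'triage:classify_error',
  'monitor:scan_logs',
  'security:pattern_check',
  'qa:discover_tests',
]);

// Pick a model for a task kind. An explicit `model` always wins;
// otherwise simple tasks use Haiku when the optimization is enabled.
function selectModel(taskKind, { useHaiku = true, model } = {}) {
  if (model) return model;
  return useHaiku && SIMPLE_TASKS.has(taskKind) ? HAIKU : SONNET;
}

console.log(selectModel('triage:classify_error')); // claude-3-5-haiku-20241022
console.log(selectModel('developer:fix_bug')); // claude-3-5-sonnet-20241022
console.log(selectModel('triage:classify_error', { useHaiku: false })); // claude-3-5-sonnet-20241022
```

Defaulting unknown task kinds to Sonnet trades some cost for safety, since a misrouted complex task is more expensive to redo than to run on the larger model.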

**Configuration:**

```bash
# Enable Haiku optimization (default: true for 50-70% cost reduction)
AGENT_USE_HAIKU_FOR_SIMPLE_TASKS=true
```

**Override model selection:**

```javascript
// Force Haiku
const haikuResult = await classifyIssue(agentName, taskId, errorMessage, {
  model: 'claude-3-5-haiku-20241022',
});

// Force Sonnet
const sonnetResult = await analyzeCode(agentName, taskId, filePath, prompt, {
  model: 'claude-3-5-sonnet-20241022',
  complexity: 'complex',
});
```

**Track cost savings:**

```bash
npm run agent:stats
```

Output includes model breakdown:

```json
{
  "modelBreakdown": {
    "haiku": {
      "calls": 150,
      "cost": 0.45,
      "avgCost": 0.003
    },
    "sonnet": {
      "calls": 50,
      "cost": 1.2,
      "avgCost": 0.024
    },
    "savings": "27.3"
  }
}
```

In this example, `savings` is the percentage of total cost that came from Haiku calls (0.45 / 1.65 ≈ 27.3%).

**Expected savings:**

- Monitor/Triage agents: 60-70% cost reduction (mostly Haiku)
- Developer/Architect: 10-20% cost reduction (mostly Sonnet)
- Security: 30-40% cost reduction (mix of simple checks and complex modeling)
- QA: 40-50% cost reduction (test generation uses Haiku, coverage analysis uses Sonnet)

**Use circuit breakers:** Prevent runaway failures from consuming budget.

**Cost estimation:**

- Average task: ~6,000 input tokens + ~2,000 output tokens = ~8,000 tokens
- At 60 invocations/hour: ~360,000 input + ~120,000 output tokens/hour (~480,000 total)
- At Sonnet pricing ($3/M input, $15/M output): ~$1.08 + ~$1.80 = ~$2.88/hour
- Daily cost (24 hours): ~$69
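
The estimate can be reproduced with a few lines of arithmetic. The sketch below prices input and output tokens at their separate rates from the cost comparison table, so output tokens count at $15/M rather than $3/M for Sonnet. `PRICES` and `hourlyCost` are illustrative names, and the token counts are the averages assumed in this guide.

```javascript
// Prices in USD per million tokens, from the cost comparison table.
const PRICES = {
  haiku: { input: 0.8, output: 4 },
  sonnet: { input: 3, output: 15 },
};

// Hourly cost in USD for a given model and workload.
function hourlyCost(model, { inputTokens, outputTokens, invocationsPerHour }) {
  const price = PRICES[model];
  const perCall = (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  return perCall * invocationsPerHour;
}

const load = { inputTokens: 6000, outputTokens: 2000, invocationsPerHour: 60 };
console.log(hourlyCost('sonnet', load).toFixed(2)); // 2.88
console.log((hourlyCost('sonnet', load) * 24).toFixed(2)); // 69.12
console.log(hourlyCost('haiku', load).toFixed(2)); // 0.77
```

Output tokens dominate the Sonnet figure even though they are only a quarter of the volume, which is why trimming response length can matter as much as trimming context.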

**Cost optimization tips:**

- Reduce task creation frequency if logs are clean
- Increase Monitor scan interval from 5 to 10 minutes
- Disable agents not currently needed
- Use smaller models for simple tasks (Haiku for classification)

---

## Troubleshooting

### Agent Not Processing Tasks

**Symptoms:**

- Tasks stuck in `pending` status
- Agent status shows `blocked`
- No recent logs for agent

**Diagnosis:**

```bash
# Check agent status
npm run agent:list

# Check circuit breaker
npm run agent:stats

# View error logs
npm run agent:logs -- --agent-name developer --level error
```

**Solutions:**

1. **Circuit breaker triggered:**

```sql
-- Check metrics
SELECT metrics_json FROM agent_state WHERE agent_name = 'developer';

-- Reset if safe
UPDATE agent_state
SET status = 'idle',
    metrics_json = json_remove(metrics_json, '$.circuit_breaker_triggered_at')
WHERE agent_name = 'developer';
```

2. **Rate limit exceeded:**

```bash
# Wait for hourly reset, or increase limit
AGENT_MAX_INVOCATIONS_PER_HOUR=120
```

3. **Agent disabled:**

```sql
-- Re-enable agent
UPDATE agent_state SET status = 'idle' WHERE agent_name = 'developer';
```

### Tasks Stuck in Pending

**Symptoms:**

- Tasks created but never start
- Task age >1 hour

**Diagnosis:**

```bash
# View pending tasks
npm run agent:tasks -- --status pending

# Check if agents are running
npm run agent:list

# Check task dependencies
npm run agent:workflow:status -- --workflow-id 42
```

**Solutions:**

1. **Parent task incomplete:**
   - Tasks with `parent_task_id` won't start until parent completes
   - Check parent status: `npm run agent:tasks -- --task-id <parent_id>`
   - Complete or cancel parent task

2. **Agent not running:**
   - Check cron job enabled: `SELECT * FROM cron_jobs WHERE name = 'agent-runner';`
   - Enable: `UPDATE cron_jobs SET enabled = 1 WHERE name = 'agent-runner';`
   - Manual run: `npm run agent:run`

3. **Task priority too low:**
   - Increase priority: `UPDATE agent_tasks SET priority = 10 WHERE id = 42;`

### High Token Costs

**Symptoms:**

- Higher than expected API bills
- Many agent invocations

**Diagnosis:**

```sql
-- Check invocation counts
SELECT agent_name, COUNT(*) as invocations
FROM agent_logs
WHERE created_at > datetime('now', '-1 hour')
GROUP BY agent_name;

-- Check task creation rate
SELECT task_type, COUNT(*) as count
FROM agent_tasks
WHERE created_at > datetime('now', '-24 hours')
GROUP BY task_type;
```

**Solutions:**

1. **Reduce invocation frequency:**

```bash
# Lower rate limit
AGENT_MAX_INVOCATIONS_PER_HOUR=30

# Increase Monitor scan interval
# Edit cron_jobs: '*/5 * * * *' → '*/10 * * * *'
```

2. **Optimize context:**
   - Review context files for unnecessary content
   - Remove duplicate information
   - Keep context files lean

3. **Disable unnecessary agents:**

```sql
-- Temporarily disable Security agent
UPDATE agent_state SET status = 'disabled' WHERE agent_name = 'security';
```

### Circuit Breaker Triggered

**Symptoms:**

- Agent status = `blocked`
- `circuit_breaker_triggered_at` in metrics_json

**Diagnosis:**

```bash
# View error logs
npm run agent:logs -- --agent-name developer --level error

# Check failure rate
npm run agent:stats
```

**Solutions:**

1. **Identify root cause:**
   - Review error logs for patterns
   - Check recent code changes
   - Verify external dependencies (DB, APIs)

2. **Fix underlying issue:**
   - If code bug: Fix and test
   - If external issue: Wait for resolution
   - If config issue: Update configuration

3. **Reset circuit breaker:**

```sql
-- Only after fixing root cause!
UPDATE agent_state
SET status = 'idle',
    metrics_json = json_remove(metrics_json, '$.circuit_breaker_triggered_at')
WHERE agent_name = 'developer';
```

### Tasks Failing Repeatedly

**Symptoms:**

- Task retry_count = 3
- Task status = `failed`
- Same error in multiple tasks

**Diagnosis:**

```bash
# View failed tasks (query run through the sqlite3 CLI against DATABASE_PATH)
sqlite3 ./db/sites.db "SELECT * FROM agent_tasks WHERE status = 'failed' ORDER BY created_at DESC LIMIT 10;"

# Check error patterns
npm run agent:logs -- --level error
```

**Solutions:**

1. **Code issue:**
   - Manually fix the bug
   - Reset task: `UPDATE agent_tasks SET retry_count = 0, status = 'pending' WHERE id = 42;`

2. **Missing dependencies:**
   - Install required packages: `npm install`
   - Update environment variables

3. **Task too complex:**
   - Break into smaller subtasks
   - Provide more context in context_json

---

## Best Practices

### When to Use Agents

**✅ Use agents for:**

- Automated bug fixes from error logs
- Test generation for new features
- Security audits on commits
- Documentation freshness checks
- Refactoring suggestions
- Routine maintenance tasks

**❌ Don't use agents for:**

- Quick one-off tasks (just do it manually)
- Tasks requiring complex user input
- Real-time user interactions
- Tasks with high uncertainty (needs human judgment)
- Exploratory work without clear goals

### Task Design

**Be specific:**

```json
{
  "error": "TypeError: Cannot read property 'score' of null",
  "file": "src/scoring.js",
  "line": 42,
  "stack": "..."
}
```

**Include context:**

```json
{
  "files_changed": ["src/scoring.js", "src/utils/error-handler.js"],
  "related_issues": ["Issue #123"],
  "previous_attempts": ["Tried null check, still failing"]
}
```

**Set appropriate priority:**

- 10: Critical (system down, security breach)
- 7-9: High (blocking issue, major bug)
- 4-6: Medium (normal bugs, features)
- 1-3: Low (nice-to-haves, refactoring)
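
A helper like the one below can keep priorities within these bands when tasks are created from scripts. `priorityBand` is a hypothetical name; the band boundaries are the ones listed above.

```javascript
// Map a numeric priority (1-10) to the band names used in this guide.
function priorityBand(priority) {
  if (!Number.isInteger(priority) || priority < 1 || priority > 10) {
    throw new RangeError(`priority must be an integer from 1 to 10, got ${priority}`);
  }
  if (priority === 10) return 'critical';
  if (priority >= 7) return 'high';
  if (priority >= 4) return 'medium';
  return 'low';
}

console.log(priorityBand(10)); // critical
console.log(priorityBand(7)); // high
console.log(priorityBand(5)); // medium
console.log(priorityBand(2)); // low
```

Rejecting out-of-range values early avoids tasks that sort unexpectedly in the queue.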

**Link parent tasks:**

```javascript
await createTask({
  task_type: 'verify_fix',
  assigned_to: 'qa',
  parent_task_id: 123, // Links to fix_bug task
  priority: 5,
});
```

### Message Design

**Use handoff for task completion:**

```javascript
await agent.sendMessage(taskId, 'qa', 'handoff', 'Bug fix complete, ready for verification', {
  commit: 'abc123',
  files_changed: ['src/scoring.js'],
});
```

**Use questions for clarification:**

```javascript
await agent.askQuestion(
  taskId,
  'developer',
  'Should this handle mobile and desktop screenshots differently?'
);
```

**Use notifications for FYI:**

```javascript
await agent.sendMessage(
  taskId,
  'architect',
  'notification',
  'Coverage gate blocked commit due to <85% coverage',
  { current_coverage: 78, required: 85 }
);
```

### Agent Development

**Keep agents focused:**

- Single responsibility principle
- One agent = one clear role
- Don't create "do everything" agents

**Log liberally:**

```javascript
await this.log(taskId, 'info', 'Starting bug fix analysis');
await this.log(taskId, 'info', 'Identified affected files', { files: [...] });
await this.log(taskId, 'info', 'Generated fix, checking coverage');
```

**Fail gracefully:**

```javascript
try {
  const result = await this.analyzeBug(task);
  return result;
} catch (error) {
  await this.log(task.id, 'error', 'Bug analysis failed', { error: error.message });
  await this.failTask(task.id, { reason: 'Analysis failed', error: error.message });
  return null; // Nothing usable to return; the task is already marked failed
}
```

**Validate inputs:**

```javascript
async processTask(task) {
  // context_json may be malformed, so parse it defensively
  let context;
  try {
    context = JSON.parse(task.context_json);
  } catch (error) {
    await this.failTask(task.id, { reason: 'Invalid context_json', error: error.message });
    return;
  }

  // Validate required fields
  if (!context || !context.error || !context.file) {
    await this.failTask(task.id, { reason: 'Missing required context fields' });
    return;
  }

  // Continue processing...
}
```

**Test thoroughly:**

- Unit tests for agent logic
- Integration tests for workflows
- Test error handling paths
- Verify circuit breaker behavior

### Monitoring and Maintenance

**Daily checks:**

```bash
# Check agent health
npm run agent:stats

# Review errors
npm run agent:logs -- --level error

# Check approval queue
npm run agent:approvals
```

**Weekly reviews:**

- Review circuit breaker triggers (if any)
- Analyze token usage trends
- Check task completion rates
- Review escalated items

**Monthly optimization:**

- Analyze agent effectiveness
- Optimize context files
- Update agent logic based on patterns
- Review and update approval thresholds

---

## Additional Resources

Paths are relative to the repository root.

- **Agent System Architecture:** `docs/06-automation/agent-system.md`
- **Workflow System:** `docs/06-automation/agent-workflow.md`
- **Base Agent Code:** `src/agents/base-agent.js`
- **Agent Implementations:** `src/agents/`
- **Context Files:** `src/agents/contexts/`
- **CLI Manager:** `src/cli/agent-manager.js`
- **Database Schema:** `db/schema.sql`
- **Migrations:** `db/migrations/041-create-agent-system.sql`

---

- **Last Updated:** 2026-02-15
- **Version:** 1.0
- **Status:** Production-ready