---
title: 'Circuit Breaker'
category: 'integrations'
last_verified: '2026-02-15'
related_files:
  - 'src/score.js'
  - 'src/scrape.js'
  - 'src/outreach/sms.js'
  - 'src/outreach/email.js'
  - 'src/utils/circuit-breaker.js'
tags: ['circuit', 'breaker', 'testing', 'api', 'ai', 'llm', 'email', 'sms']
status: 'current'
---

# Circuit Breaker Implementation

**Date**: 2026-01-27
**Status**: ✅ Completed
**Recommendation**: RECOMMENDATIONS.md #9

## Overview

The circuit breaker pattern is implemented with the `opossum` library to prevent cascading failures and excessive API costs when an external service fails repeatedly. Circuit breakers are in place for all external API calls:

- **OpenRouter API** (AI scoring)
- **ZenRows API** (SERP scraping)
- **Twilio API** (SMS sending)
- **Resend API** (Email sending)

## How It Works

Circuit breakers have three states:

1. **CLOSED** (Normal): All requests pass through to the API
2. **OPEN** (Failure): After too many failures, requests fail immediately without calling the API
3. **HALF_OPEN** (Testing): After the reset timeout elapses, a limited number of requests are allowed through to test whether the service has recovered
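
The three states above can be illustrated with a small hand-rolled state machine. This is only a sketch and not how `opossum` is implemented: `TinyBreaker`, `failureLimit`, `resetTimeoutMs`, and the injectable `now` clock are hypothetical names, and the sketch counts consecutive failures rather than a failure percentage to keep it short.

```javascript
// Illustrative only: a minimal breaker showing CLOSED -> OPEN -> HALF_OPEN -> CLOSED.
// All names here are hypothetical; the project itself relies on opossum.
class TinyBreaker {
  constructor({ failureLimit = 3, resetTimeoutMs = 60000, now = Date.now } = {}) {
    this.failureLimit = failureLimit;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;          // injectable clock, convenient for tests
    this.state = 'CLOSED';
    this.failures = 0;       // consecutive failures since the last success
    this.openedAt = 0;
  }

  async fire(action) {
    if (this.state === 'OPEN') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        // Fail fast: the wrapped action is never called while OPEN.
        throw new Error('Circuit is OPEN');
      }
      this.state = 'HALF_OPEN'; // reset timeout elapsed: allow a trial call
    }
    try {
      const result = await action();
      this.state = 'CLOSED';    // any success closes the circuit again
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial call, or too many failures, (re)opens the circuit.
      if (this.state === 'HALF_OPEN' || this.failures >= this.failureLimit) {
        this.state = 'OPEN';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

A call such as `await new TinyBreaker().fire(() => apiCall())` would either return the result of `apiCall()` or throw, depending on the current state.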

## Configuration

### Default Settings

```javascript
{
  timeout: 30000,                 // 30s request timeout
  errorThresholdPercentage: 50,   // Open at 50% failure rate
  resetTimeout: 60000,            // Wait 1 minute before testing recovery
  volumeThreshold: 5              // Need 5+ requests before circuit can open
}
```
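
To make the interaction between `errorThresholdPercentage` and `volumeThreshold` concrete, the following sketch works through the arithmetic. `shouldOpen` is a hypothetical helper written for this example, not part of `opossum` or of `src/utils/circuit-breaker.js`. It treats the threshold as inclusive, following the "Open at 50%" comment above; the exact boundary behavior, and the rolling window over which `opossum` counts requests, should be checked against the library.

```javascript
// Hypothetical helper mirroring the documented defaults; not opossum internals.
function shouldOpen(
  { fires, failures },
  { errorThresholdPercentage = 50, volumeThreshold = 5 } = {}
) {
  // Too few requests to judge: the circuit cannot open yet.
  if (fires < volumeThreshold) return false;
  const failurePct = (failures / fires) * 100;
  // Assumption: inclusive comparison, per the "Open at 50%" wording.
  return failurePct >= errorThresholdPercentage;
}

shouldOpen({ fires: 4, failures: 4 });    // false: below volumeThreshold
shouldOpen({ fires: 10, failures: 6 });   // true: 60% failure rate
shouldOpen({ fires: 100, failures: 5 });  // false: 5% failure rate
```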

### API-Specific Settings

- **OpenRouter**: 60s timeout, 2min reset (slow AI inference)
- **ZenRows**: 180s timeout, 2min reset (very slow SERP scraping - ZenRows recommended minimum)
- **Twilio**: 30s timeout, 1min reset (fast SMS API)
- **Resend**: 30s timeout, 1min reset (fast email API)
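
One way to express these per-API settings is to merge overrides onto the defaults. The sketch below is an assumption about how such a merge could look, not a copy of `src/utils/circuit-breaker.js`; `DEFAULTS`, `OVERRIDES`, and `breakerOptions` are hypothetical names, and only the numeric values come from this document.

```javascript
// Hypothetical config merge; values are taken from the settings listed above.
const DEFAULTS = {
  timeout: 30000,
  errorThresholdPercentage: 50,
  resetTimeout: 60000,
  volumeThreshold: 5,
};

const OVERRIDES = {
  OpenRouter: { timeout: 60000, resetTimeout: 120000 },  // slow AI inference
  ZenRows: { timeout: 180000, resetTimeout: 120000 },    // very slow SERP scraping
  Twilio: {},                                            // defaults apply
  Resend: {},                                            // defaults apply
};

function breakerOptions(name) {
  if (!Object.hasOwn(OVERRIDES, name)) {
    throw new Error(`Unknown API: ${name}`);
  }
  // Later spreads win, so per-API values override the defaults.
  return { ...DEFAULTS, ...OVERRIDES[name], name };
}
```

The resulting object could then be passed as the second argument to `new CircuitBreaker(action, options)` from `opossum`, which accepts these option names.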

## Usage

### Automatic Protection

All API calls in the following modules are automatically protected:

- `src/score.js` - OpenRouter scoring calls
- `src/scrape.js` - ZenRows SERP calls
- `src/outreach/sms.js` - Twilio SMS calls
- `src/outreach/email.js` - Resend email calls

### Manual Usage (if needed)

```javascript
import { openRouterBreaker } from './utils/circuit-breaker.js';

// Wrap any async function with the circuit breaker
const result = await openRouterBreaker.fire(async () => {
  return await apiCall();
});
```

### Monitoring

Check circuit breaker stats:

```javascript
import { getBreakerStats, openRouterBreaker } from './utils/circuit-breaker.js';

const stats = getBreakerStats(openRouterBreaker);
console.log(stats);
// {
//   name: 'OpenRouter',
//   state: 'CLOSED',
//   fires: 100,
//   successes: 95,
//   failures: 5,
//   rejects: 0,
//   timeouts: 0,
//   failureRate: '5.00%'
// }
```
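
For readers curious how a summary of this shape could be derived, here is a hedged sketch. It is not the actual `getBreakerStats` implementation, which this document does not show. `summarizeBreaker` is a hypothetical name, and computing `failureRate` as `failures / fires` is an assumption that happens to match the sample output above. The sketch reads the `stats`, `name`, `opened`, and `halfOpen` properties that an `opossum` breaker exposes.

```javascript
// Hypothetical summary helper; works on any object shaped like an opossum breaker.
function summarizeBreaker(breaker) {
  const { fires, successes, failures, rejects, timeouts } = breaker.stats;
  const state = breaker.halfOpen ? 'HALF_OPEN' : breaker.opened ? 'OPEN' : 'CLOSED';
  // Assumption: failure rate is failures over total fires; 0 when nothing has fired.
  const rate = fires > 0 ? (failures / fires) * 100 : 0;
  return {
    name: breaker.name,
    state,
    fires,
    successes,
    failures,
    rejects,
    timeouts,
    failureRate: `${rate.toFixed(2)}%`,
  };
}
```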

## Benefits

1. **Cost Savings**: Prevents repeated expensive API calls during outages
2. **Fast Failure**: Fails immediately when a service is down (no waiting for timeouts)
3. **Automatic Recovery**: Automatically tests for recovery and resumes calls when the service is back
4. **Visibility**: Logs all circuit state changes for debugging

## Testing

Circuit breaker behavior is tested in `tests/circuit-breaker.test.js`:

```bash
npm test -- tests/circuit-breaker.test.js
```

All tests passing:

- ✅ Tracks successful requests
- ✅ Tracks failed requests
- ✅ Opens circuit after threshold
- ✅ Rejects requests when open
- ✅ Handles timeouts correctly

## Logging

Circuit breaker events are logged with the following levels:

- **ERROR**: Circuit opened (too many failures)
- **INFO**: Circuit half-open (testing recovery)
- **SUCCESS**: Circuit closed (recovered)
- **WARN**: Requests rejected (circuit is open)

Example log output:

```
⚠️  Circuit breaker OPENED for Twilio - too many failures, blocking requests
❌ Request rejected by circuit breaker for Twilio - circuit is OPEN
🔄 Circuit breaker HALF-OPEN for Twilio - testing if service recovered
✅ Circuit breaker CLOSED for Twilio - service recovered
```
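
Log lines like these can be produced by listening to the breaker's state events. The sketch below is an assumption about the wiring, not the project's actual logging code. `attachBreakerLogging` and its `log` parameter are hypothetical, while `open`, `halfOpen`, `close`, and `reject` are events that `opossum` breakers emit. Because `console` has no `success` method, the SUCCESS level is mapped to `info` here.

```javascript
// Hypothetical wiring of breaker events to log levels.
// `breaker` is anything with an EventEmitter-style `on` method.
function attachBreakerLogging(breaker, name, log = console) {
  breaker.on('open', () =>
    log.error(`Circuit breaker OPENED for ${name} - too many failures, blocking requests`));
  breaker.on('reject', () =>
    log.warn(`Request rejected by circuit breaker for ${name} - circuit is OPEN`));
  breaker.on('halfOpen', () =>
    log.info(`Circuit breaker HALF-OPEN for ${name} - testing if service recovered`));
  breaker.on('close', () =>
    log.info(`Circuit breaker CLOSED for ${name} - service recovered`));
  return breaker;
}
```

Passing the logger in as a parameter keeps the helper testable and lets a project-specific logger replace `console`.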

## Files Changed

1. **Created**:
   - `src/utils/circuit-breaker.js` - Main circuit breaker module
   - `tests/circuit-breaker.test.js` - Test suite

2. **Updated**:
   - `src/score.js` - Added circuit breaker for OpenRouter
   - `src/scrape.js` - Added circuit breaker for ZenRows
   - `src/outreach/sms.js` - Added circuit breaker for Twilio
   - `src/outreach/email.js` - Added circuit breaker for Resend
   - `package.json` - Added opossum dependency
   - `docs/RECOMMENDATIONS.md` - Marked #9 as completed

## Dependencies

```json
{
  "opossum": "^8.1.4"
}
```

## Future Enhancements

Possible improvements:

1. **Metrics Integration**: Send circuit breaker stats to monitoring system
2. **Dashboard**: Visualize circuit breaker states in real-time
3. **Dynamic Configuration**: Adjust thresholds based on time of day/load
4. **Fallback Strategies**: Define fallback behaviors when circuit is open
5. **Circuit Breaker Pool**: Share circuit state across multiple processes

## References

- [opossum documentation](https://nodeshift.dev/opossum/)
- [Circuit Breaker Pattern](https://martinfowler.com/bliki/CircuitBreaker.html)
- RECOMMENDATIONS.md #11 (formerly #9)