circuit-breaker.md
---
title: 'Circuit Breaker'
category: 'integrations'
last_verified: '2026-02-15'
related_files:
  - 'src/score.js'
  - 'src/scrape.js'
  - 'src/outreach/sms.js'
  - 'src/outreach/email.js'
  - 'src/utils/circuit-breaker.js'
tags: ['circuit', 'breaker', 'testing', 'api', 'ai', 'llm', 'email', 'sms']
status: 'current'
---

# Circuit Breaker Implementation

**Date**: 2026-01-27
**Status**: ✅ Completed
**Recommendation**: RECOMMENDATIONS.md #9

## Overview

Implemented circuit breaker pattern using the `opossum` library to prevent cascading failures and excessive API costs during repeated failures. Circuit breakers are now in place for all external API calls:

- **OpenRouter API** (AI scoring)
- **ZenRows API** (SERP scraping)
- **Twilio API** (SMS sending)
- **Resend API** (Email sending)

## How It Works

Circuit breakers have three states:

1. **CLOSED** (Normal): All requests pass through normally
2. **OPEN** (Failure): After too many failures, requests fail immediately without calling the API
3. **HALF_OPEN** (Testing): After timeout period, allows limited requests to test if service recovered

## Configuration

### Default Settings

```javascript
{
  timeout: 30000, // 30s request timeout
  errorThresholdPercentage: 50, // Open at 50% failure rate
  resetTimeout: 60000, // Wait 1 minute before testing recovery
  volumeThreshold: 5 // Need 5+ requests before circuit can open
}
```

### API-Specific Settings

- **OpenRouter**: 60s timeout, 2min reset (slow AI inference)
- **ZenRows**: 180s timeout, 2min reset (very slow SERP scraping - ZenRows recommended minimum)
- **Twilio**: 30s timeout, 1min reset (fast SMS API)
- **Resend**: 30s timeout, 1min reset (fast email API)

## Usage

### Automatic Protection

All API calls in the following modules are now automatically protected:

- `src/score.js` - OpenRouter scoring calls
- `src/scrape.js` - ZenRows SERP calls
- `src/outreach/sms.js` - Twilio SMS calls
- `src/outreach/email.js` - Resend email calls

### Manual Usage (if needed)

```javascript
import { openRouterBreaker } from './utils/circuit-breaker.js';

// Wrap any async function with circuit breaker
const result = await openRouterBreaker.fire(async () => {
  return await apiCall();
});
```

### Monitoring

Check circuit breaker stats:

```javascript
import { getBreakerStats, openRouterBreaker } from './utils/circuit-breaker.js';

const stats = getBreakerStats(openRouterBreaker);
console.log(stats);
// {
//   name: 'OpenRouter',
//   state: 'CLOSED',
//   fires: 100,
//   successes: 95,
//   failures: 5,
//   rejects: 0,
//   timeouts: 0,
//   failureRate: '5.00%'
// }
```

## Benefits

1. **Cost Savings**: Prevents repeated expensive API calls during outages
2. **Fast Failure**: Fails immediately when service is down (no waiting for timeouts)
3. **Automatic Recovery**: Automatically tests and recovers when service is back
4. **Visibility**: Logs all circuit state changes for debugging

## Testing

Circuit breaker behavior is tested in `tests/circuit-breaker.test.js`:

```bash
npm test -- tests/circuit-breaker.test.js
```

All tests passing:

- ✅ Tracks successful requests
- ✅ Tracks failed requests
- ✅ Opens circuit after threshold
- ✅ Rejects requests when open
- ✅ Handles timeouts correctly

## Logging

Circuit breaker events are logged with the following levels:

- **ERROR**: Circuit opened (too many failures)
- **INFO**: Circuit half-open (testing recovery)
- **SUCCESS**: Circuit closed (recovered)
- **WARN**: Requests rejected (circuit is open)

Example log output:

```
⚠️ Circuit breaker OPENED for Twilio - too many failures, blocking requests
❌ Request rejected by circuit breaker for Twilio - circuit is OPEN
🔄 Circuit breaker HALF-OPEN for Twilio - testing if service recovered
✅ Circuit breaker CLOSED for Twilio - service recovered
```

## Files Changed

1. **Created**:
   - `src/utils/circuit-breaker.js` - Main circuit breaker module
   - `tests/circuit-breaker.test.js` - Test suite

2. **Updated**:
   - `src/score.js` - Added circuit breaker for OpenRouter
   - `src/scrape.js` - Added circuit breaker for ZenRows
   - `src/outreach/sms.js` - Added circuit breaker for Twilio
   - `src/outreach/email.js` - Added circuit breaker for Resend
   - `package.json` - Added opossum dependency
   - `docs/RECOMMENDATIONS.md` - Marked #9 as completed

## Dependencies

```json
{
  "opossum": "^8.1.4"
}
```

## Future Enhancements

Possible improvements:

1. **Metrics Integration**: Send circuit breaker stats to monitoring system
2. **Dashboard**: Visualize circuit breaker states in real-time
3. **Dynamic Configuration**: Adjust thresholds based on time of day/load
4. **Fallback Strategies**: Define fallback behaviors when circuit is open
5. **Circuit Breaker Pool**: Share circuit state across multiple processes

## References

- [opossum documentation](https://nodeshift.dev/opossum/)
- [Circuit Breaker Pattern](https://martinfowler.com/bliki/CircuitBreaker.html)
- RECOMMENDATIONS.md #11 (formerly #9)
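
## Appendix: State Transition Sketch

The CLOSED, OPEN and HALF_OPEN states described under "How It Works" can be illustrated with a small dependency-free sketch. This is not the code in `src/utils/circuit-breaker.js` and not how `opossum` is implemented. The class name `SketchBreaker` and the options `failureThreshold` and `now` are hypothetical. As a simplification it opens after a fixed number of consecutive failures, where the real configuration uses `errorThresholdPercentage` and `volumeThreshold`, and it does not limit concurrent trial calls while half-open.

```javascript
// Illustrative only: a minimal circuit breaker state machine.
// Assumption: opens after N consecutive failures (the real module uses
// opossum's percentage-based threshold instead).
class SketchBreaker {
  constructor(action, { failureThreshold = 3, resetTimeout = 60000, now = Date.now } = {}) {
    this.action = action;
    this.failureThreshold = failureThreshold;
    this.resetTimeout = resetTimeout;
    this.now = now; // injectable clock so the demo does not have to sleep
    this.state = 'CLOSED';
    this.failures = 0;
    this.openedAt = 0;
  }

  async fire(...args) {
    if (this.state === 'OPEN') {
      if (this.now() - this.openedAt < this.resetTimeout) {
        // Fail fast: the wrapped action is never called while OPEN.
        throw new Error('Breaker is OPEN');
      }
      // Reset timeout elapsed: let a trial request through.
      this.state = 'HALF_OPEN';
    }
    try {
      const result = await this.action(...args);
      // Any success closes the circuit and clears the failure count.
      this.state = 'CLOSED';
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      // A failed trial, or too many consecutive failures, (re)opens the circuit.
      if (this.state === 'HALF_OPEN' || this.failures >= this.failureThreshold) {
        this.state = 'OPEN';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}

// Demo: a fake service fails twice, is rejected once, then recovers.
async function demo() {
  let clock = 0;
  let healthy = false;
  const breaker = new SketchBreaker(
    async () => {
      if (!healthy) throw new Error('service down');
      return 'ok';
    },
    { failureThreshold: 2, resetTimeout: 1000, now: () => clock }
  );

  const states = [];
  for (let i = 0; i < 3; i++) {
    await breaker.fire().catch(() => {}); // ignore errors, record state only
    states.push(breaker.state);
  }

  clock = 1000; // advance the fake clock past resetTimeout
  healthy = true; // the service is back
  await breaker.fire(); // trial call succeeds
  states.push(breaker.state);
  return states;
}

demo().then((states) => console.log(states.join(' -> ')));
// prints "CLOSED -> OPEN -> OPEN -> CLOSED"
```

The third call in the loop is rejected without reaching the fake service, which is the behavior that saves API cost during an outage. The injected `now` clock is only there to keep the demo deterministic.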