---
title: 'Logging'
category: 'operations'
last_verified: '2026-02-15'
related_files:
  - 'src/cron/daily-log-rotation.js'
  - 'src/stages/keywords.js'
  - 'src/cli/keywords.js'
  - 'src/scrape.js'
  - 'src/capture.js'
tags: ['logging', 'cron', 'scheduling', 'testing', 'security', 'database', 'api', 'ai']
status: 'current'
---

# Log Files Documentation

This document provides a comprehensive reference for all log files in the 333 Method project.

## Overview

All logs are stored in the `./logs/` directory with daily rotation and automatic cleanup.

**Key Features:**

- **Naming Convention**: `<script-name>-YYYY-MM-DD.log`
- **Rotation Policy**: Daily rotation with 7-day retention by default
- **Format**: Timestamp, script name, log level, and message
- **Color Codes**: ANSI color codes automatically stripped for clean file output

## Log Rotation Commands

```bash
# Manual log rotation (delete files older than 7 days)
npm run logs:rotate

# Preview what would be deleted (dry run)
npm run logs:rotate:dry-run

# Rotate with 30-day retention
npm run logs:rotate:30d

# View real-time logs
tail -f logs/<script-name>-2026-02-08.log
```

**Automatic Rotation:**
Add `src/cron/daily-log-rotation.js` to crontab for daily cleanup at 2 AM:

```cron
0 2 * * * cd /path/to/333Method && node src/cron/daily-log-rotation.js
```

## Log Files by Category

### Dashboard Logs

#### Overview.log

**Purpose**: Main dashboard application log
**Source**: `dashboard/Overview.py`
**Contains**: Dashboard startup, page navigation, errors, user interactions
**Legacy Name**: Previously `app-2026-02-08.log` (now deprecated)
**Common Issues**:

- Coverage server port conflicts (port 8503 already in use)
- Database connection errors during startup (usually transient)
- Streamlit configuration issues

#### dashboard.log

**Purpose**: General dashboard runtime log
**Source**: `dashboard/run_with_logging.py`
**Contains**: Python logging output, Streamlit server messages, application-level events

#### dashboard.coverage.log

**Purpose**: Code coverage page log
**Source**: `dashboard/pages/5_📈_Code_Coverage.py`
**Contains**: Coverage report loading, HTTP server startup, iframe rendering

#### dashboard.cron_jobs.log

**Purpose**: Cron Jobs page log
**Source**: `dashboard/pages/7_⚙️_Cron_Jobs.py`
**Contains**: Job status queries, enable/disable actions, execution log displays

#### dashboard.pipeline_health.log

**Purpose**: Pipeline page log (formerly Pipeline Health)
**Source**: `dashboard/pages/1_🔧_Pipeline.py`
**Contains**: Pipeline funnel queries, error breakdowns, stuck sites analysis

#### dashboard.system_health.log

**Purpose**: System Health page log
**Source**: `dashboard/pages/6_🖥️_System_Health.py`
**Contains**: Database health checks, HTTP error tracking, API rate limit monitoring

### Pipeline Stage Logs

#### Keywords.log

**Purpose**: Keyword selection and prioritization stage
**Source**: `src/stages/keywords.js` or `src/cli/keywords.js`
**Contains**: Keyword selection, priority assignment, search volume validation

#### Scraper.log

**Purpose**: SERP scraping with ZenRows
**Source**: `src/scrape.js`
**Contains**: ZenRows API calls, SERP parsing, domain extraction, rate limiting

#### Capture.log

**Purpose**: Screenshot capture with Playwright
**Source**: `src/capture.js`
**Contains**: Browser automation, screenshot saves, DOM cropping, stealth mode actions

#### scoring.log / Scoring.log / ScoringCLI.log

**Purpose**: Initial AI scoring with GPT-4o-mini
**Source**: `src/score.js`, `src/stages/scoring.js`, `src/cli/scoring.js`
**Contains**: Vision API calls, score calculations, grade assignments, LLM token usage

#### rescoring.log / Rescoring.log / RescoringCLI.log

**Purpose**: Rescore B- and below with below-fold screenshots
**Source**: `src/stages/rescoring.js`, `src/cli/rescoring.js`
**Contains**: Below-fold captures, rescore decisions, grade improvements

#### enrich.log / Enrich.log

**Purpose**: Contact information enrichment
**Source**: `src/stages/enrich.js`, `src/cli/enrich.js`
**Contains**: Page browsing, contact extraction, phone/email/form detection

#### Proposals.log / ProposalGeneratorV2.log

**Purpose**: AI-generated proposal creation
**Source**: `src/stages/proposals.js`, `src/proposal-generator-v2.js`
**Contains**: Contact prioritization, proposal generation, personalization

#### Outreach.log

**Purpose**: Multi-channel outreach delivery
**Source**: `src/stages/outreach.js`, `src/cli/outreach.js`
**Contains**: Channel selection, delivery attempts, success/failure tracking

### Outreach Channel Logs

#### SMSOutreach.log

**Purpose**: SMS delivery via Twilio
**Source**: `src/outreach/sms.js`
**Contains**: Twilio API calls, SMS sending, delivery status, opt-out handling

#### EmailOutreach.log

**Purpose**: Email delivery via Resend
**Source**: `src/outreach/email.js`
**Contains**: Resend API calls, email sending, bounce/complaint tracking

#### FormOutreach.log

**Purpose**: Contact form submissions via Playwright
**Source**: `src/outreach/form.js`
**Contains**: Form detection, field filling, submission attempts, CAPTCHA handling

#### XOutreach.log

**Purpose**: Twitter/X DM delivery
**Source**: `src/outreach/x.js`
**Contains**: X.com automation, DM sending, profile rotation, bot detection avoidance

#### LinkedInOutreach.log

**Purpose**: LinkedIn message delivery
**Source**: `src/outreach/linkedin.js`
**Contains**: LinkedIn automation, connection requests, messaging, profile rotation

### Inbound Processing Logs

#### InboundSMS.log

**Purpose**: Inbound SMS webhook processing
**Source**: `src/inbound/sms.js`
**Contains**: Twilio webhook events, SMS replies, conversation threading

#### InboundEmail.log

**Purpose**: Inbound email webhook processing
**Source**: `src/inbound/email.js`
**Contains**: Resend webhook events, email replies, bounce/complaint handling

#### InboundProcessor.log

**Purpose**: General inbound message processing
**Source**: `src/inbound/processor.js`
**Contains**: Message routing, sentiment analysis, conversation management

### Utility & Helper Logs

#### ContactPrioritizer.log

**Purpose**: Contact extraction and prioritization
**Source**: `src/contacts/prioritize.js`
**Contains**: URI parsing, contact deduplication, channel preference logic

#### KeywordManager.log / KeywordCounters.log

**Purpose**: Keyword management and statistics
**Source**: `src/utils/keyword-manager.js`
**Contains**: Keyword status updates, search counts, processing stats

#### Dedupe.log / DedupeLocale.log

**Purpose**: Domain deduplication
**Source**: `src/utils/dedupe-domains.js`, `src/utils/dedupe-locale-aware.js`
**Contains**: Duplicate detection, locale matching, cross-border filtering

#### DOMCropAnalyzer.log

**Purpose**: Intelligent screenshot cropping
**Source**: `src/utils/dom-crop-analyzer.js`
**Contains**: DOM analysis, crop boundary calculations, element detection

#### ImageOptimizer.log

**Purpose**: Screenshot optimization
**Source**: `src/utils/image-optimizer.js`
**Contains**: Image compression, format conversion, size reduction

#### StealthBrowser.log

**Purpose**: Bot detection avoidance
**Source**: `src/utils/stealth-browser.js`
**Contains**: Browser fingerprinting, human-like behaviors, Cloudflare bypassing

#### ErrorHandler.log

**Purpose**: Error handling and retry logic
**Source**: `src/utils/error-handler.js`
**Contains**: Retry attempts, exponential backoff, batch processing errors

#### CircuitBreaker.log

**Purpose**: Circuit breaker pattern for API calls
**Source**: `src/utils/circuit-breaker.js`
**Contains**: Circuit state changes, failure thresholds, API health monitoring

#### TimezoneDetector.log

**Purpose**: Timezone detection from IP addresses
**Source**: `src/utils/timezone-detector.js`
**Contains**: IP geolocation, timezone mapping, IANA timezone assignment

### Compliance & Sync Logs

#### Compliance.log

**Purpose**: CAN-SPAM and TCPA compliance
**Source**: `src/utils/compliance.js`
**Contains**: Opt-out processing, unsubscribe link generation, consent validation

#### SyncEmailEvents.log

**Purpose**: Email event syncing from Cloudflare R2
**Source**: `src/utils/sync-email-events.js`
**Contains**: R2 bucket polling, event downloads, database syncing

#### SyncUnsubscribes.log

**Purpose**: Unsubscribe list syncing
**Source**: `src/utils/sync-unsubscribes.js`
**Contains**: Unsubscribe list updates, email/SMS opt-out tracking

### Cron & Automation Logs

#### cron.log / Cron.log

**Purpose**: Cron job execution
**Source**: `src/cron.js`, `src/cron/`
**Contains**: Scheduled task execution, job success/failure, timing information

#### cron-list.log

**Purpose**: Cron job listing
**Source**: CLI command `npm run cron list`
**Contains**: Active cron jobs, schedules, last run times

### Backfill & Maintenance Logs

#### BackfillScreenshots.log

**Purpose**: Backfill missing screenshots
**Source**: `scripts/backfill-screenshots.js`
**Contains**: Missing screenshot detection, batch recapture, progress tracking

#### maint-claude.log

**Purpose**: Claude Code maintenance sessions
**Source**: Development and debugging sessions
**Contains**: Interactive development, bug fixes, feature additions

### Profile Management Logs

#### profiles.log

**Purpose**: Browser profile management (X & LinkedIn)
**Source**: `src/utils/profile-manager.js`
**Contains**: Profile creation, LRU rotation, session cookie management

### Testing & Development Logs

#### test.log / test-bash-wrapper.log / test-context.log / test-logger.log / test_logging.log

**Purpose**: Test execution and debugging
**Source**: Various test files in `tests/`
**Contains**: Test results, assertion failures, mock API responses, coverage reports

#### Summary.log

**Purpose**: Summary generation utilities
**Source**: `src/utils/summary-generator.js`
**Contains**: Pipeline statistics, daily summaries, progress reports

#### 333.log

**Purpose**: Legacy or miscellaneous 333 Method scripts
**Source**: Various unnamed scripts
**Contains**: Ad-hoc operations, one-off tasks

## Log File Statistics

**Total Unique Log Types**: 57
**Total Log Files**: Varies daily (old files auto-deleted after 7 days)
**Average Daily Size**: ~10-20 MB (varies by pipeline activity)
**Storage Location**: `./logs/` (relative to project root)

## Common Log Patterns

### Success Pattern

```
2026-02-08T14:30:45.123Z [INFO] [Scoring] Successfully scored site example.com (Grade: B+, Score: 84)
```

### Error Pattern

```
2026-02-08T14:35:12.456Z [ERROR] [EmailOutreach] Failed to send email to contact@example.com: Rate limit exceeded
```

### Retry Pattern

```
2026-02-08T14:40:01.789Z [WARN] [Scraper] ZenRows API timeout, retrying (attempt 2/3)
```

### Database Error Pattern

```
2026-02-08T15:00:00.000Z [ERROR] [database] unable to open database file
```

## Troubleshooting Common Log Errors

### Coverage Server Port Conflict

**Error**: `OSError: [Errno 98] Address already in use`
**Log File**: `Overview.log`
**Cause**: Previous dashboard instance's coverage server still running
**Solution**: Server now handles this gracefully by reusing existing server

### Database Connection Error

**Error**: `sqlite3.OperationalError: unable to open database file`
**Log File**: `Overview.log`, `dashboard.log`
**Cause**: Transient race condition during Streamlit startup when multiple pages connect simultaneously
**Impact**: Dashboard continues to work despite error (uses cached connection)
**Solution**: Database path now uses absolute path for reliability

### Streamlit Cache Miss

**Error**: `CachedStFunctionWarning: Cached function mutated its input arguments`
**Log File**: `dashboard.*.log`
**Cause**: Streamlit cache detecting DataFrame mutations
**Solution**: Use `.copy()` on DataFrames before mutating

### Stuck Cron Jobs

**Error**: Jobs showing "Running..." status indefinitely
**Log File**: `cron.log`
**Cause**: Job killed/crashed without updating status
**Solution**: Run cleanup SQL:

```sql
DELETE FROM config WHERE key LIKE 'cron_%_running' AND datetime(updated_at, '+10 minutes') < datetime('now');
UPDATE cron_job_logs SET status = 'timeout' WHERE status = 'running' AND datetime(started_at, '+10 minutes') < datetime('now');
```

## Log Monitoring Best Practices

1. **Real-time Monitoring**: Use `tail -f logs/<script>-$(date +%Y-%m-%d).log` for active monitoring
2. **Error Detection**: `grep -i error logs/*.log` to find recent errors across all logs
3. **Performance Analysis**: Check duration_minutes in cron job logs for performance regression
4. **Disk Space**: Monitor `logs/` directory size, especially `dashboard.log` and `app.log` which can grow large
5. **Historical Analysis**: Review older logs before rotation for patterns and trends

## Related Documentation

- [Cron Jobs](./CRON-JOBS.md) - Scheduled task configuration
- [Dashboard](./DASHBOARD.md) - Analytics dashboard usage
- [Security](./SECURITY.md) - Security scanning and audit logs
- [Testing](../README.md#testing) - Test execution and coverage logs

## Implementation Details

**Logger Module**: `src/utils/logger.js`
**Log Rotator**: `src/utils/log-rotator.js`
**NPM Wrapper**: `scripts/npm-logger.js`

All major npm scripts automatically wrap with `node scripts/npm-logger.js <name> <command>` to capture stdout/stderr.
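
To make the wrapping step concrete, here is a minimal sketch of how such a wrapper could work. It is not the actual contents of `scripts/npm-logger.js`. The helper names (`stripAnsi`, `logFilePath`, `formatLine`, `runWithLogging`) are hypothetical, the date in the file name is assumed to be UTC, and the line layout follows the sample entries under Common Log Patterns.

```javascript
// Illustrative sketch only; the real scripts/npm-logger.js may differ.
const fs = require('node:fs');
const path = require('node:path');
const { spawn } = require('node:child_process');

// Matches ANSI color (SGR) escape sequences such as "\x1b[31m".
const ANSI_COLOR = /\x1b\[[0-9;]*m/g;

function stripAnsi(text) {
  return text.replace(ANSI_COLOR, '');
}

// Builds "<dir>/<name>-YYYY-MM-DD.log". Assumption: the date is taken in UTC.
function logFilePath(name, date = new Date(), dir = 'logs') {
  const day = date.toISOString().slice(0, 10);
  return path.join(dir, `${name}-${day}.log`);
}

// Formats one entry in the order used by the sample log lines:
// timestamp, [LEVEL], [script name], message.
function formatLine(name, level, message, date = new Date()) {
  return `${date.toISOString()} [${level}] [${name}] ${stripAnsi(message)}`;
}

// Runs a shell command, echoes its output to the terminal, and appends a
// color-free copy to today's log file. Resolves with the exit code.
function runWithLogging(name, command, dir = 'logs') {
  fs.mkdirSync(dir, { recursive: true });
  const file = fs.createWriteStream(logFilePath(name, new Date(), dir), { flags: 'a' });
  const child = spawn(command, [], { shell: true, stdio: ['inherit', 'pipe', 'pipe'] });

  const tee = (target) => (chunk) => {
    target.write(chunk); // keep colors on the terminal
    file.write(stripAnsi(chunk.toString())); // strip them in the file
  };
  child.stdout.on('data', tee(process.stdout));
  child.stderr.on('data', tee(process.stderr));

  return new Promise((resolve, reject) => {
    child.on('error', reject);
    // A null exit code means the child was killed by a signal; report 1.
    child.on('close', (code) => file.end(() => resolve(code ?? 1)));
  });
}
```

A call such as `runWithLogging('Scraper', 'node src/scrape.js')` would append to `logs/Scraper-<today>.log`. This sketch does not handle an escape sequence that is split across two output chunks, which a production wrapper would likely need to buffer for.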