---
title: 'Logging'
category: 'operations'
last_verified: '2026-02-15'
related_files:
  - 'src/cron/daily-log-rotation.js'
  - 'src/stages/keywords.js'
  - 'src/cli/keywords.js'
  - 'src/scrape.js'
  - 'src/capture.js'
tags: ['logging', 'cron', 'scheduling', 'testing', 'security', 'database', 'api', 'ai']
status: 'current'
---

# Log Files Documentation

This document is a reference for every log file in the 333 Method project.

## Overview

All logs are stored in the `./logs/` directory with daily rotation and automatic cleanup.

**Key Features:**

- **Naming Convention**: `<script-name>-YYYY-MM-DD.log`
- **Rotation Policy**: Daily rotation with 7-day retention by default
- **Format**: Timestamp, log level, script name, and message
- **Color Codes**: ANSI color codes are stripped automatically for clean file output
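The conventions above can be sketched in a few lines of Node.js. This is a minimal illustration, not the code in `src/utils/logger.js`: the helper names `stripAnsi`, `logFileName`, and `formatLine` are hypothetical, the line layout is inferred from the examples under "Common Log Patterns", and the use of the UTC date for the file name is an assumption.

```javascript
// Hypothetical helpers; the real logic lives in src/utils/logger.js and may differ.

// Matches ANSI SGR color sequences such as "\x1b[31m" or "\x1b[1;32m".
const ANSI_COLOR = /\x1b\[[0-9;]*m/g;

function stripAnsi(text) {
  return text.replace(ANSI_COLOR, '');
}

// Builds "<script-name>-YYYY-MM-DD.log" (assumption: the UTC calendar date is used).
function logFileName(scriptName, date = new Date()) {
  return `${scriptName}-${date.toISOString().slice(0, 10)}.log`;
}

// Builds one line as "<ISO timestamp> [LEVEL] [script] message", with colors removed.
function formatLine(scriptName, level, message, date = new Date()) {
  return `${date.toISOString()} [${level.toUpperCase()}] [${scriptName}] ${stripAnsi(message)}`;
}

const when = new Date('2026-02-08T14:30:45.123Z');
console.log(logFileName('Scoring', when));
// Scoring-2026-02-08.log
console.log(formatLine('Scoring', 'info', '\x1b[32mSuccessfully scored site example.com\x1b[0m', when));
// 2026-02-08T14:30:45.123Z [INFO] [Scoring] Successfully scored site example.com
```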

## Log Rotation Commands

```bash
# Manual log rotation (delete files older than 7 days)
npm run logs:rotate

# Preview what would be deleted (dry run)
npm run logs:rotate:dry-run

# Rotate with 30-day retention
npm run logs:rotate:30d

# View real-time logs
tail -f logs/<script-name>-2026-02-08.log
```

**Automatic Rotation:**
Add `src/cron/daily-log-rotation.js` to crontab for daily cleanup at 2 AM:

```cron
0 2 * * * cd /path/to/333Method && node src/cron/daily-log-rotation.js
```
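As a rough illustration of what retention-based rotation with a dry-run mode involves, the sketch below selects files by the date embedded in their names. It is not the code in `src/utils/log-rotator.js`: the function names `findExpiredLogs` and `rotateLogs` are hypothetical, and the real rotator may decide by file modification time rather than by file name.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Assumed file-name shape: <script-name>-YYYY-MM-DD.log
const DATED_LOG = /-(\d{4}-\d{2}-\d{2})\.log$/;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the names whose embedded date is more than `retentionDays` days before `now`.
function findExpiredLogs(fileNames, retentionDays = 7, now = new Date()) {
  const cutoff = now.getTime() - retentionDays * MS_PER_DAY;
  return fileNames.filter((name) => {
    const match = DATED_LOG.exec(name);
    if (!match) return false; // leave files that do not follow the convention
    return Date.parse(`${match[1]}T00:00:00Z`) < cutoff;
  });
}

// Deletes expired logs in `logDir`; with dryRun it only reports what would be deleted.
function rotateLogs(logDir, { retentionDays = 7, dryRun = false, now = new Date() } = {}) {
  const expired = findExpiredLogs(fs.readdirSync(logDir), retentionDays, now);
  if (!dryRun) {
    for (const name of expired) fs.unlinkSync(path.join(logDir, name));
  }
  return expired;
}

const now = new Date('2026-02-15T02:00:00Z');
console.log(findExpiredLogs(['Scraper-2026-02-01.log', 'Scraper-2026-02-14.log', 'notes.txt'], 7, now));
// [ 'Scraper-2026-02-01.log' ]
```

Under these assumptions, `rotateLogs('./logs', { dryRun: true })` would correspond to the dry-run command and `rotateLogs('./logs', { retentionDays: 30 })` to the 30-day variant.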

## Log Files by Category

### Dashboard Logs

#### Overview.log

**Purpose**: Main dashboard application log
**Source**: `dashboard/Overview.py`
**Contains**: Dashboard startup, page navigation, errors, user interactions
**Legacy Name**: Previously `app-2026-02-08.log` (now deprecated)
**Common Issues**:

- Coverage server port conflicts (port 8503 already in use)
- Database connection errors during startup (usually transient)
- Streamlit configuration issues

#### dashboard.log

**Purpose**: General dashboard runtime log
**Source**: `dashboard/run_with_logging.py`
**Contains**: Python logging output, Streamlit server messages, application-level events

#### dashboard.coverage.log

**Purpose**: Code coverage page log
**Source**: `dashboard/pages/5_📈_Code_Coverage.py`
**Contains**: Coverage report loading, HTTP server startup, iframe rendering

#### dashboard.cron_jobs.log

**Purpose**: Cron Jobs page log
**Source**: `dashboard/pages/7_⚙️_Cron_Jobs.py`
**Contains**: Job status queries, enable/disable actions, execution log displays

#### dashboard.pipeline_health.log

**Purpose**: Pipeline page log (formerly Pipeline Health)
**Source**: `dashboard/pages/1_🔧_Pipeline.py`
**Contains**: Pipeline funnel queries, error breakdowns, stuck sites analysis

#### dashboard.system_health.log

**Purpose**: System Health page log
**Source**: `dashboard/pages/6_🖥️_System_Health.py`
**Contains**: Database health checks, HTTP error tracking, API rate limit monitoring

### Pipeline Stage Logs

#### Keywords.log

**Purpose**: Keyword selection and prioritization stage
**Source**: `src/stages/keywords.js` or `src/cli/keywords.js`
**Contains**: Keyword selection, priority assignment, search volume validation

#### Scraper.log

**Purpose**: SERP scraping with ZenRows
**Source**: `src/scrape.js`
**Contains**: ZenRows API calls, SERP parsing, domain extraction, rate limiting

#### Capture.log

**Purpose**: Screenshot capture with Playwright
**Source**: `src/capture.js`
**Contains**: Browser automation, screenshot saves, DOM cropping, stealth mode actions

#### scoring.log / Scoring.log / ScoringCLI.log

**Purpose**: Initial AI scoring with GPT-4o-mini
**Source**: `src/score.js`, `src/stages/scoring.js`, `src/cli/scoring.js`
**Contains**: Vision API calls, score calculations, grade assignments, LLM token usage

#### rescoring.log / Rescoring.log / RescoringCLI.log

**Purpose**: Rescore B- and below with below-fold screenshots
**Source**: `src/stages/rescoring.js`, `src/cli/rescoring.js`
**Contains**: Below-fold captures, rescore decisions, grade improvements

#### enrich.log / Enrich.log

**Purpose**: Contact information enrichment
**Source**: `src/stages/enrich.js`, `src/cli/enrich.js`
**Contains**: Page browsing, contact extraction, phone/email/form detection

#### Proposals.log / ProposalGeneratorV2.log

**Purpose**: AI-generated proposal creation
**Source**: `src/stages/proposals.js`, `src/proposal-generator-v2.js`
**Contains**: Contact prioritization, proposal generation, personalization

#### Outreach.log

**Purpose**: Multi-channel outreach delivery
**Source**: `src/stages/outreach.js`, `src/cli/outreach.js`
**Contains**: Channel selection, delivery attempts, success/failure tracking

### Outreach Channel Logs

#### SMSOutreach.log

**Purpose**: SMS delivery via Twilio
**Source**: `src/outreach/sms.js`
**Contains**: Twilio API calls, SMS sending, delivery status, opt-out handling

#### EmailOutreach.log

**Purpose**: Email delivery via Resend
**Source**: `src/outreach/email.js`
**Contains**: Resend API calls, email sending, bounce/complaint tracking

#### FormOutreach.log

**Purpose**: Contact form submissions via Playwright
**Source**: `src/outreach/form.js`
**Contains**: Form detection, field filling, submission attempts, CAPTCHA handling

#### XOutreach.log

**Purpose**: Twitter/X DM delivery
**Source**: `src/outreach/x.js`
**Contains**: X.com automation, DM sending, profile rotation, bot detection avoidance

#### LinkedInOutreach.log

**Purpose**: LinkedIn message delivery
**Source**: `src/outreach/linkedin.js`
**Contains**: LinkedIn automation, connection requests, messaging, profile rotation

### Inbound Processing Logs

#### InboundSMS.log

**Purpose**: Inbound SMS webhook processing
**Source**: `src/inbound/sms.js`
**Contains**: Twilio webhook events, SMS replies, conversation threading

#### InboundEmail.log

**Purpose**: Inbound email webhook processing
**Source**: `src/inbound/email.js`
**Contains**: Resend webhook events, email replies, bounce/complaint handling

#### InboundProcessor.log

**Purpose**: General inbound message processing
**Source**: `src/inbound/processor.js`
**Contains**: Message routing, sentiment analysis, conversation management

### Utility & Helper Logs

#### ContactPrioritizer.log

**Purpose**: Contact extraction and prioritization
**Source**: `src/contacts/prioritize.js`
**Contains**: URI parsing, contact deduplication, channel preference logic

#### KeywordManager.log / KeywordCounters.log

**Purpose**: Keyword management and statistics
**Source**: `src/utils/keyword-manager.js`
**Contains**: Keyword status updates, search counts, processing stats

#### Dedupe.log / DedupeLocale.log

**Purpose**: Domain deduplication
**Source**: `src/utils/dedupe-domains.js`, `src/utils/dedupe-locale-aware.js`
**Contains**: Duplicate detection, locale matching, cross-border filtering

#### DOMCropAnalyzer.log

**Purpose**: Intelligent screenshot cropping
**Source**: `src/utils/dom-crop-analyzer.js`
**Contains**: DOM analysis, crop boundary calculations, element detection

#### ImageOptimizer.log

**Purpose**: Screenshot optimization
**Source**: `src/utils/image-optimizer.js`
**Contains**: Image compression, format conversion, size reduction

#### StealthBrowser.log

**Purpose**: Bot detection avoidance
**Source**: `src/utils/stealth-browser.js`
**Contains**: Browser fingerprinting, human-like behaviors, Cloudflare bypassing

#### ErrorHandler.log

**Purpose**: Error handling and retry logic
**Source**: `src/utils/error-handler.js`
**Contains**: Retry attempts, exponential backoff, batch processing errors

#### CircuitBreaker.log

**Purpose**: Circuit breaker pattern for API calls
**Source**: `src/utils/circuit-breaker.js`
**Contains**: Circuit state changes, failure thresholds, API health monitoring

#### TimezoneDetector.log

**Purpose**: Timezone detection from IP addresses
**Source**: `src/utils/timezone-detector.js`
**Contains**: IP geolocation, timezone mapping, IANA timezone assignment

### Compliance & Sync Logs

#### Compliance.log

**Purpose**: CAN-SPAM and TCPA compliance
**Source**: `src/utils/compliance.js`
**Contains**: Opt-out processing, unsubscribe link generation, consent validation

#### SyncEmailEvents.log

**Purpose**: Email event syncing from Cloudflare R2
**Source**: `src/utils/sync-email-events.js`
**Contains**: R2 bucket polling, event downloads, database syncing

#### SyncUnsubscribes.log

**Purpose**: Unsubscribe list syncing
**Source**: `src/utils/sync-unsubscribes.js`
**Contains**: Unsubscribe list updates, email/SMS opt-out tracking

### Cron & Automation Logs

#### cron.log / Cron.log

**Purpose**: Cron job execution
**Source**: `src/cron.js`, `src/cron/`
**Contains**: Scheduled task execution, job success/failure, timing information

#### cron-list.log

**Purpose**: Cron job listing
**Source**: CLI command `npm run cron list`
**Contains**: Active cron jobs, schedules, last run times

### Backfill & Maintenance Logs

#### BackfillScreenshots.log

**Purpose**: Backfill missing screenshots
**Source**: `scripts/backfill-screenshots.js`
**Contains**: Missing screenshot detection, batch recapture, progress tracking

#### maint-claude.log

**Purpose**: Claude Code maintenance sessions
**Source**: Development and debugging sessions
**Contains**: Interactive development, bug fixes, feature additions

### Profile Management Logs

#### profiles.log

**Purpose**: Browser profile management (X & LinkedIn)
**Source**: `src/utils/profile-manager.js`
**Contains**: Profile creation, LRU rotation, session cookie management

### Testing & Development Logs

#### test.log / test-bash-wrapper.log / test-context.log / test-logger.log / test_logging.log

**Purpose**: Test execution and debugging
**Source**: Various test files in `tests/`
**Contains**: Test results, assertion failures, mock API responses, coverage reports

#### Summary.log

**Purpose**: Summary generation utilities
**Source**: `src/utils/summary-generator.js`
**Contains**: Pipeline statistics, daily summaries, progress reports

#### 333.log

**Purpose**: Legacy or miscellaneous 333 Method scripts
**Source**: Various unnamed scripts
**Contains**: Ad-hoc operations, one-off tasks

## Log File Statistics

**Total Unique Log Types**: 57
**Total Log Files**: Varies daily (old files auto-deleted after 7 days)
**Average Daily Size**: ~10-20 MB (varies by pipeline activity)
**Storage Location**: `./logs/` (relative to project root)

## Common Log Patterns

### Success Pattern

```
2026-02-08T14:30:45.123Z [INFO] [Scoring] Successfully scored site example.com (Grade: B+, Score: 84)
```

### Error Pattern

```
2026-02-08T14:35:12.456Z [ERROR] [EmailOutreach] Failed to send email to contact@example.com: Rate limit exceeded
```

### Retry Pattern

```
2026-02-08T14:40:01.789Z [WARN] [Scraper] ZenRows API timeout, retrying (attempt 2/3)
```

### Database Error Pattern

```
2026-02-08T15:00:00.000Z [ERROR] [database] unable to open database file
```
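Lines in the layout shown above can be split into fields with a single regular expression, which is handy for ad-hoc filtering. The sketch below is one way to do it; `parseLogLine` is a hypothetical helper, and the pattern assumes every entry follows the `<timestamp> [LEVEL] [source] message` shape of these examples.

```javascript
// Assumed layout, inferred from the examples: <ISO timestamp> [LEVEL] [source] message
const LOG_LINE = /^(\S+) \[([A-Z]+)\] \[([^\]]+)\] (.*)$/;

// Returns the parsed fields, or null for lines that do not match
// (for example stack-trace continuation lines).
function parseLogLine(line) {
  const match = LOG_LINE.exec(line);
  if (!match) return null;
  const [, timestamp, level, source, message] = match;
  return { timestamp: new Date(timestamp), level, source, message };
}

const entry = parseLogLine(
  '2026-02-08T14:40:01.789Z [WARN] [Scraper] ZenRows API timeout, retrying (attempt 2/3)'
);
console.log(entry.level, entry.source);
// WARN Scraper
```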

## Troubleshooting Common Log Errors

### Coverage Server Port Conflict

**Error**: `OSError: [Errno 98] Address already in use`
**Log File**: `Overview.log`
**Cause**: The coverage server from a previous dashboard instance is still running
**Solution**: The dashboard now handles this gracefully by reusing the existing server

### Database Connection Error

**Error**: `sqlite3.OperationalError: unable to open database file`
**Log File**: `Overview.log`, `dashboard.log`
**Cause**: Transient race condition during Streamlit startup when multiple pages connect simultaneously
**Impact**: The dashboard continues to work despite the error (it uses a cached connection)
**Solution**: The database path is now resolved as an absolute path for reliability

### Streamlit Cache Mutation Warning

**Error**: `CachedStFunctionWarning: Cached function mutated its input arguments`
**Log File**: `dashboard.*.log`
**Cause**: The Streamlit cache detected a DataFrame being mutated
**Solution**: Call `.copy()` on DataFrames before mutating them

### Stuck Cron Jobs

**Error**: Jobs show "Running..." status indefinitely
**Log File**: `cron.log`
**Cause**: The job was killed or crashed without updating its status
**Solution**: Run the cleanup SQL below, which clears run flags and marks jobs as timed out once they are more than 10 minutes old:

```sql
DELETE FROM config WHERE key LIKE 'cron_%_running' AND datetime(updated_at, '+10 minutes') < datetime('now');
UPDATE cron_job_logs SET status = 'timeout' WHERE status = 'running' AND datetime(started_at, '+10 minutes') < datetime('now');
```

## Log Monitoring Best Practices

1. **Real-time Monitoring**: Use `tail -f logs/<script>-$(date +%Y-%m-%d).log` for active monitoring
2. **Error Detection**: Run `grep -i error logs/*.log` to find recent errors across all logs
3. **Performance Analysis**: Check `duration_minutes` in cron job logs for performance regressions
4. **Disk Space**: Monitor the size of the `logs/` directory, especially `dashboard.log` and the legacy `app.log`, which can grow large
5. **Historical Analysis**: Review older logs before they are rotated out to spot patterns and trends
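For the disk-space check, a small script can total the bytes used per script name so the largest producers stand out. This is only a sketch: `logSizesByScript` is a hypothetical helper, not part of the project, and it assumes the `<script-name>-YYYY-MM-DD.log` naming convention.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Sums file sizes per script name, assuming <script-name>-YYYY-MM-DD.log.
function logSizesByScript(logDir) {
  const totals = new Map();
  for (const name of fs.readdirSync(logDir)) {
    const match = /^(.+)-\d{4}-\d{2}-\d{2}\.log$/.exec(name);
    if (!match) continue; // skip files outside the naming convention
    const { size } = fs.statSync(path.join(logDir, name));
    totals.set(match[1], (totals.get(match[1]) || 0) + size);
  }
  // Largest first, so the noisiest scripts appear at the top.
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}

// Print a report only when run from a directory that has ./logs.
if (fs.existsSync('./logs')) {
  for (const [script, bytes] of logSizesByScript('./logs')) {
    console.log(`${script}: ${(bytes / 1024 / 1024).toFixed(2)} MB`);
  }
}
```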

## Related Documentation

- [Cron Jobs](./CRON-JOBS.md) - Scheduled task configuration
- [Dashboard](./DASHBOARD.md) - Analytics dashboard usage
- [Security](./SECURITY.md) - Security scanning and audit logs
- [Testing](../README.md#testing) - Test execution and coverage logs

## Implementation Details

**Logger Module**: `src/utils/logger.js`
**Log Rotator**: `src/utils/log-rotator.js`
**NPM Wrapper**: `scripts/npm-logger.js`

All major npm scripts are automatically wrapped with `node scripts/npm-logger.js <name> <command>` to capture stdout/stderr.
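To make the wrapping idea concrete, the sketch below runs a command and appends its stdout and stderr to a dated log file. It is not the contents of `scripts/npm-logger.js`: `runWithLogging` is a hypothetical function, and it buffers output with `spawnSync` for brevity, whereas a real wrapper would more likely stream output as it arrives.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const { spawnSync } = require('node:child_process');

// Runs `command` and appends its stdout/stderr to <logDir>/<name>-YYYY-MM-DD.log.
function runWithLogging(name, command, args = [], logDir = './logs') {
  fs.mkdirSync(logDir, { recursive: true });
  const day = new Date().toISOString().slice(0, 10); // assumption: UTC date
  const logFile = path.join(logDir, `${name}-${day}.log`);

  // Buffers all output until the child exits (simplification).
  const result = spawnSync(command, args, { encoding: 'utf8' });
  const output = `${result.stdout || ''}${result.stderr || ''}`;

  // Strip ANSI color codes so the file stays readable.
  fs.appendFileSync(logFile, output.replace(/\x1b\[[0-9;]*m/g, ''));

  // A null status means the child was killed by a signal or failed to start.
  return { logFile, status: result.status ?? 1 };
}
```

Under these assumptions, `runWithLogging('scoring', 'node', ['src/cli/scoring.js'])` would write to a file such as `logs/scoring-2026-02-08.log` and return the child's exit status.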