# End-to-End Pipeline Test

This test validates the complete pipeline flow for a single website through all processing stages, **including actual outreach sending**.

## Overview

The E2E test performs the following:

1. **Setup**: Creates an isolated test database and adds a keyword and site
2. **Assets Stage**: Captures screenshots (desktop above/below + mobile above, cropped + uncropped)
3. **Scoring Stage**: Performs initial AI scoring with GPT-4o-mini
4. **Rescoring Stage**: Re-scores with below-fold screenshots if score ≤ 82 (B- or below)
5. **Enrichment Stage**: Browses key pages to find additional contact information
6. **Proposals Stage**: Generates 3 proposal variants if score ≤ 82
7. **Outreach Stage**: **ACTUALLY SENDS** SMS and emails and submits contact forms (⚠️ real outreach!)
8. **Final Verification**: Validates outputs against expected values and confirms complete pipeline execution

## ⚠️ Important Warnings

- **This test WILL send real SMS, send real emails, and submit contact forms** to your test site
- Only run this test on websites **you own and control**
- Ensure you have permission to receive test messages at the contact details listed on your test site
- Test artifacts are **preserved for manual review** (not cleaned up automatically)
- You will receive actual SMS and emails from this test, so make sure you're ready for that

## Configuration

### 1. Set Test URL and Keyword

Add these to your `.env` file:

```
# End-to-End Pipeline Test Configuration
TEST_E2E_URL=https://yourwebsite.com  # Must be YOUR website!
TEST_E2E_KEYWORD=your test keyword
```

**Critical**: Use one of your own websites where you control the contact information.

### 2. Configure Expected Values (Optional)

After running the test once, you can edit two optional files to enable validation on subsequent runs:

**tests/expected-e2e.json** - Score, HTML, and grade expectations (`patterns` is for HTML, and `should_contain` is for text content):

```json
{
  "expected": {
    "conversion_score_json": {
      "overall_calculation": {
        "conversion_score": 75,
        "grade": "B-"
      }
    },
    "score_range": {
      "min": 70,
      "max": 85
    },
    "expected_grade": "B-",
    "html_dom": {
      "min_length": 1000,
      "patterns": ["<!DOCTYPE html>", "<title>", "</html>"],
      "should_contain": ["expected keyword", "expected text"]
    }
  }
}
```

**tests/expected-e2e-contacts.json** - Expected contact information structure:

```json
{
  "email_addresses": [
    {"email": "contact@example.com", "label": "Contact"}
  ],
  "phone_numbers": [
    {"number": "+1234567890", "label": "Main"}
  ],
  "primary_contact_form": {
    "form_action_url": "https://example.com/contact",
    "fields": { ... }
  },
  "social_profiles": ["https://linkedin.com/..."]
}
```

This enables regression testing: the test validates future runs against your expected values.

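The comparison the test performs can be pictured roughly as follows. This is an illustrative sketch, not the test's actual code: `validateAgainstExpected` is a hypothetical helper, though the field names mirror `expected-e2e.json` above.

```javascript
// Hypothetical sketch of the range/pattern checks. Field names come from
// expected-e2e.json; the function itself is illustrative.
function validateAgainstExpected(actual, expected) {
  const failures = [];

  // Score must fall inside the configured range.
  const { min, max } = expected.score_range;
  if (actual.score < min || actual.score > max) {
    failures.push(`score ${actual.score} outside [${min}, ${max}]`);
  }

  // Grade must match exactly.
  if (actual.grade !== expected.expected_grade) {
    failures.push(`grade ${actual.grade} != ${expected.expected_grade}`);
  }

  // HTML must be long enough and contain every pattern and text snippet.
  const { min_length, patterns = [], should_contain = [] } = expected.html_dom;
  if (actual.html.length < min_length) {
    failures.push(`html shorter than ${min_length} chars`);
  }
  for (const p of [...patterns, ...should_contain]) {
    if (!actual.html.includes(p)) failures.push(`html missing "${p}"`);
  }
  return failures; // empty array means the run matches expectations
}
```
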
## Running the Test

```bash
npm run test:e2e
```

The test will:

- Create an isolated test database (`db/test-e2e.db`)
- Create a test screenshot directory (`screenshots-test-e2e/`)
- Run through all pipeline stages
- **Actually send outreach messages to your test site**
- Save actual outputs to `test-results-e2e.json`
- **PRESERVE all test artifacts** for manual review (no automatic cleanup)

## What the Test Validates

### Stage 0: Setup

- ✅ Removes any leftovers from previous tests
- ✅ Keyword inserted into database
- ✅ Site inserted with correct status (`found`)

### Stage 1: Assets

- ✅ Screenshots captured successfully
- ✅ 6 screenshot files created (3 cropped + 3 uncropped)
- ✅ Site status updated to `captured`
- ✅ `screenshot_path` field populated

### Stage 2: Scoring

- ✅ Site scored with GPT-4o-mini
- ✅ Score is a valid number (0-100)
- ✅ Grade assigned (A+, A, A-, B+, B, B-, C, D, E, or F)
- ✅ Scoring reasoning captured
- ✅ Site status updated to `scored`
- ✅ Validates against expected score range (if configured)

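The grade boundaries are decided by the scoring stage; this README only documents that a score of 82 or below corresponds to B- or worse. A sketch of what such a mapping might look like (every cutoff other than the 82/B- boundary is an assumption):

```javascript
// Illustrative score-to-grade mapping. Only the 82/B- boundary is
// documented by this test; all other cutoffs below are assumptions.
function gradeFor(score) {
  const cutoffs = [
    [97, 'A+'], [93, 'A'], [90, 'A-'],
    [87, 'B+'], [83, 'B'], [80, 'B-'],
    [70, 'C'], [60, 'D'], [50, 'E'],
  ];
  for (const [min, grade] of cutoffs) {
    if (score >= min) return grade;
  }
  return 'F';
}

// Rescoring and proposals trigger at B- or below (score <= 82).
const needsProposals = (score) => score <= 82;
```
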
### Stage 3: Rescoring (if score ≤ 82)

- ✅ Site re-scored with below-fold screenshots
- ✅ New score and grade assigned
- ✅ `rescored_at` timestamp set
- ✅ Improvement tracked

### Stage 5: Enrichment

- ✅ Key pages browsed for additional contact info
- ✅ Contact forms discovered (if not already found)
- ✅ Additional emails and phone numbers found
- ✅ Site status updated to `enriched`
- ✅ `enriched_at` timestamp set
- ✅ Stage skipped entirely if a contact form already exists

### Stage 6: Proposals (if score ≤ 82)

- ✅ 3 proposal variants generated
- ✅ Each variant has a subject and message
- ✅ Outreach records created with `pending` status

### Stage 7: Outreach ⚠️ REAL SENDING

- ✅ Contact methods prioritized
- ✅ Outreach records updated with contact info
- ⚠️ **SMS messages SENT** (if a phone number was found)
- ⚠️ **Emails SENT** (if an email address was found)
- ⚠️ **Contact forms SUBMITTED** (if a form was found)
- ✅ Outreach status updated to `sent` or `failed`
- ✅ Delivery tracked in database

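Contact-method prioritization could be sketched like this. The function name and the ordering (form, then email, then SMS) are illustrative assumptions; only the field names come from `expected-e2e-contacts.json`:

```javascript
// Hypothetical sketch of contact-method prioritization. Field names mirror
// expected-e2e-contacts.json; the ordering here is an assumption.
function pickContactMethods(contacts) {
  const methods = [];
  if (contacts.primary_contact_form) {
    methods.push({ channel: 'form', target: contacts.primary_contact_form.form_action_url });
  }
  for (const e of contacts.email_addresses ?? []) {
    methods.push({ channel: 'email', target: e.email });
  }
  for (const p of contacts.phone_numbers ?? []) {
    methods.push({ channel: 'sms', target: p.number });
  }
  return methods; // one outreach attempt per method, in priority order
}
```
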
### Stage 8: Final Verification

- ✅ Complete pipeline flow verified
- ✅ All expected data present in database
- ✅ Validates against expected values (if configured)
- ✅ Outputs saved to `test-results-e2e.json`

## Test Artifacts (Preserved)

After the test completes, inspect the artifacts:

```bash
# View test database
sqlite3 db/test-e2e.db

# Check all tables
sqlite> .tables
sqlite> SELECT * FROM sites;
sqlite> SELECT * FROM outreaches;
sqlite> SELECT * FROM keywords;
sqlite> .exit

# View screenshots
ls -la screenshots-test-e2e/

# View actual test outputs
cat test-results-e2e.json
```

All artifacts are preserved in:

- `db/test-e2e.db` - Test database
- `screenshots-test-e2e/` - Screenshot files
- `test-results-e2e.json` - Actual outputs from the test run

These files are gitignored but kept locally for your review.

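If you are recreating this setup in a fresh checkout, the corresponding ignore rules would look something like the following (inferred from the artifact list above; check your repository's actual `.gitignore`):

```
# E2E test artifacts (preserved locally, never committed)
db/test-e2e.db
screenshots-test-e2e/
test-results-e2e.json
```
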
## Debug Output

The test logs detailed debug information at each stage:

```
[2026-01-27T10:30:45.123Z] STAGE 1: ASSETS
[2026-01-27T10:30:45.124Z] Starting screenshot capture...
[2026-01-27T10:30:52.456Z] Assets stage completed
{
  "duration": "7332ms",
  "stats": {
    "processed": 1,
    "succeeded": 1,
    "failed": 0
  }
}
[2026-01-27T10:30:52.457Z] ✅ STAGE 1 COMPLETE: Screenshots captured
```

## Failure Handling

The test **stops at the first failure**. If any stage fails:

1. The error is logged with full details
2. Remaining stages are skipped
3. Test artifacts are still preserved
4. The test exits with a failure status

This fail-fast behavior makes debugging easier by showing exactly where the pipeline broke.

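The fail-fast loop can be sketched as below. `runStage` stands in for the real stage implementations, and the return shape is illustrative, not the test's actual API:

```javascript
// Minimal fail-fast sketch: run stages in order, stop at the first error.
// runStage is a placeholder for the real stage implementations.
async function runPipeline(stages, runStage) {
  const completed = [];
  for (const stage of stages) {
    try {
      await runStage(stage);
      completed.push(stage);
    } catch (err) {
      console.error(`Stage "${stage}" failed:`, err.message);
      return { ok: false, completed, failedAt: stage }; // remaining stages skipped
    }
  }
  return { ok: true, completed, failedAt: null };
}
```
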
## Expected Duration

Typical test execution time:

- **Assets** (screenshot capture): 5-15 seconds
- **Scoring** (AI analysis): 10-30 seconds
- **Rescoring** (if triggered): 10-30 seconds
- **Enrichment** (browsing key pages): 10-30 seconds
- **Proposals** (if triggered): 20-60 seconds
- **Outreach** (actual sending): 5-15 seconds
- **Total**: 60-180 seconds (depending on site complexity and API response times)

## Troubleshooting

### "Missing API keys" error

Ensure these are set in your `.env`:

- `OPENROUTER_API_KEY` - Required for scoring
- `TWILIO_ACCOUNT_SID`, `TWILIO_AUTH_TOKEN`, `TWILIO_PHONE_NUMBER` - Required for SMS
- `RESEND_API_KEY` - Required for email
- `SENDER_NAME`, `SENDER_EMAIL`, `SENDER_PHONE` - Required for all outreach

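A preflight check along these lines catches missing variables before any stage runs. The variable names match the list above (plus the two `TEST_E2E_*` settings); the helper itself is an illustrative sketch, not part of the test suite:

```javascript
// Illustrative preflight check. Variable names come from this README;
// missingEnvVars is a hypothetical helper, not the test's actual code.
const REQUIRED_ENV = [
  'OPENROUTER_API_KEY',
  'TWILIO_ACCOUNT_SID', 'TWILIO_AUTH_TOKEN', 'TWILIO_PHONE_NUMBER',
  'RESEND_API_KEY',
  'SENDER_NAME', 'SENDER_EMAIL', 'SENDER_PHONE',
  'TEST_E2E_URL', 'TEST_E2E_KEYWORD',
];

function missingEnvVars(env = process.env) {
  // Treat unset and whitespace-only values as missing.
  return REQUIRED_ENV.filter((name) => !env[name] || env[name].trim() === '');
}
```
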
### "Playwright browser not found"

Install browsers:

```bash
npx playwright install chromium
```

### "Screenshot capture failed"

- Check that `TEST_E2E_URL` is accessible
- Ensure the website doesn't block automated browsers
- Check for a valid SSL certificate

### "Scoring failed"

- Verify `OPENROUTER_API_KEY` is valid
- Check that your OpenRouter account has credits
- Ensure screenshots were captured successfully

### "Outreach failed - No contact info found"

This is expected if your test site doesn't have visible contact information; the test will skip outreach for that channel. To test outreach:

- Ensure your test site has a visible phone number (for SMS)
- Ensure your test site has a visible email address (for email)
- Ensure your test site has a contact form (for form submission)

## Regression Testing Workflow

1. **First run**: Run the test without expected values
2. **Review outputs**: Check `test-results-e2e.json`
3. **Set expected values**: Edit the test expectations
   - `tests/expected-e2e.json` - Set expected score ranges and HTML patterns
   - `tests/expected-e2e-contacts.json` - Set the expected contact information structure
4. **Subsequent runs**: The test validates against your expected values
5. **Update expectations**: When you change the system's behavior, update the expected values

## Manual Cleanup

Test artifacts are **not** automatically cleaned up. To remove them manually:

```bash
# Remove test database
rm db/test-e2e.db

# Remove test screenshots
rm -rf screenshots-test-e2e/

# Remove test results
rm test-results-e2e.json
```

Or all at once (`-f` keeps the command from failing on files that are already gone):

```bash
rm -rf db/test-e2e.db screenshots-test-e2e/ test-results-e2e.json
```

## What This Test Does NOT Cover

- ❌ SERP scraping (ZenRows API) - Your own URL is used directly
- ❌ Inbound reply processing - Run separately with the inbound tests
- ❌ Multi-site batch processing - This tests a single site only
- ❌ Error retry logic - Tests the happy path only

For these scenarios, use:

- Unit tests: `npm run test:unit`
- Integration tests: `npm run test:integration`
- Full test suite: `npm run test:all`

## Security Notes

- Never commit `tests/expected-e2e.json` with real scoring data
- Never commit `tests/expected-e2e-contacts.json` with real contact information
- Never commit `test-results-e2e.json` (it contains actual data)
- All expected-values and test-results files are gitignored by default
- The test database is gitignored
- Screenshots are gitignored

## Tips for Best Results

1. Use a simple test site with clear contact information
2. Ensure contact details are easily scrapable (visible on the homepage)
3. Use your own phone/email for the test site's contact info
4. Check your phone/email after the test to verify messages arrived
5. Run during off-peak hours to avoid rate limits
6. Keep the test site stable (same content) for consistent results