throttle-gates.md
# Throttle Gates — Stage Backlog Control

## Purpose

The pipeline has many stages that produce work for the next stage. Without throttling,
upstream stages (scoring, enrichment, proposals) burn CPU and tokens filling queues that
are already large enough — delaying money-making stages (reword, outreach).

The three gates ensure: **earlier stages only run when later stages need more work.**

**Scope:** Gates apply to both 333Method and 2Step. `eligible_outreach` is the combined
count from `msgs.messages` across both projects. Gate thresholds are the same.

---

## The Three Gates

### Gate 1 — Outreach queue

**Measured:** `eligible_outreach` — approved email/sms messages across both projects, not yet sent
(`delivery_status IS NULL`), in template countries, past the 3-day per-site cooldown, and
`gdpr_verified=1` for GDPR countries.

**Why cooldown is excluded:** Sites waiting out a cooldown don't need more reword work. Without
this filter, a large batch of recently-sent AU/US sites keeps Gate 1 above threshold indefinitely,
grinding reword/proposals to a halt even though the pipeline has nothing to send right now.

**Threshold:** `STAGE_THROTTLE_MULTIPLIER × max(REWORD_EMAIL_BATCH, REWORD_SMS_BATCH)`
Default: `3 × 50 = 150`

**When fired (333Method):** pause `reword_*`, `proofread`, `proposals_*`, `enrich_sites`, `score_semantic`

**When fired (2Step):** pause `2step:proposals_*`, `2step:video_scenes`, `2step:enrich`

---

### Gate 2 — Proposals drafted queue

**Measured:** `actionable_proposals` — distinct sites at `proposals_drafted` status with at
least one unreworded (`reworded_at IS NULL`), unsent (`sent_at IS NULL`), unfailed
(`delivery_status IS NULL`) email/sms message in a template country (has `email.json`).

**Threshold:** `STAGE_THROTTLE_MULTIPLIER × (PROPOSALS_EMAIL_BATCH + PROPOSALS_SMS_BATCH)`
Default: `3 × 30 = 90`

**When fired (333Method):** pause `proposals_*`, `enrich_sites`, `score_semantic`

**When fired (2Step):** pause `2step:proposals_*`, `2step:enrich` (measured separately from 333Method:
sites at `proposals_drafted` with ≥1 unsent approved message)

---

### Gate 3 — Enriched queue

**Measured:** `actionable_enriched` — sites at `enriched` status, score below `LOW_SCORE_CUTOFF`
(i.e. will receive proposals), in an active country.

**Threshold:** `STAGE_THROTTLE_MULTIPLIER × ENRICH_SITES_BATCH`
Default: `3 × 5 = 15`

**When fired (333Method):** pause `enrich_sites`, `score_semantic`

**When fired (2Step):** pause `2step:enrich` (sites at `enriched` waiting for video creation —
measured separately: no score cutoff since 2Step enriches all sites regardless of score)

---

## Filter Inheritance

Every gate measures only **actionable** items — filtered by the same constraints that
would block the item later in the pipeline:

- **Blocked countries** (`OUTREACH_BLOCKED_COUNTRIES`) excluded at every gate
- **Skipped channels** (`OUTREACH_SKIP_METHODS`) excluded at Gates 1 and 2
- **Score cutoff** (`LOW_SCORE_CUTOFF=82`) applied at 333Method Gates 2 and 3 — high-scoring
  sites skip proposals. **Not applied to 2Step** — all sites receive a video regardless of score.
- **Failed delivery** (`delivery_status IS NOT NULL`) excluded at Gate 2 — already-failed
  messages can't be reworded without a manual reset
- **Template countries** (`data/templates/{CC}/email.json` exists) applied at Gates 1 and 2 —
  no template = no outreach capability

This means: if all approved outreach is for GDPR-blocked countries, Gate 1 stays at 0,
and the system correctly continues generating proposals for active markets.
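
The filter bullets above can be read as a single per-gate predicate. The following is a minimal Python sketch of that reading, not the pipeline's actual code: the `Item` class, its fields, and the function names are hypothetical, the constants are copied from the Configuration section, and Gate 1's cooldown and `gdpr_verified` checks are left out because they are not part of the inherited filters listed here.

```python
from dataclasses import dataclass
from typing import Optional

# Values copied from the Configuration section of this document.
OUTREACH_BLOCKED_COUNTRIES = {"DE", "FR", "IT", "ES", "NL", "BE", "AT", "SE", "DK", "NO", "PL"}
OUTREACH_SKIP_METHODS = {"form", "x", "linkedin"}
LOW_SCORE_CUTOFF = 82


@dataclass
class Item:
    """Hypothetical flattened view of a site/message pair (not the real schema)."""
    country: str
    channel: Optional[str] = None          # None for a site with no message yet
    score: Optional[float] = None          # None means unscored
    delivery_status: Optional[str] = None  # None means no delivery attempt recorded
    has_template: bool = True              # data/templates/{CC}/email.json exists


def is_actionable(item: Item, gate: int, project: str = "333Method") -> bool:
    """Apply the Filter Inheritance bullets for gate 1, 2 or 3."""
    if item.country in OUTREACH_BLOCKED_COUNTRIES:      # blocked countries: every gate
        return False
    if gate in (1, 2):
        if item.channel in OUTREACH_SKIP_METHODS:       # skipped channels: Gates 1 and 2
            return False
        if not item.has_template:                       # template countries: Gates 1 and 2
            return False
    if gate == 2 and item.delivery_status is not None:  # failed delivery: Gate 2
        return False
    if gate in (2, 3) and project == "333Method":
        # High-scoring sites skip proposals; unscored sites stay actionable.
        if item.score is not None and item.score >= LOW_SCORE_CUTOFF:
            return False
    return True


def count_actionable(items, gate: int, project: str = "333Method") -> int:
    """Count the items that pass every filter inherited by the given gate."""
    return sum(is_actionable(i, gate, project) for i in items)
```

Under these assumptions, a queue that contains only messages for blocked countries counts as zero at Gate 1, which is the case described in the paragraph above.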

---

## Actable Queue Definitions (npm run status `actable` column)

### 333Method

The `actable` column in `npm run status` shows how many items at each stage can actually
flow to a sent outreach. Each stage applies progressively stricter filters:

| Stage | Actable definition |
| --- | --- |
| `semantic_scored` / `vision_scored` / `prog_scored` | In `ENGLISH_ONLY_MARKETS`, not in `OUTREACH_BLOCKED_COUNTRIES` |
| `enriched_regex` / `enriched_llm` | Above + `score < LOW_SCORE_CUTOFF` (or unscored) |
| `enriched` | Above + in template country (has `email.json`) — Gate 3 |
| `proposals_drafted` | `enriched` filter + has unreworded+unsent+unfailed email/sms message in template country — Gate 2 |
| `outreach` (approved queue) | `proposals_drafted` filter + `delivery_status IS NULL` + 3-day cooldown passed + `gdpr_verified=1` for GDPR countries — Gate 1. **Note:** SMS sends are also gated by business hours (8am–9pm in recipient timezone) — actable count may not send immediately if outside that window. Email has no time restriction. |

**Why `eligibleCodes` (template countries) not just `activeCodes` (non-blocked):**
Gates 2 and 3 use template countries rather than all unblocked countries. A site in a
non-template country (e.g. JP, KR, MX) can never receive outreach — rewording or
proposing for those sites wastes tokens.

**English-only filter for scoring stages:**
`ENGLISH_ONLY_MARKETS` restricts which sites are enriched. There is no point scoring
non-English sites (DE, FR, etc.) that will never progress to proposals.
The actable count for scoring stages reflects this upstream filter.

### 2Step

2Step has no scoring stage. Pipeline flows: `found → reviews_downloaded → enriched → video_created → proposals_drafted → outreach_sent`.

| Stage | Actable definition |
| --- | --- |
| `reviews_downloaded` | Not in `OUTREACH_BLOCKED_COUNTRIES`, in template country |
| `enriched` | Above — Gate 3 (all enriched sites get a video; no score cutoff) |
| `video_created` | Above + video_url set + in template country |
| `proposals_drafted` | Above + has unsent+unfailed approved email/sms message in template country — Gate 2 |
| `outreach_sent` | Above + `delivery_status IS NULL` + 72h cooldown passed — Gate 1 (shared queue) |

---

## Normal Healthy State

Gates 2 and 3 are usually firing. Gate 1 fires when outreach has built up a buffer.

```
Backlog: eligible_outreach=52 actionable_proposals=6371 actionable_enriched=41
proposals_email: SKIP — proposals_drafted=6371 > threshold=90
enrich_sites: SKIP — proposals_drafted=6371 > threshold=90
score_semantic: SKIP — proposals_drafted=6371 > threshold=90
reword_email: processing 10 items ← reword runs, gates don't block it
reword_sms: processing 50 items
```

The pipeline drains toward sends. Only `reword_*`, `proofread`, `classify_replies`,
`extract_names`, `reply_responses`, and `oversee` run when all gates are firing.
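
The skip decisions in the log above follow from the default thresholds. The Python sketch below reproduces them, assuming a gate fires when its count is strictly greater than its threshold: the log prints `>`, but behaviour at exact equality is not stated in this document. The function names are hypothetical, only the 333Method pause lists are modelled, and the real orchestrator may be structured differently.

```python
# Defaults copied from the Configuration section.
CONFIG = {
    "STAGE_THROTTLE_MULTIPLIER": 3,
    "REWORD_EMAIL_BATCH": 10,
    "REWORD_SMS_BATCH": 50,
    "PROPOSALS_EMAIL_BATCH": 15,
    "PROPOSALS_SMS_BATCH": 15,
    "ENRICH_SITES_BATCH": 5,
}

# 333Method stages paused by each gate, as listed under "The Three Gates".
PAUSED_BY_GATE = {
    1: ["reword_*", "proofread", "proposals_*", "enrich_sites", "score_semantic"],
    2: ["proposals_*", "enrich_sites", "score_semantic"],
    3: ["enrich_sites", "score_semantic"],
}


def thresholds(cfg=CONFIG):
    """Compute the three gate thresholds from the batch sizes."""
    m = cfg["STAGE_THROTTLE_MULTIPLIER"]
    return {
        1: m * max(cfg["REWORD_EMAIL_BATCH"], cfg["REWORD_SMS_BATCH"]),
        2: m * (cfg["PROPOSALS_EMAIL_BATCH"] + cfg["PROPOSALS_SMS_BATCH"]),
        3: m * cfg["ENRICH_SITES_BATCH"],
    }


def firing_gates(backlog, cfg=CONFIG):
    """Return gate numbers whose backlog count exceeds its threshold.

    `backlog` maps gate number to count: 1 is eligible_outreach,
    2 is actionable_proposals, 3 is actionable_enriched.
    """
    limits = thresholds(cfg)
    return [gate for gate in (1, 2, 3) if backlog[gate] > limits[gate]]


def paused_stages(backlog, cfg=CONFIG):
    """Union of the stages paused by every firing gate."""
    paused = set()
    for gate in firing_gates(backlog, cfg):
        paused.update(PAUSED_BY_GATE[gate])
    return paused


if __name__ == "__main__":
    # Counts taken from the "Backlog:" line in the log above.
    backlog = {1: 52, 2: 6371, 3: 41}
    print(thresholds())
    print(firing_gates(backlog))
    print(sorted(paused_stages(backlog)))
```

With the logged counts, Gates 2 and 3 fire (6371 > 90 and 41 > 15) while Gate 1 stays open (52 is below 150), so `reword_*` is absent from the paused set, matching the `reword_email` and `reword_sms` lines in the log.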

---

## Configuration

```env
STAGE_THROTTLE_MULTIPLIER=3 # multiplier for all gate thresholds (default 3, consider 5)
REWORD_EMAIL_BATCH=10
REWORD_SMS_BATCH=50
PROPOSALS_EMAIL_BATCH=15
PROPOSALS_SMS_BATCH=15
ENRICH_SITES_BATCH=5
LOW_SCORE_CUTOFF=82
OUTREACH_BLOCKED_COUNTRIES=DE,FR,IT,ES,NL,BE,AT,SE,DK,NO,PL
OUTREACH_SKIP_METHODS=form,x,linkedin
```

Restart the orchestrator after changing `STAGE_THROTTLE_MULTIPLIER` — it's read once at startup.

---

## Monitoring

Gate health is visible in two places:

**Orchestrator log:**

```
grep "Backlog:\|SKIP —" logs/orchestrator-$(date +%Y-%m-%d).log | tail -20
```

**`npm run status`:** the `actable` column shows the actionable queue for each gated stage.
A green number well below threshold = gate open. A large number = gate likely firing.

---

## Serps Exception

The `serps` stage is **exempt from all gates**. ZenRows is a paid daily quota — unused
credits are wasted. Serps runs regardless of downstream backlog.

The 2Step `reviews` stage (Outscraper) is similarly exempt — API credits are pre-purchased
per keyword search, not per-use. Reviews run regardless of downstream backlog.