# Throttle Gates — Stage Backlog Control

## Purpose

The pipeline is a chain of stages, each producing work for the next. Without throttling,
upstream stages (scoring, enrichment, proposals) burn CPU and tokens filling queues that
are already large enough, which delays the money-making stages (reword, outreach).

The three gates ensure: **earlier stages only run when later stages need more work.**

**Scope:** Gates apply to both 333Method and 2Step. `eligible_outreach` is the combined
count from `msgs.messages` across both projects. Both projects use the same gate thresholds.

---

## The Three Gates

### Gate 1 — Outreach queue

**Measured:** `eligible_outreach` — approved email/sms messages across both projects that are not
yet sent (`delivery_status IS NULL`), are in template countries, are past the 3-day per-site
cooldown, and, for GDPR countries, have `gdpr_verified=1`.

**Why sites in cooldown are excluded:** Sites waiting out a cooldown don't need more reword work.
Without this filter, a large batch of recently sent AU/US sites keeps Gate 1 above threshold
indefinitely, grinding reword/proposals to a halt even though the pipeline has nothing to send right now.

**Threshold:** `STAGE_THROTTLE_MULTIPLIER × max(REWORD_EMAIL_BATCH, REWORD_SMS_BATCH)`
Default: `3 × max(10, 50) = 150`

**When fired (333Method):** pause `reword_*`, `proofread`, `proposals_*`, `enrich_sites`, `score_semantic`

**When fired (2Step):** pause `2step:proposals_*`, `2step:video_scenes`, `2step:enrich`
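
The Gate 1 measurement above can be read as a per-message predicate. The sketch below is one possible TypeScript rendering, not the project's actual query: only `delivery_status` and `gdpr_verified` come from this document, while the `OutreachMessage` shape, the `lastSentToSiteAt` field, and the two country sets passed in are hypothetical.

```typescript
// Hypothetical row shape; only delivery_status and gdpr_verified are named in this doc.
interface OutreachMessage {
  approved: boolean;
  channel: "email" | "sms";
  countryCode: string;
  delivery_status: string | null;
  gdpr_verified: 0 | 1;
  lastSentToSiteAt: Date | null; // last send to the same site; null if never contacted
}

const COOLDOWN_MS = 3 * 24 * 60 * 60 * 1000; // 3-day per-site cooldown

function isEligibleOutreach(
  msg: OutreachMessage,
  now: Date,
  templateCountries: Set<string>, // assumed: countries that have an email.json template
  gdprCountries: Set<string>,     // assumed: countries that require gdpr_verified=1
): boolean {
  // Approved and not yet sent or failed.
  if (!msg.approved || msg.delivery_status !== null) return false;
  // No template means no outreach capability.
  if (!templateCountries.has(msg.countryCode)) return false;
  // Sites still inside the cooldown window are not counted.
  if (
    msg.lastSentToSiteAt !== null &&
    now.getTime() - msg.lastSentToSiteAt.getTime() < COOLDOWN_MS
  ) {
    return false;
  }
  // GDPR countries additionally need verification.
  if (gdprCountries.has(msg.countryCode) && msg.gdpr_verified !== 1) return false;
  return true;
}
```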

---

### Gate 2 — Proposals drafted queue

**Measured:** `actionable_proposals` — distinct sites at `proposals_drafted` status with at
least one unreworded (`reworded_at IS NULL`), unsent (`sent_at IS NULL`), unfailed
(`delivery_status IS NULL`) email/sms message in a template country (has `email.json`).

**Threshold:** `STAGE_THROTTLE_MULTIPLIER × (PROPOSALS_EMAIL_BATCH + PROPOSALS_SMS_BATCH)`
Default: `3 × (15 + 15) = 90`

**When fired (333Method):** pause `proposals_*`, `enrich_sites`, `score_semantic`

**When fired (2Step):** pause `2step:proposals_*`, `2step:enrich`. 2Step is measured separately
from 333Method: sites at `proposals_drafted` with at least one unsent approved message.

---

### Gate 3 — Enriched queue

**Measured:** `actionable_enriched` — sites at `enriched` status with a score below `LOW_SCORE_CUTOFF`
(i.e. sites that will receive proposals), in an active country.

**Threshold:** `STAGE_THROTTLE_MULTIPLIER × ENRICH_SITES_BATCH`
Default: `3 × 5 = 15`

**When fired (333Method):** pause `enrich_sites`, `score_semantic`

**When fired (2Step):** pause `2step:enrich`. 2Step is measured separately: sites at `enriched`
waiting for video creation, with no score cutoff because 2Step enriches all sites regardless of score.
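
As a compact restatement of the three "When fired" lists, the following sketch maps each gate to the stage patterns it pauses, per project. The `PAUSED` table and the `isPaused` helper are illustrative names, and the `Gate[]` argument assumes the caller has already worked out which gates are firing for that project (2Step measures Gates 2 and 3 separately from 333Method).

```typescript
type Gate = 1 | 2 | 3;
type Project = "333Method" | "2Step";

// Stage patterns copied from the "When fired" lists; a trailing * is a prefix wildcard.
const PAUSED: Record<Project, Record<Gate, string[]>> = {
  "333Method": {
    1: ["reword_*", "proofread", "proposals_*", "enrich_sites", "score_semantic"],
    2: ["proposals_*", "enrich_sites", "score_semantic"],
    3: ["enrich_sites", "score_semantic"],
  },
  "2Step": {
    1: ["2step:proposals_*", "2step:video_scenes", "2step:enrich"],
    2: ["2step:proposals_*", "2step:enrich"],
    3: ["2step:enrich"],
  },
};

function matches(pattern: string, stage: string): boolean {
  return pattern.endsWith("*")
    ? stage.startsWith(pattern.slice(0, -1))
    : stage === pattern;
}

// True if any firing gate lists a pattern that matches the stage.
function isPaused(project: Project, stage: string, firing: Gate[]): boolean {
  return firing.some((gate) => PAUSED[project][gate].some((p) => matches(p, stage)));
}
```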

---

## Filter Inheritance

Every gate measures only **actionable** items — filtered by the same constraints that
would block the item later in the pipeline:

- **Blocked countries** (`OUTREACH_BLOCKED_COUNTRIES`) excluded at every gate
- **Skipped channels** (`OUTREACH_SKIP_METHODS`) excluded at Gates 1 and 2
- **Score cutoff** (`LOW_SCORE_CUTOFF=82`) applied at 333Method Gates 2 and 3 — high-scoring
  sites skip proposals. **Not applied to 2Step** — all sites receive a video regardless of score.
- **Failed delivery** (`delivery_status IS NOT NULL`) excluded at Gate 2 — already-failed
  messages can't be reworded without a manual reset
- **Template countries** (`data/templates/{CC}/email.json` exists) applied at Gates 1 and 2 —
  no template = no outreach capability

For example, if all approved outreach is for GDPR-blocked countries, the Gate 1 count stays at 0
and the system correctly continues generating proposals for active markets.
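
To make the closing example concrete, here is a minimal sketch of the country, channel, and template filters applied to a queue. `QueuedMessage`, `parseList`, and `countActionable` are hypothetical names, and the real gates may well apply these filters in the database rather than in application code. When every queued message targets a blocked country, the count is 0 and Gate 1 stays open.

```typescript
// Hypothetical minimal shape: just the two fields the filters look at.
interface QueuedMessage {
  countryCode: string;
  method: string; // e.g. "email", "sms", "form"
}

// Parse a comma-separated setting such as OUTREACH_BLOCKED_COUNTRIES.
function parseList(value: string): Set<string> {
  return new Set(
    value
      .split(",")
      .map((item) => item.trim())
      .filter((item) => item.length > 0),
  );
}

// Count only messages that could actually be sent later in the pipeline.
function countActionable(
  messages: QueuedMessage[],
  blockedCountries: Set<string>,
  skippedMethods: Set<string>,
  templateCountries: Set<string>,
): number {
  return messages.filter(
    (m) =>
      !blockedCountries.has(m.countryCode) &&
      !skippedMethods.has(m.method) &&
      templateCountries.has(m.countryCode),
  ).length;
}
```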

---

## Actable Queue Definitions (`npm run status` `actable` column)

### 333Method

The `actable` column in `npm run status` shows how many items at each stage can actually
flow to a sent outreach. Each stage applies progressively stricter filters:

| Stage                                               | Actable definition                                                                                                             |
| --------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------ |
| `semantic_scored` / `vision_scored` / `prog_scored` | In `ENGLISH_ONLY_MARKETS`, not in `OUTREACH_BLOCKED_COUNTRIES`                                                                 |
| `enriched_regex` / `enriched_llm`                   | Above + `score < LOW_SCORE_CUTOFF` (or unscored)                                                                               |
| `enriched`                                          | Above + in template country (has `email.json`) — Gate 3                                                                        |
| `proposals_drafted`                                 | `enriched` filter + has unreworded+unsent+unfailed email/sms message in template country — Gate 2                              |
| `outreach` (approved queue)                         | `proposals_drafted` filter + `delivery_status IS NULL` + 3-day cooldown passed + `gdpr_verified=1` for GDPR countries — Gate 1 |

**Note:** SMS sends are also gated by business hours (8am–9pm in the recipient's timezone), so
actable SMS messages may not send immediately outside that window. Email has no time restriction.

**Why `eligibleCodes` (template countries) not just `activeCodes` (non-blocked):**
Gates 2 and 3 use template countries rather than all unblocked countries. A site in a
non-template country (e.g. JP, KR, MX) can never receive outreach, so rewording or
proposing for those sites wastes tokens.

**English-only filter for scoring stages:**
`ENGLISH_ONLY_MARKETS` restricts which sites are enriched. There is no point scoring
sites in non-English markets (DE, FR, etc.) that will never progress to proposals. The actable
count for scoring stages reflects this upstream filter.

### 2Step

2Step has no scoring stage. Pipeline flows: `found → reviews_downloaded → enriched → video_created → proposals_drafted → outreach_sent`.

| Stage                | Actable definition                                                                     |
| -------------------- | -------------------------------------------------------------------------------------- |
| `reviews_downloaded` | Not in `OUTREACH_BLOCKED_COUNTRIES`, in template country                               |
| `enriched`           | Above — Gate 3 (all enriched sites get a video; no score cutoff)                       |
| `video_created`      | Above + `video_url` set + in template country                                          |
| `proposals_drafted`  | Above + has unsent+unfailed approved email/sms message in template country — Gate 2    |
| `outreach_sent`      | Above + `delivery_status IS NULL` + 72h cooldown passed — Gate 1 (shared queue)        |

---

## Normal Healthy State

Gates 2 and 3 are usually firing. Gate 1 fires only once outreach has built up a buffer.

```
Backlog: eligible_outreach=52 actionable_proposals=6371 actionable_enriched=41
proposals_email: SKIP — proposals_drafted=6371 > threshold=90
enrich_sites: SKIP — proposals_drafted=6371 > threshold=90
score_semantic: SKIP — proposals_drafted=6371 > threshold=90
reword_email: processing 10 items     ← reword runs: Gate 1 is open (52 < 150)
reword_sms: processing 50 items
```

The pipeline drains toward sends. With Gates 2 and 3 firing and Gate 1 open, only `reword_*`,
`proofread`, `classify_replies`, `extract_names`, `reply_responses`, and `oversee` run. If Gate 1
also fires, `reword_*` and `proofread` pause as well.

---

## Configuration

```env
STAGE_THROTTLE_MULTIPLIER=3   # multiplier for all gate thresholds (default 3, consider 5)
REWORD_EMAIL_BATCH=10
REWORD_SMS_BATCH=50
PROPOSALS_EMAIL_BATCH=15
PROPOSALS_SMS_BATCH=15
ENRICH_SITES_BATCH=5
LOW_SCORE_CUTOFF=82
OUTREACH_BLOCKED_COUNTRIES=DE,FR,IT,ES,NL,BE,AT,SE,DK,NO,PL
OUTREACH_SKIP_METHODS=form,x,linkedin
```

Restart the orchestrator after changing `STAGE_THROTTLE_MULTIPLIER` — it's read once at startup.
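
The three default thresholds can be reproduced from these values. The sketch below assumes the settings arrive as a plain string map (for example a copy of `process.env`) and that the values listed above act as fallbacks when a key is missing; `gateThresholds` and `intFromEnv` are illustrative names, not the orchestrator's API.

```typescript
type Env = Record<string, string | undefined>;

// Read an integer setting, falling back when the key is missing or not numeric.
function intFromEnv(env: Env, key: string, fallback: number): number {
  const parsed = Number.parseInt(env[key] ?? "", 10);
  return Number.isNaN(parsed) ? fallback : parsed;
}

// Thresholds as defined in the three gate sections.
function gateThresholds(env: Env): { gate1: number; gate2: number; gate3: number } {
  const multiplier = intFromEnv(env, "STAGE_THROTTLE_MULTIPLIER", 3);
  const rewordEmail = intFromEnv(env, "REWORD_EMAIL_BATCH", 10);
  const rewordSms = intFromEnv(env, "REWORD_SMS_BATCH", 50);
  const proposalsEmail = intFromEnv(env, "PROPOSALS_EMAIL_BATCH", 15);
  const proposalsSms = intFromEnv(env, "PROPOSALS_SMS_BATCH", 15);
  const enrichSites = intFromEnv(env, "ENRICH_SITES_BATCH", 5);
  return {
    gate1: multiplier * Math.max(rewordEmail, rewordSms),
    gate2: multiplier * (proposalsEmail + proposalsSms),
    gate3: multiplier * enrichSites,
  };
}
```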

---

## Monitoring

Gate health is visible in two places:

**Orchestrator log:**

```bash
# -E enables extended regex so the | alternation works without GNU-specific escaping
grep -E "Backlog:|SKIP —" "logs/orchestrator-$(date +%Y-%m-%d).log" | tail -20
```

**`npm run status`:** the `actable` column shows the actionable queue for each gated stage.
A green number well below the threshold means the gate is open. A number above the threshold
means the gate is firing.
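
The `Backlog:` line can also be checked programmatically. This sketch assumes the exact line format shown in the Normal Healthy State example, the default thresholds, and that a gate fires when its count is strictly greater than its threshold (as the `>` in the `SKIP` lines suggests); `parseBacklog` and `firingGates` are hypothetical helpers.

```typescript
interface Backlog {
  eligible_outreach: number;
  actionable_proposals: number;
  actionable_enriched: number;
}

// Extract the three counts from a "Backlog:" log line; null if the line does not match.
function parseBacklog(line: string): Backlog | null {
  const m =
    /Backlog: eligible_outreach=(\d+) actionable_proposals=(\d+) actionable_enriched=(\d+)/.exec(line);
  if (m === null) return null;
  return {
    eligible_outreach: Number(m[1]),
    actionable_proposals: Number(m[2]),
    actionable_enriched: Number(m[3]),
  };
}

// Assumed firing rule: count strictly greater than threshold.
function firingGates(
  backlog: Backlog,
  thresholds = { gate1: 150, gate2: 90, gate3: 15 },
): number[] {
  const firing: number[] = [];
  if (backlog.eligible_outreach > thresholds.gate1) firing.push(1);
  if (backlog.actionable_proposals > thresholds.gate2) firing.push(2);
  if (backlog.actionable_enriched > thresholds.gate3) firing.push(3);
  return firing;
}
```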

---

## Serps Exception

The `serps` stage is **exempt from all gates**. ZenRows is a paid daily quota, so unused
credits are wasted. Serps runs regardless of downstream backlog.

The 2Step `reviews` stage (Outscraper) is similarly exempt: API credits are pre-purchased
per keyword search, not per use. Reviews run regardless of downstream backlog.