---
title: Credential Rotation Playbook
category: security
last_verified: 2026-03-18
related_files:
  - .env.secrets.example
  - .env
  - docs/plans/distributed-agent-system.md
  - docs/plans/ironclaw-setup.md
tags: [security, credentials, rotation, ops]
status: active
---

# Credential Rotation Playbook

Solo-operator playbook for rotating all credentials used by the 333Method pipeline. Designed for a NixOS host + Docker (Claude Code) environment with SQLite and Cloudflare Workers.

## Secrets Storage Rules

- **Never commit secrets to git.** `.env.secrets` is gitignored. If you suspect a secret was committed, treat it as compromised and rotate immediately.
- **Production secrets** are managed via SOPS (`333Method-infra/secrets/production.yaml`). After rotating any credential, update SOPS as well.
- **Back up secrets** in a password manager (Bitwarden, 1Password, KeePassXC). Store each credential with: service name, creation date, rotation date, and the dashboard URL for regeneration.
- **Shared secrets** (used by both pipeline and web infra): `AUDITANDFIX_WORKER_SECRET`, `UNSUBSCRIBE_SECRET`, `RESEND_WEBHOOK_SECRET`. These require coordinated updates across `.env.secrets` AND Cloudflare Worker env vars / Hostinger `.htaccess`.
- **2Step shares secrets** via `~/code/2Step/src/utils/load-env.js` loading from 333Method. Rotating Twilio or Resend keys affects 2Step too.

---

## Rotation Schedule

| Credential                  | Rotation interval | Trigger / rationale                  |
| --------------------------- | ----------------- | ------------------------------------ |
| `OPENROUTER_API_KEY`        | Quarterly         | Billing cycle or suspected leak      |
| `ZENROWS_API_KEY`           | Annually          | Low risk (read-only scraping)        |
| `TWILIO_AUTH_TOKEN`         | Quarterly         | Financial risk (SMS costs money)     |
| `RESEND_API_KEY`            | Quarterly         | Reputation risk (email sending)      |
| `RESEND_WEBHOOK_SECRET`     | Annually          | Low risk (inbound verification only) |
| `PAYPAL_CLIENT_SECRET`      | Quarterly         | Financial risk (payment processing)  |
| `GOOGLE_SHEETS_PRIVATE_KEY` | Annually          | Low risk (internal reporting)        |
| `AUDITANDFIX_WORKER_SECRET` | Semi-annually     | Shared secret, coordinated rotation  |
| `UNSUBSCRIBE_SECRET`        | Annually          | Low risk (HMAC signing)              |
| `DATAFORSEO_PASSWORD`       | Annually          | Low risk                             |
| `ZEROBOUNCE_API_KEY`        | Annually          | Low risk                             |
| `FIXER_API_KEY`             | Annually          | Low risk (free tier)                 |
| `PEXELS_API_KEY`            | Annually          | Low risk (2Step, image search)       |
| SSH keys                    | Annually          | Or on any suspected host compromise  |
| `NOPECHA_API_KEY`           | Never (free tier) | Only on compromise                   |

**Calendar reminder:** Set a quarterly reminder (1st of Jan/Apr/Jul/Oct) to run through the quarterly rotations. Rehearse the full restore path at least once per quarter (see "Testing the Restore Path").

---

## Per-Service Rotation Steps

### OPENROUTER_API_KEY

1. **Generate:** [openrouter.ai/keys](https://openrouter.ai/keys) -- create new key before revoking old one.
2. **Update:** `~/.../333Method/.env.secrets` -- replace `OPENROUTER_API_KEY=...`
3. **Verify:** `node -e "require('./src/utils/load-env')(); const k=process.env.OPENROUTER_API_KEY; console.log('Key starts with:', k?.slice(0,8));"` then run a single scoring pass: `SCORE_SITES_BATCH=1 node src/score-sites.js`
4. **Revoke old key:** Back in the OpenRouter dashboard, delete the previous key.
5. **Downtime risk:** None if you create-then-revoke. Pipeline stages will fail with 401 if the key is invalid.
6. **SOPS:** Update `333Method-infra/secrets/production.yaml`.

### ZENROWS_API_KEY

1. **Generate:** [app.zenrows.com](https://app.zenrows.com/) -- API Keys section.
2. **Update:** `.env.secrets`.
3. **Verify:** `SERP_BATCH=1 node src/serp-scraper.js` -- confirm one SERP fetch succeeds.
4. **Revoke old key** in ZenRows dashboard.
5. **Downtime risk:** None. SERP scraping can pause briefly.

### TWILIO_AUTH_TOKEN (and SID)

1. **Generate:** [console.twilio.com](https://console.twilio.com/) -- Account > API keys & tokens > Auth tokens. You can create a secondary auth token before revoking the primary.
2. **Update:** `.env.secrets` -- `TWILIO_ACCOUNT_SID` (rarely changes) and `TWILIO_AUTH_TOKEN`.
3. **Also update:** Twilio test credentials if rotating those too.
4. **Verify:** `node -e "require('./src/utils/load-env')(); const twilio = require('twilio'); const c = twilio(process.env.TWILIO_ACCOUNT_SID, process.env.TWILIO_AUTH_TOKEN); c.api.accounts(process.env.TWILIO_ACCOUNT_SID).fetch().then(a => console.log('OK:', a.friendlyName)).catch(e => console.error('FAIL:', e.message));"`
5. **Revoke:** Promote new token to primary, revoke old.
6. **Downtime risk:** Minimal. Twilio supports secondary tokens for zero-downtime rotation. Inbound SMS webhook verification will fail if the token is wrong -- check webhook logs.

### RESEND_API_KEY

1. **Generate:** [resend.com/api-keys](https://resend.com/api-keys) -- create new key.
2. **Update:** `.env.secrets` -- `RESEND_API_KEY`.
3. **Verify:** Confirm the key loads with `OUTREACH_DRY_RUN=true node -e "require('./src/utils/load-env')(); /* check key loads */"` (this alone sends nothing), then trigger one email send to your own address to exercise the key.
4. **Revoke old key** in Resend dashboard.
5. **Downtime risk:** None if you create-then-revoke. If the old key is revoked before the new one is loaded, email sends will fail with 401 during the gap, so keep it short.

### RESEND_WEBHOOK_SECRET

1. **Generate:** [resend.com/webhooks](https://resend.com/webhooks) -- edit webhook, regenerate signing secret.
2. **Update:** `.env.secrets` -- `RESEND_WEBHOOK_SECRET=whsec_...`
3. **Also update:** If using a Cloudflare Worker for webhook relay (`EMAIL_EVENTS_WORKER_URL`), update the worker's environment variable too: `cd workers/resend-webhook && wrangler secret put RESEND_WEBHOOK_SECRET`.
4. **Verify:** Trigger a test event from Resend dashboard; check webhook logs for successful signature validation.
5. **Downtime risk:** Inbound email events (bounces, complaints) will be rejected until both sides match. No impact on outbound sending.

### PAYPAL_CLIENT_ID / PAYPAL_CLIENT_SECRET

1. **Generate:** [developer.paypal.com/dashboard/applications](https://developer.paypal.com/dashboard/applications/) -- create new REST API app (or regenerate secret on existing app).
2. **Update:** `.env.secrets` -- both `PAYPAL_CLIENT_ID` and `PAYPAL_CLIENT_SECRET`.
3. **Also update:** Cloudflare Worker env vars if the worker uses these directly: `cd workers/paypal-webhook && wrangler secret put PAYPAL_CLIENT_SECRET`.
4. **Verify:** `node -e "require('./src/utils/load-env')(); const { getAccessToken } = require('./src/payment/paypal-client'); getAccessToken().then(t => console.log('OK: token length', t.length)).catch(e => console.error('FAIL:', e.message));"`
5. **Downtime risk:** Payment processing will fail during rotation. Do this during off-hours (Sydney late night). Webhook verification may also break if the worker secret is out of sync.

### GOOGLE_SHEETS_PRIVATE_KEY / CLIENT_EMAIL

1. **Generate:** [console.cloud.google.com](https://console.cloud.google.com/) -- IAM & Admin > Service Accounts > your account > Keys > Add Key > JSON.
2. **Update:** `.env.secrets` -- extract `client_email` and `private_key` from the downloaded JSON. Preserve `\n` characters in the private key value.
3. **Verify:** `node -e "require('./src/utils/load-env')(); const { getAuthClient } = require('./src/utils/google-sheets'); getAuthClient().then(() => console.log('OK')).catch(e => console.error('FAIL:', e.message));"`
4. **Revoke:** Delete old key in GCP console.
5. **Downtime risk:** None. Sheets reporting is non-critical.

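The `\n` handling in step 2 of the Google Sheets rotation is a common place for a key paste to go wrong. A minimal sketch, assuming the key sits on one line in `.env.secrets` with literal `\n` sequences; `normalizePrivateKey` is a hypothetical helper for illustration, not necessarily what `src/utils/google-sheets` does:

```javascript
// Hypothetical helper: converts literal "\n" sequences (backslash + n), as
// stored on a single line in .env.secrets, into real newlines so the PEM parses.
function normalizePrivateKey(raw) {
  if (typeof raw !== 'string' || raw.length === 0) {
    throw new Error('GOOGLE_SHEETS_PRIVATE_KEY is empty or missing');
  }
  const key = raw.replace(/\\n/g, '\n').trim();
  // Service-account JSON keys are PKCS#8 PEM blocks with this header.
  if (!key.startsWith('-----BEGIN PRIVATE KEY-----')) {
    throw new Error('Value does not look like a PEM private key');
  }
  return key + '\n';
}
```

If the verify step fails with a PEM or decoder error right after rotation, a check like this can help tell a mangled paste apart from a genuinely rejected key.
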
### AUDITANDFIX_WORKER_SECRET

This is a shared secret between the pipeline (`reply-processor.js` POSTs to `api.php`) and the Hostinger PHP backend. Both sides must match.

1. **Generate:** `openssl rand -hex 32`
2. **Update ALL locations:**
   - `.env.secrets` -- `AUDITANDFIX_WORKER_SECRET=<new value>`
   - Hostinger `.htaccess` -- `SetEnv AUDITANDFIX_WORKER_SECRET <new value>` (edit via Hostinger File Manager or SSH)
   - Any Cloudflare Workers that use this secret: `wrangler secret put AUDITANDFIX_WORKER_SECRET`
3. **Verify:** Trigger a prefill store from the pipeline and confirm the PHP endpoint accepts it (check HTTP response code).
4. **Downtime risk:** Prefill short URLs (`/o/{site_id}`) will return 403 if the secrets are mismatched. Update Hostinger first, then pipeline, to minimize the gap.

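Because this value is pasted into three places, a truncated or whitespace-padded copy is an easy way to end up with mismatched secrets. The sketch below shows a Node equivalent of `openssl rand -hex 32` plus a format check; both function names are hypothetical and not part of the pipeline:

```javascript
const { randomBytes } = require('node:crypto');

// Equivalent of `openssl rand -hex 32`: 32 random bytes, hex-encoded (64 chars).
function generateSharedSecret(bytes = 32) {
  return randomBytes(bytes).toString('hex');
}

// Hypothetical sanity check to run against each pasted copy:
// catches truncated values and stray whitespace.
function looksLikeSharedSecret(value) {
  return /^[0-9a-f]{64}$/.test(value);
}
```

For example, `looksLikeSharedSecret(process.env.AUDITANDFIX_WORKER_SECRET)` on the pipeline side should return `true` after the update.
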
### UNSUBSCRIBE_SECRET

1. **Generate:** `openssl rand -hex 32`
2. **Update:** `.env.secrets` and the Cloudflare unsubscribe worker: `cd workers/unsubscribe && wrangler secret put UNSUBSCRIBE_SECRET`.
3. **Verify:** Generate an unsubscribe link and confirm it resolves correctly.
4. **Downtime risk:** Existing unsubscribe links in already-sent emails will break if the HMAC key changes. Consider this carefully -- you may want to support both old and new keys for a grace period, or accept that old links will fail.

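The grace-period option amounts to verifying a token against a list of secrets instead of one. A minimal sketch of that idea, assuming the token is a hex HMAC-SHA256 of the recipient email; the real worker may sign different fields, and `signUnsubscribe` / `verifyUnsubscribe` are hypothetical names:

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// Assumed token format: hex HMAC-SHA256 of the recipient email.
function signUnsubscribe(email, secret) {
  return createHmac('sha256', secret).update(email).digest('hex');
}

// Accepts a token signed with any of the given secrets, e.g.
// [newSecret, oldSecret] during the grace period. Drop the old
// secret from the list once the grace period ends.
function verifyUnsubscribe(email, token, secrets) {
  const given = Buffer.from(String(token), 'hex');
  return secrets.some((secret) => {
    const expected = Buffer.from(signUnsubscribe(email, secret), 'hex');
    // timingSafeEqual throws on length mismatch, so compare lengths first.
    return given.length === expected.length && timingSafeEqual(given, expected);
  });
}
```

New links would be signed with the new secret only, so the old secret can be retired once the oldest emails you still care about have aged out.
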
### DATAFORSEO_LOGIN / DATAFORSEO_PASSWORD

1. **Generate:** [app.dataforseo.com/](https://app.dataforseo.com/) -- Account Settings > API credentials.
2. **Update:** `.env.secrets`.
3. **Verify:** Run keyword validation: `node -e "require('./src/utils/load-env')(); /* test DataForSEO call */"`
4. **Downtime risk:** None. Keyword research can pause.

### SSH Keys

1. **Generate:** `ssh-keygen -t ed25519 -C "jason@nixos-$(date +%Y%m%d)"` on the host.
2. **Update:** Add new public key to `~/.ssh/authorized_keys` on remote hosts and GitHub deploy keys.
3. **Verify:** Confirm that `ssh -i ~/.ssh/new_key host` connects.
4. **Revoke:** Remove old public key from `authorized_keys` and GitHub.
5. **Downtime risk:** You can lock yourself out if you remove the old key before verifying the new one. Always test first.

---

## Emergency Rotation (Suspected Compromise)

If you suspect any credential has leaked (committed to git, visible in logs, unauthorized API usage), follow this order. Rotate highest blast-radius credentials first.

### Priority order:

1. **PAYPAL_CLIENT_SECRET** -- financial loss. Revoke immediately in PayPal dashboard, then regenerate.
2. **TWILIO_AUTH_TOKEN** -- financial loss (SMS charges). Revoke in Twilio console.
3. **SSH keys** -- full host access. Remove compromised public key from all `authorized_keys` files.
4. **OPENROUTER_API_KEY** -- API billing. Revoke in OpenRouter dashboard.
5. **RESEND_API_KEY** -- domain reputation damage (spam). Revoke in Resend dashboard.
6. **AUDITANDFIX_WORKER_SECRET** -- could allow unauthorized prefill injection. Rotate on Hostinger first.
7. **Everything else** -- rotate in any order.

### Emergency checklist:

- [ ] Identify which credential(s) were exposed and how
- [ ] Revoke the exposed credential immediately (don't wait to generate the replacement)
- [ ] Check service dashboards for unauthorized usage (API call logs, billing spikes)
- [ ] Generate replacement credential
- [ ] Update `.env.secrets` on the pipeline host
- [ ] Update SOPS (`333Method-infra/secrets/production.yaml`)
- [ ] Update any Cloudflare Workers that use the credential (`wrangler secret put ...`)
- [ ] Update Hostinger if the credential is a shared secret
- [ ] Restart the pipeline service: `systemctl --user restart 333method-pipeline`
- [ ] Verify pipeline is healthy: `bash scripts/monitoring-checks.sh`
- [ ] If the leak was a git commit: rewrite history with `git filter-repo` or `BFG`, force-push, and treat every credential in that file as compromised
- [ ] Document the incident: what leaked, when, what was rotated, any evidence of misuse

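For the first checklist item, one way to find out which credentials were exposed is to search the suspect text (a log file, a diff, a pasted stack trace) for the current secret values. The sketch below is illustrative only; `findLeakedSecrets` is a hypothetical helper, and the caller is assumed to pass only the key/value pairs parsed from `.env.secrets` rather than the whole process environment:

```javascript
// Returns the names of secrets whose values appear verbatim in `text`.
// Values shorter than `minLength` are skipped to avoid false positives
// from short settings such as "true" or port numbers.
function findLeakedSecrets(text, secrets, minLength = 12) {
  return Object.entries(secrets)
    .filter(([, value]) => typeof value === 'string' && value.length >= minLength)
    .filter(([, value]) => text.includes(value))
    .map(([name]) => name)
    .sort();
}

// Example call (the log path is a placeholder, not a real pipeline path):
//   const text = require('node:fs').readFileSync('/path/to/suspect.log', 'utf8');
//   console.log(findLeakedSecrets(text, parsedSecrets));
```
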
---

## Testing the Restore Path

Run this quarterly to verify you can rotate credentials without breaking things. Do this during a maintenance window (no active outreach sends).

### Quick verification (15 min)

For each credential you rotated, run its verify step from the per-service section above. At minimum, test these critical paths:

- [ ] LLM calls work: `SCORE_SITES_BATCH=1 node src/score-sites.js` (uses OpenRouter or Anthropic)
- [ ] Email works: send a test email to yourself via Resend
- [ ] SMS works: send a test SMS to yourself via Twilio
- [ ] Sheets work: `node -e "require('./src/utils/load-env')(); require('./src/utils/google-sheets').getAuthClient().then(() => console.log('OK'))"`
- [ ] Pipeline overall: `bash scripts/monitoring-checks.sh` -- no errors

### Canary approach

If you want to test a new key before cutting over:

1. Set the new key in a separate env var (e.g., `OPENROUTER_API_KEY_NEW=sk-or-...`)
2. Manually test with: `OPENROUTER_API_KEY=$OPENROUTER_API_KEY_NEW node -e "..."`
3. Once verified, swap into the real variable and restart the pipeline.

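The manual test in the canary steps could look something like the sketch below. It assumes OpenRouter exposes a key-info endpoint at the URL shown that answers 200 for a valid key and 401 otherwise; confirm that against the current OpenRouter API docs before relying on it. `checkCandidateKey` and `KEY_INFO_URL` are hypothetical names, and Node 18+ is assumed for the global `fetch`:

```javascript
// Assumption: key-info endpoint; verify the URL in the OpenRouter API docs.
const KEY_INFO_URL = 'https://openrouter.ai/api/v1/auth/key';

// Tests a candidate key without touching process.env or .env.secrets.
// `fetchImpl` is injectable so the check can be exercised offline.
async function checkCandidateKey(candidateKey, fetchImpl = fetch, url = KEY_INFO_URL) {
  if (!candidateKey) {
    return { ok: false, reason: 'candidate key is empty' };
  }
  const res = await fetchImpl(url, {
    headers: { Authorization: `Bearer ${candidateKey}` },
  });
  return { ok: res.ok, reason: `HTTP ${res.status}` };
}
```

A typical call would be `checkCandidateKey(process.env.OPENROUTER_API_KEY_NEW).then(console.log)`, run before swapping the value into `OPENROUTER_API_KEY`.
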
### Rollback

If a new credential breaks something:

1. Restore the old key from your password manager backup (this only works while the old key has not yet been revoked in the service dashboard).
2. Update `.env.secrets`.
3. Restart: `systemctl --user restart 333method-pipeline`
4. Investigate why the new key failed before trying again.

There is no staging environment -- this is a solo-operator stack. The canary approach above is the closest equivalent. Keep the old key in your password manager for at least 24 hours after rotation.

---

## IRONCLAW Isolation (TODO)

Per the distributed-agent-system plan, the following isolation measure is planned but not yet implemented:

- **Create `IRONCLAW_OPENROUTER_API_KEY`** with a dedicated OpenRouter sub-account.
- Purpose: if the IronClaw agent framework is compromised, the attacker gets a key with its own rate limit and billing -- not the pipeline's main key.
- Implementation: add the key to `.env.secrets`, configure IronClaw to read `IRONCLAW_OPENROUTER_API_KEY` instead of `OPENROUTER_API_KEY`.
- Set a spending cap on the sub-account in OpenRouter dashboard.
- Add to the rotation schedule as a quarterly credential (same as main OpenRouter key).

This is a blast-radius reduction measure. Until implemented, a single `OPENROUTER_API_KEY` is shared across all consumers.

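Since this is still a TODO, the following is only a sketch of how the key lookup might be separated once the sub-account exists. `resolveOpenRouterKey` and the `'ironclaw'` consumer label are hypothetical; the one design point it illustrates is the absence of a fallback to the main key, so a missing IronClaw key fails loudly instead of quietly undoing the isolation:

```javascript
// Hypothetical helper: picks the env var by consumer, with no fallback
// from the IronClaw key to the pipeline's main key.
function resolveOpenRouterKey(consumer, env = process.env) {
  const varName =
    consumer === 'ironclaw' ? 'IRONCLAW_OPENROUTER_API_KEY' : 'OPENROUTER_API_KEY';
  const key = env[varName];
  if (!key) {
    throw new Error(`${varName} is not set`);
  }
  return key;
}
```
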
---

## Automation Potential

About **30–40% of the rotation workflow is automatable** -- primarily detection and verification, not key generation itself (most service dashboards don't expose a rotation API).

### What can be automated now

**Rotation reminder cron** -- a `scripts/check-rotation-schedule.js` that reads a `credentials-metadata.json` tracking `{service, last_rotated, interval_days}` and fires a human-review queue entry (or Telegram alert via IronClaw) when any credential is overdue. Near-zero manual effort: the only ongoing task is updating the metadata file after each rotation.

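The core of the reminder script is a date comparison over the metadata entries. A minimal sketch, assuming `credentials-metadata.json` is an array of objects with the three fields named above and ISO dates in `last_rotated`; `findOverdue` is a hypothetical name, and the queueing or alerting step is left out:

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Assumed entry shape:
//   { "service": "RESEND_API_KEY", "last_rotated": "2026-01-01", "interval_days": 90 }
// Entries without a numeric interval (e.g. "rotate only on compromise") are skipped.
function findOverdue(entries, now = new Date()) {
  return entries
    .filter((entry) => Number.isFinite(entry.interval_days))
    .map((entry) => {
      const ageDays = Math.floor((now - new Date(entry.last_rotated)) / DAY_MS);
      return { service: entry.service, overdueDays: ageDays - entry.interval_days };
    })
    .filter((entry) => entry.overdueDays > 0)
    .sort((a, b) => b.overdueDays - a.overdueDays); // most overdue first
}
```
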
**Post-rotation verification** -- a `scripts/verify-credentials.js` that makes a lightweight test call for each active API key and reports pass/fail. Run this after rotation to confirm nothing is broken before restarting the pipeline. Could also run on a weekly schedule to catch silently-expired keys.

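One possible shape for the verification runner is sketched below. It assumes each check is an object `{ name, run }` whose `run` function throws or rejects on failure; the individual API calls are not shown, and `verifyCredentials` is a hypothetical name. Every check runs even if an earlier one fails, so a single report covers all keys:

```javascript
// Runs all checks and returns one pass/fail row per credential.
async function verifyCredentials(checks) {
  const settled = await Promise.allSettled(
    // Promise.resolve().then(...) also captures checks that throw synchronously.
    checks.map((check) => Promise.resolve().then(() => check.run()))
  );
  return settled.map((result, i) => ({
    name: checks[i].name,
    ok: result.status === 'fulfilled',
    error:
      result.status === 'rejected'
        ? String(result.reason?.message ?? result.reason)
        : null,
  }));
}
```
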
**Twilio key swap** -- Twilio's API supports creating a secondary auth token, promoting it to primary, and revoking the old one programmatically. A full zero-downtime rotation should be scriptable in roughly 30 lines of Node.js.

**SOPS re-encryption** -- once you've updated `.env.secrets` manually, re-encrypting SOPS is a single command (`sops -e -i secrets/production.yaml`). Scriptable as part of a post-rotation hook.

**Cloudflare Worker secret updates** -- `wrangler secret put <KEY>` is CLI-driven, fully scriptable once you have the new value.

### What stays manual

- **Key generation** for most services (OpenRouter, Anthropic, Resend, PayPal, Google Sheets, ZenRows) -- no rotation API, requires dashboard interaction
- **Hostinger `.htaccess`** -- no API; requires File Manager or SSH
- **SSH key rotation** -- pushing new pubkeys to authorized hosts requires human verification
- **UNSUBSCRIBE_SECRET** -- rotation breaks existing unsubscribe links in sent emails; human judgment required on timing

### Recommended first scripts to build

When the orchestrator is stable, add these as monthly cron batch types:

1. `scripts/check-rotation-schedule.js` -- reads `credentials-metadata.json`, creates human-review items for overdue credentials
2. `scripts/verify-credentials.js` -- test-calls each API, reports failures to human-review queue

These give automated _detection_ while rotation stays manual. Each script should be roughly 50 lines.

---

## Post-Rotation Restart Checklist

After any credential rotation:

- [ ] `.env.secrets` updated with new value
- [ ] SOPS updated (`333Method-infra/secrets/production.yaml`)
- [ ] Cloudflare Workers updated (if applicable): `wrangler secret put <KEY_NAME>`
- [ ] Hostinger updated (if applicable): `.htaccess` SetEnv
- [ ] Password manager updated with new value + rotation date
- [ ] Pipeline service restarted: `systemctl --user restart 333method-pipeline`
- [ ] `bash scripts/monitoring-checks.sh` passes
- [ ] Old key revoked in service dashboard (not before verifying new key works)
- [ ] 2Step checked (if rotating Twilio/Resend -- it loads from 333Method's env files)