---
title: IronClaw Setup Plan
category: plans
last_verified: 2026-02-24
related_files:
  - docs/plans/distributed-agent-system.md (Parts M, N, O)
  - 333Method-infra/modules/containers.nix
  - 333Method-infra/modules/secrets.nix
tags: [ironclaw, security, owasp-agentic, wasm, social-media, monitoring]
status: draft
---

# IronClaw Setup Plan

**Parent plan:** [distributed-agent-system.md](distributed-agent-system.md), Parts M (IronClaw
replaces OpenClaw), N (Claude Code SSH + GPT-4o audit), and O (security review mitigations).

## Context

IronClaw (github.com/nearai/ironclaw) is a Rust + WASM agent framework replacing OpenClaw.
This plan covers concrete deployment steps, security hardening, and OWASP Agentic mitigations.

**Maturity warning:** IronClaw is 12 days old (v0.11.1 as of 2026-02-23). No third-party
security audit exists. Apache-2.0 licensed, ~3,300 GitHub stars, ~85 contributors. Deploy
only after the VPS NixOS reimage is stable and the off-host audit trail is verified.

---

## 1. Use Cases (To Be Finalized)

Before deploying IronClaw, decide which use cases to enable. Each use case has different
security implications; the AA04 and AA06 considerations are noted inline.

### Use Case A: Social Media Posting for Non-Tech Colleagues

**Interface:** Telegram (or WhatsApp). Colleagues chat with IronClaw to post to social
platforms (Instagram, LinkedIn, TikTok, Facebook, Pinterest, YouTube).

**AA06 (Excessive Agency) considerations:**

- IronClaw can post to social media on behalf of the business
- Posts should go through an approval workflow (IronClaw drafts → colleague confirms → post)
- Limit: IronClaw cannot delete posts, change account passwords, or modify account settings
- Rate limit: max N posts per day per platform (prevents accidental spam)
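The per-platform daily cap could be enforced with something as small as one counter file per platform per UTC day. The sketch below is only an illustration: `try_post`, `STATE_DIR`, and `MAX_POSTS_PER_DAY` are hypothetical names, and the default of 5 merely stands in for the undecided N.

```bash
#!/usr/bin/env bash
# Hypothetical daily post limiter: one counter file per platform per UTC day.
# MAX_POSTS_PER_DAY stands in for the plan's undecided "N"; STATE_DIR is an assumed location.
MAX_POSTS_PER_DAY="${MAX_POSTS_PER_DAY:-5}"
STATE_DIR="${STATE_DIR:-/tmp/ironclaw-ratelimit}"

# try_post PLATFORM
# Returns 0 and records the post if the platform is under its cap,
# 1 if the cap is reached, 2 if the platform name is not a simple identifier.
try_post() {
  local platform="$1" day counter_file count
  case "$platform" in
    ""|*[!a-z0-9_-]*) return 2 ;;   # keep the name safe to use in a file path
  esac
  day="$(date -u +%F)"
  counter_file="$STATE_DIR/$platform-$day.count"
  mkdir -p "$STATE_DIR"
  count=0
  if [ -f "$counter_file" ]; then
    count="$(cat "$counter_file")"
  fi
  if [ "$count" -ge "$MAX_POSTS_PER_DAY" ]; then
    return 1
  fi
  echo $((count + 1)) > "$counter_file"
}
```

A caller would run `try_post instagram` immediately before publishing and refuse to post on a non-zero status. The read-then-write is not atomic, so concurrent posters would need a lock.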

**AA04 (Identity Spoofing) considerations:**

- Only authorized Telegram user IDs can issue commands
- IronClaw must verify the Telegram user ID against an allowlist before executing any action
- Consider: require a daily PIN or passphrase to activate posting (in case a Telegram account
  is compromised)
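As a rough illustration of the allowlist check, the sketch below denies by default and grants an action only when the numeric Telegram ID is known and holds the requested capability. The IDs are placeholders, and `capabilities_for` and `is_authorized` are hypothetical helpers rather than IronClaw APIs.

```bash
#!/usr/bin/env bash
# capabilities_for USER_ID
# Hypothetical lookup table keyed by numeric Telegram user ID (placeholder IDs).
capabilities_for() {
  case "$1" in
    123456789) echo "social_media monitoring maintenance killswitch" ;;  # admin
    987654321) echo "social_media" ;;                                    # poster
    *)         echo "" ;;                                                # unknown: nothing
  esac
}

# is_authorized USER_ID CAPABILITY
# Returns 0 only if the ID is allowlisted and holds the capability.
is_authorized() {
  local user_id="$1" capability="$2" cap
  for cap in $(capabilities_for "$user_id"); do
    if [ "$cap" = "$capability" ]; then
      return 0
    fi
  done
  return 1
}
```

Usage would be `is_authorized "$sender_id" social_media || exit 1` before any handler runs, so an unknown sender never reaches the action code.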

### Use Case B: VPS Monitoring + Health Alerts

**Interface:** IronClaw Heartbeat system runs proactive checks. Alerts via Telegram.

**AA06 considerations:**

- IronClaw can read system metrics and logs (read-only)
- IronClaw can restart specific services (allowlisted: `docker-pipeline`, `docker-cron`)
- IronClaw CANNOT: modify firewall rules, change SSH config, access secrets, run
  arbitrary shell commands
- All restart actions logged to Better Stack
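The restart allowlist works best as an exact-match check, so a crafted service name cannot smuggle in extra shell. A minimal sketch follows; `restart_allowed` and `request_restart` are hypothetical names, and the function only prints the command it would run so the caller can log it first.

```bash
#!/usr/bin/env bash
# restart_allowed SERVICE
# Exact-match allowlist; anything else, including names with extra characters, is refused.
restart_allowed() {
  case "$1" in
    docker-pipeline|docker-cron) return 0 ;;
    *) return 1 ;;
  esac
}

# request_restart SERVICE
# Prints the command that would be executed, or refuses with status 1.
request_restart() {
  if restart_allowed "$1"; then
    echo "systemctl restart $1"
  else
    echo "refused: $1 is not on the restart allowlist" >&2
    return 1
  fi
}
```

For example, `request_restart docker-cron` prints the restart command, while `request_restart sshd` is refused.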

**AA04 considerations:**

- Heartbeat is automated (no user interaction needed for monitoring)
- Alert messages go to a private Telegram channel (not DMs)
- Service restart commands require confirmation from a Telegram channel admin

### Use Case C: VPS Maintenance Tasks

**Interface:** Telegram commands for routine maintenance.

**AA06 considerations:**

- Disk cleanup: limited to `/tmp` and journal rotation under `/var/log/journal`
- Log rotation: trigger existing logrotate, not custom deletion
- Backup verification: read-only check of restic snapshots
- CANNOT: install packages, modify NixOS config, access database
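One way to keep disk cleanup inside `/tmp` is a path guard in front of any delete. The sketch below is string-based only: `cleanup_allowed` is a hypothetical helper, and it does not resolve symlinks, which a real implementation would also need to handle.

```bash
#!/usr/bin/env bash
# cleanup_allowed PATH
# Accepts only absolute paths strictly below /tmp with no ".." component.
# Assumption: symlinks are checked elsewhere; this is a purely textual guard.
cleanup_allowed() {
  case "$1" in
    */../*|*/..) return 1 ;;   # reject parent-directory traversal
    /tmp/?*)     return 0 ;;   # something below /tmp, never /tmp itself
    *)           return 1 ;;
  esac
}
```

A cleanup handler would call `cleanup_allowed "$target" || exit 1` before touching the filesystem.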

**AA04 considerations:**

- Maintenance commands from admin Telegram user only
- Each command category has its own authorization level

---

## 2. Kill Switch / Emergency Revocation

### Immediate Kill (< 30 seconds)

```bash
# From any machine with SSH access to VPS:
ssh admin@<vps-netbird-ip> 'sudo systemctl stop docker-ironclaw && sudo systemctl disable docker-ironclaw'

# Or via Telegram bot admin command (if IronClaw Telegram interface is still responding):
# Send: /killswitch <admin-pin>
```

### NixOS Kill Switch (survives reboot)

```nix
# In 333Method-infra/modules/containers.nix: the condition below gates the IronClaw
# container. Keep it `false` to remove IronClaw; set it to `true` only to re-enable it.
} // lib.optionalAttrs false {
  ironclaw = { ... };  # IronClaw container definition inside the disabled block
};
```

Then: `nixos-rebuild switch --flake .#production`

### Credential Revocation

If IronClaw is compromised, revoke credentials it may have accessed:

1. **Telegram bot token:** Revoke via BotFather (`/revoke` command). Create a new bot.
2. **Social media API keys:** Rotate in platform dashboards.
3. **Docker socket proxy:** Stop the proxy container; IronClaw then loses all Docker introspection.
4. **sops-nix secrets:** IronClaw reads from `/run/secrets` via `IRONCLAW_SECRETS_DIR`.
   If you suspect secrets were exfiltrated to LLM context (this should be impossible with the
   WASM vault, but treat it as defence-in-depth):
   - Rotate ALL secrets: `sops secrets/production.yaml` → change values → `nixos-rebuild switch`
   - This rotates secrets for ALL services, not just IronClaw

### Monitoring for Kill Switch Triggers

Auto-trigger the kill switch if any of these conditions is detected:

1. IronClaw attempts to access a path outside its allowed volumes
2. IronClaw generates output containing a canary token (see Part O4 in main plan)
3. IronClaw's WASM tool execution exceeds timeout thresholds (stuck/hanging = possible exploit)
4. Better Stack alert: IronClaw UID 9000 seen in unexpected auditd events
5. GPT-4o audit review returns risk_level >= 4 involving IronClaw

```nix
# modules/monitoring.nix: auto-kill IronClaw on canary detection
systemd.services.ironclaw-canary-watchdog = {
  description = "Kill IronClaw if canary tokens detected in its output";
  after = [ "docker-ironclaw.service" ];
  wantedBy = [ "multi-user.target" ];
  # curl and logger are not on the default NixOS service PATH, so add them explicitly
  path = [ pkgs.curl pkgs.util-linux ];
  serviceConfig = {
    Type = "simple";
    Restart = "always";
    RestartSec = 30;
    ExecStart = pkgs.writeShellScript "ironclaw-canary-watch" ''
      # Assumption: sops-nix exposes the token at its default path under /run/secrets
      BETTERSTACK_SOURCE_TOKEN="$(cat /run/secrets/betterstack_source_token)"
      # Monitor IronClaw container logs for canary strings
      CANARIES="CANARY_TOKEN_ALPHA|CANARY_TOKEN_BETA|CANARY_TOKEN_GAMMA"
      journalctl -u docker-ironclaw -f --no-pager | while IFS= read -r line; do
        if printf '%s\n' "$line" | grep -qE "$CANARIES"; then
          logger -t ironclaw-canary "CRITICAL: Canary token detected in IronClaw output, killing container"
          systemctl stop docker-ironclaw
          systemctl disable docker-ironclaw
          # Alert
          curl -s https://in.logs.betterstack.com \
            -H "Authorization: Bearer $BETTERSTACK_SOURCE_TOKEN" \
            -H "Content-Type: application/json" \
            -d '{"dt":"'"$(date -Is)"'","level":"CRITICAL","message":"IronClaw killed: canary token detected in output"}'
          break
        fi
      done
    '';
  };
};
```
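The detection step in the watchdog can be exercised on its own before the unit is deployed. The sketch below lifts the same `grep -E` alternation into a hypothetical `contains_canary` function; the three token names are the placeholders from the watchdog, not real canary values.

```bash
#!/usr/bin/env bash
# Placeholder canary strings, copied from the watchdog definition.
CANARIES="CANARY_TOKEN_ALPHA|CANARY_TOKEN_BETA|CANARY_TOKEN_GAMMA"

# contains_canary LINE
# Returns 0 if LINE contains any canary string, 1 otherwise.
contains_canary() {
  printf '%s\n' "$1" | grep -qE "$CANARIES"
}
```

Feeding it a known-bad and a known-good log line is a cheap smoke test to run whenever the canary list changes.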

---

## 3. OWASP Agentic Top 10 Mitigations

### AA01: Memory Poisoning / Prompt Injection

**Addressed in main plan Part O4.** Summary:

- Input sanitization (500 char truncation on log entries)
- Ephemeral context (no persistent memory across sessions)
- Output validation (allowlisted commands, content policy)
- Canary tokens in sensitive files

### AA04: Identity Spoofing (Decide After Use Cases Finalized)

**Threat:** An attacker impersonates an authorized user (Telegram account compromise, phone
number hijacking) and issues commands to IronClaw.

**Mitigations to implement:**

1. **Telegram user ID allowlist** (not just username, since usernames can be changed):

   ```yaml
   # ironclaw-config.yaml
   authorized_users:
     - telegram_id: 123456789 # Jason
       role: admin
       capabilities: [social_media, monitoring, maintenance, killswitch]
     - telegram_id: 987654321 # Colleague
       role: poster
       capabilities: [social_media]
   ```

2. **Daily activation PIN:**
   - IronClaw sends a 6-digit PIN to the admin's secondary channel (email or SMS) daily
   - Admin must enter PIN in Telegram before IronClaw processes any commands that day
   - PIN expires at midnight UTC
   - Failed PIN attempts: 3 max, then lock out for 1 hour

3. **Command confirmation for destructive actions:**
   - Social media post: IronClaw shows preview → user replies "yes" → post published
   - Service restart: IronClaw shows which service → user replies with service name → restart
   - No single-message execution of destructive commands

4. **Anomaly detection:**
   - Commands from unusual hours (outside 7am-11pm local time) flagged
   - Burst of commands (>10 in 5 minutes) triggers cooldown + admin alert
   - First-time use of a capability triggers confirmation
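The burst rule in the anomaly-detection item (more than 10 commands in 5 minutes) amounts to a sliding-window counter. The sketch below keeps timestamps in a shell variable purely for illustration. `record_command`, `RECENT`, `BURST_LIMIT`, and `BURST_WINDOW` are hypothetical names, and a real deployment would persist the state per user.

```bash
#!/usr/bin/env bash
BURST_LIMIT=10     # commands tolerated per window
BURST_WINDOW=300   # window length in seconds (5 minutes)
RECENT=""          # space-separated epoch timestamps of recent commands

# record_command NOW
# Records a command issued at epoch second NOW. Returns 1 (cooldown) when the
# window then holds more than BURST_LIMIT commands, 0 otherwise.
# Must be called in the current shell, not a subshell, so RECENT is updated.
record_command() {
  local now="$1" kept="" count=0 ts
  for ts in $RECENT; do
    if [ $((now - ts)) -lt "$BURST_WINDOW" ]; then
      kept="$kept $ts"          # still inside the window
      count=$((count + 1))
    fi
  done
  RECENT="$kept $now"
  count=$((count + 1))
  if [ "$count" -gt "$BURST_LIMIT" ]; then
    return 1
  fi
}
```

A handler would call `record_command "$(date +%s)"` on every incoming command and start the cooldown and admin alert when it returns 1.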

### AA06: Excessive Agency / Principle of Least Agency (Decide After Use Cases Finalized)

**Threat:** IronClaw has more permissions than needed for its tasks, increasing blast radius
if compromised.

**Mitigations to implement:**

1. **Capability-based WASM tool permissions:**

   ```yaml
   # Per use case, IronClaw WASM tools get ONLY:
   social_media:
     network: [api.instagram.com, api.linkedin.com, api.tiktok.com]
     filesystem: none
     exec: none
   monitoring:
     network: ["docker-socket-proxy:2375"] # Docker socket proxy only (matches DOCKER_HOST in Section 4)
     filesystem: [/var/log:ro]
     exec: none
   maintenance:
     network: none
     filesystem: [/tmp:rw, /var/log/journal:rw]
     exec: [logrotate, restic snapshots] # allowlisted binaries only
   ```

2. **No privilege escalation path:**
   - IronClaw runs as UID 9000 (no sudo, no root)
   - Docker socket proxy: `POST: 0`, `EXEC: 0`, `SECRETS: 0`, `ENV: 0`
   - WASM tools cannot request capabilities beyond what's declared at install time

3. **Blast radius containment:**
   - If social media posting is compromised: only social accounts affected, not VPS
   - If monitoring is compromised: read-only data exposed, no write access
   - If maintenance is compromised: limited to temp cleanup, no system modification

4. **Periodic capability audit:**
   - Monthly review: which WASM tools are installed? Which capabilities do they have?
   - Remove unused tools
   - Compare installed tools against the allowlist in this document
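The monthly comparison of installed tools against the allowlist reduces to a set difference. The sketch below assumes both lists are available as plain text files with one tool name per line. `audit_tools` is a hypothetical helper, and how the installed list is exported from IronClaw is left open.

```bash
#!/usr/bin/env bash
# audit_tools ALLOWLIST_FILE INSTALLED_FILE
# Prints every tool that is installed but missing from the allowlist, one per line.
audit_tools() {
  local allow_sorted installed_sorted
  allow_sorted="$(mktemp)"
  installed_sorted="$(mktemp)"
  sort -u "$1" > "$allow_sorted"
  sort -u "$2" > "$installed_sorted"
  # comm -13 keeps only the lines unique to the second (installed) file
  comm -13 "$allow_sorted" "$installed_sorted"
  rm -f "$allow_sorted" "$installed_sorted"
}
```

Empty output means the installed set is within the allowlist. Any printed name is a candidate for removal or for an explicit allowlist update.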

---

## 4. Container Configuration

### NixOS Container Definition

```nix
# modules/containers.nix: IronClaw (KVM 2+ section)
ironclaw = {
  image = "ghcr.io/nearai/ironclaw:latest";  # pin to a release tag before production (Section 5)
  volumes = [
    "/opt/333method-ironclaw-workspace:/workspace"
    "/opt/333method/src:/app/src:ro"
    "/opt/333method/docs:/app/docs:ro"
  ];
  environment = {
    DOCKER_HOST          = "tcp://docker-socket-proxy:2375";
    IRONCLAW_CHANNEL     = "telegram";
    IRONCLAW_SECRETS_DIR = "/run/secrets";
    # IronClaw reads secrets at the WASM execution boundary, never in LLM context
  };
  extraOptions = [
    "--network=333method-openclaw-net"
    "--user=9000:9000"
    "--memory=512m"           # Hard memory limit
    "--cpus=0.5"              # CPU limit (half a core)
    "--read-only"             # Root filesystem read-only
    "--tmpfs=/tmp:size=100m"  # Writable tmp, size-limited
  ];
  # /run/secrets is NOT mounted as a volume here; IRONCLAW_SECRETS_DIR only names the path.
  # sops-nix places secrets at /run/secrets on the host, so confirm how they reach the
  # container (for example, a read-only bind mount) before deploying.
  # IronClaw's secrets vault injects credentials at the WASM tool execution boundary.
};
```

### Resource Limits (new vs main plan)

Added to container definition:

- `--memory=512m`: hard cap so IronClaw cannot exhaust VPS memory
- `--cpus=0.5`: limits CPU to half a core
- `--read-only`: root filesystem immutable (defence-in-depth atop WASM)
- `--tmpfs=/tmp:size=100m`: writable temp with size cap

### Secrets Required

| Secret                    | sops-nix key               | Used by                          |
| ------------------------- | -------------------------- | -------------------------------- |
| Telegram bot token        | `telegram_bot_token`       | IronClaw Telegram interface      |
| Social media API keys     | TBD per platform           | IronClaw WASM social media tools |
| Better Stack source token | `betterstack_source_token` | Alert forwarding                 |

---

## 5. Deployment Sequence

**Prerequisites:** VPS reimaged to NixOS (Part O1), NetBird VPN active, Better Stack
audit trail verified.

1. **Pin IronClaw version.** Do NOT use `:latest` in production. Pin to a specific release:

   ```nix
   image = "ghcr.io/nearai/ironclaw:v0.11.1";  # or later stable release
   ```

2. **Deploy with monitoring only** (no social media, no maintenance):
   - Enable Use Case B only (read-only monitoring + Telegram alerts)
   - Verify: IronClaw container starts, connects to Telegram, sends health check
   - Verify: auditd captures IronClaw execve events
   - Verify: Better Stack receives IronClaw log stream
   - Run for 1 week minimum before adding capabilities

3. **Add social media posting** (Use Case A):
   - Configure authorized Telegram users
   - Install social media WASM tools with network allowlist
   - Test: post to staging/sandbox accounts first
   - Enable daily activation PIN
   - Run for 1 week with approval workflow (IronClaw drafts, human confirms)

4. **Add maintenance tasks** (Use Case C):
   - Enable after 2+ weeks of stable monitoring
   - Start with disk cleanup only
   - Add log rotation after 1 further week
   - Add backup verification 1 week after that

5. **Security review after 1 month:**
   - Review all IronClaw audit logs in Better Stack
   - Check GPT-4o audit review reports for IronClaw-related findings
   - Evaluate whether to commission a third-party pen test
   - Update this plan with lessons learned
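The version-pinning rule in step 1 can be checked mechanically, for instance in a pre-deploy script. The sketch below treats an image reference as pinned only if it carries a digest or an explicit tag other than `latest`. `is_pinned_image` is a hypothetical helper, not part of Docker or NixOS.

```bash
#!/usr/bin/env bash
# is_pinned_image IMAGE_REF
# Returns 0 if the reference has a digest or an explicit non-"latest" tag, 1 otherwise.
is_pinned_image() {
  local ref="$1" last
  case "$ref" in
    *@sha256:*) return 0 ;;   # digest-pinned
  esac
  last="${ref##*/}"           # final path component, e.g. ironclaw:v0.11.1
  case "$last" in
    *:latest) return 1 ;;
    *:?*)     return 0 ;;     # explicit tag
    *)        return 1 ;;     # no tag resolves to :latest implicitly
  esac
}
```

Running `is_pinned_image ghcr.io/nearai/ironclaw:v0.11.1` succeeds, while the `:latest` reference from Section 4 is rejected.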

---

## 6. Rollback Plan

If IronClaw causes issues at any stage:

1. **Kill switch** (Section 2 above): immediate stop
2. **Revert to manual:** social media posting goes back to manual
3. **Monitoring fallback:** the cron.js Tier 1/2/3 watchdogs are still in the codebase; re-enable them
4. **Remove container:** set `lib.optionalAttrs false` around ironclaw in containers.nix
5. **Full rollback:** `nixos-rebuild switch --rollback` reverts to the previous NixOS generation