configuration.md
# OpenSandbox Server configuration reference

This document describes **all TOML configuration options** accepted by the OpenSandbox lifecycle server (`opensandbox-server`). The schema is defined in [`opensandbox_server/config.py`](opensandbox_server/config.py) (`AppConfig` and nested models).

- **Default config path:** `~/.sandbox.toml`
- **Override path:** set environment variable `SANDBOX_CONFIG_PATH` to an absolute or user-expandable path.
- **CLI:** `opensandbox-server --config /path/to/sandbox.toml` also sets `SANDBOX_CONFIG_PATH` for that process.

Example files in this repository:

| File | Purpose |
|------|---------|
| [`example.config.toml`](example.config.toml) | Docker runtime (English) |
| [`example.config.zh.toml`](example.config.zh.toml) | Docker runtime (中文) |
| [`example.config.k8s.toml`](example.config.k8s.toml) | Kubernetes runtime (English) |
| [`example.config.k8s.zh.toml`](example.config.k8s.zh.toml) | Kubernetes runtime (中文) |

---

## Table of contents

1. [Top-level sections](#top-level-sections)
2. [`[server]`](#server--lifecycle-api)
3. [`[log]`](#log)
4. [`[runtime]`](#runtime--required)
5. [`[docker]`](#docker--only-when-runtime--docker)
6. [`[kubernetes]`](#kubernetes--only-when-runtime--kubernetes)
7. [`[agent_sandbox]`](#agent_sandbox--only-with-kubernetes--agent-sandbox)
8. [`[ingress]`](#ingress)
9. [`[egress]`](#egress)
10. [`[storage]`](#storage)
11. [`[secure_runtime]`](#secure_runtime)
12. [`[renew_intent]`](#renew_intent--experimental)
13. [Environment variables (outside TOML)](#environment-variables-outside-toml)
14. [Cross-field validation rules](#cross-field-validation-rules)

---

## Top-level sections

| Section | Required | When |
|---------|----------|------|
| `[server]` | No | Always (defaults apply if omitted) |
| `[log]` | No | Always (defaults apply if omitted) |
| `[runtime]` | **Yes** | Always |
| `[docker]` | No | `runtime.type = "docker"` |
| `[kubernetes]` | No | `runtime.type = "kubernetes"` (defaults are applied if missing) |
| `[agent_sandbox]` | No | Only when `kubernetes.workload_provider = "agent-sandbox"` |
| `[ingress]` | No | Optional; see [Ingress](#ingress) |
| `[egress]` | No | Required values when clients use `networkPolicy` on create |
| `[storage]` | No | Host bind mounts / OSSFS mount root |
| `[secure_runtime]` | No | gVisor / Kata / Firecracker |
| `[renew_intent]` | No | Experimental auto-renew on access |

---

## `[server]` — Lifecycle API

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `host` | string | `"0.0.0.0"` | Bind address for the HTTP API. |
| `port` | integer | `8080` | Listen port (1–65535). |
| `api_key` | string \| omitted | `null` | If set to a non-empty string, requests must send header `OPEN-SANDBOX-API-KEY` with this value (except documented public routes such as `/health`, `/docs`, `/redoc`). If omitted or empty, API key checks are skipped, but startup now requires explicit risk acknowledgment: interactive TTY confirmation (`YES`) or `OPENSANDBOX_INSECURE_SERVER=YES`. |
| `eip` | string \| omitted | `null` | Public IP or hostname used as the **host part** when the server returns sandbox endpoint URLs (notably Docker runtime). |
| `max_sandbox_timeout_seconds` | integer \| omitted | `null` | Upper bound on sandbox TTL in seconds for **create** requests that specify `timeout`. Must be ≥ **60** if set. Omit to disable the server-side cap. |
| `timeout_keep_alive` | integer | `30` | Idle keep-alive timeout (seconds) passed to uvicorn. |

---

## `[log]`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `level` | string | `"INFO"` | Python logging level for the server process (e.g. `"DEBUG"`, `"INFO"`, `"WARNING"`). |
| `file_enabled` | boolean | `false` | When `true`, logs are written to rotating files instead of stdout. |
| `file_path` | string \| omitted | `null` | Override path for the main log file. Defaults to `~/logs/opensandbox/server.log` when `file_enabled = true`. |
| `access_file_path` | string \| omitted | `null` | Override path for the HTTP access log file. Defaults to `~/logs/opensandbox/access.log` when `file_enabled = true`. |
| `file_max_bytes` | integer | `104857600` (100 MB) | Max bytes per log file before rotation. |
| `file_backup_count` | integer | `5` | Number of rotated log files to retain. |

---

## `[runtime]` — **required**

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `type` | string | — | **`docker`** or **`kubernetes`**. Selects which runtime implementation loads. |
| `execd_image` | string | — | OCI image containing the **execd** binary used to bootstrap command/file access inside the sandbox. |

---

## `[docker]` — only when `runtime.type = "docker"`

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `network_mode` | string | `"host"` | Docker network attachment for sandbox containers: **`host`**, **`bridge`**, or a **custom user-defined network name**. Egress sidecar + `networkPolicy` require **`bridge`** (see [Egress](#egress)). |
| `api_timeout` | integer \| omitted | `null` | Docker API timeout in **seconds**. If unset, the code uses default **180** s where applicable. |
| `host_ip` | string \| omitted | `null` | Hostname or IP used when **rewriting** bridge-mode endpoint URLs (e.g. server runs in Docker and clients need a host-reachable address). Often `host.docker.internal` or the host LAN IP on Linux. |
| `drop_capabilities` | list of strings | See `config.py` | Linux capabilities **dropped** from sandbox containers (security hardening). |
| `apparmor_profile` | string \| omitted | `null` | Optional AppArmor profile name (e.g. `"docker-default"`). Empty/unset lets Docker use its default. |
| `no_new_privileges` | boolean | `true` | Sets `no-new-privileges` to block privilege escalation. |
| `seccomp_profile` | string \| omitted | `null` | Seccomp profile name or **absolute path**; empty uses Docker default seccomp. |
| `pids_limit` | integer \| null | `4096` | Max PIDs per sandbox container; set to **`null`** to disable the limit. |

---

## `[kubernetes]` — only when `runtime.type = "kubernetes"`

If `runtime.type = "kubernetes"` and the `[kubernetes]` table is absent, the server instantiates defaults from `KubernetesRuntimeConfig`.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `kubeconfig_path` | string \| omitted | `null` | Path to kubeconfig (expandable, e.g. `~/.kube/config`). In-cluster configs often leave this unset and rely on in-cluster credentials. |
| `namespace` | string \| omitted | `null` | Namespace for sandbox workloads. |
| `service_account` | string \| omitted | `null` | ServiceAccount name bound to workload pods. |
| `workload_provider` | string \| omitted | `null` | One of: **`batchsandbox`**, **`agent-sandbox`**. If omitted, the **first registered** provider is used (currently **`batchsandbox`**). |
| `batchsandbox_template_file` | string \| omitted | `null` | Path to **BatchSandbox** CR YAML template when `workload_provider = "batchsandbox"`. |
| `sandbox_create_timeout_seconds` | integer | `60` | Max time to wait for a new sandbox to become ready (e.g. IP assigned), in seconds. |
| `sandbox_create_poll_interval_seconds` | float | `1.0` | Poll interval while waiting for readiness. |
| `informer_enabled` | boolean | `true` | **[Beta]** Use informer/watch cache for reads to reduce API load. |
| `informer_resync_seconds` | integer | `300` | **[Beta]** Full resync period for the informer cache. |
| `informer_watch_timeout_seconds` | integer | `60` | **[Beta]** Watch stream restart interval. |
| `read_qps` | float | `0` | K8s API **get/list** rate limit (QPS). **0** = unlimited. |
| `read_burst` | integer | `0` | Burst for read limiter; **0** means use `read_qps` as burst (minimum 1 internally). |
| `write_qps` | float | `0` | K8s API **write** rate limit (QPS). **0** = unlimited. |
| `write_burst` | integer | `0` | Burst for write limiter. |
| `execd_init_resources` | table \| omitted | `null` | Optional resource requests/limits for the **execd init** container. |

### BatchSandbox vs agent-sandbox

Kubernetes workloads are created by a **workload provider**. There is **no** `[batchsandbox]` section in TOML — BatchSandbox is configured entirely under **`[kubernetes]`**, plus shared sections like `[egress]`, `[ingress]`, `[storage]`, `[secure_runtime]`.
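
Because BatchSandbox has no table of its own, a Kubernetes deployment that uses the default provider can be described with just `[runtime]` and `[kubernetes]`. The fragment below is a minimal sketch using only keys documented above; the image reference, namespace, template path, and rate-limit numbers are placeholders chosen for illustration, not values taken from the project.

```toml
# Sketch only: Kubernetes runtime with the BatchSandbox provider.
[runtime]
type = "kubernetes"
execd_image = "registry.example.com/opensandbox/execd:latest"  # hypothetical image reference

[kubernetes]
namespace = "opensandbox"                # hypothetical namespace
workload_provider = "batchsandbox"       # may be omitted; batchsandbox is the current default
batchsandbox_template_file = "/etc/opensandbox/batchsandbox-template.yaml"  # hypothetical path
sandbox_create_timeout_seconds = 60      # documented default
sandbox_create_poll_interval_seconds = 1.0
read_qps = 20.0                          # illustrative limit; 0 means unlimited
read_burst = 40                          # illustrative; 0 falls back to read_qps
```

Leaving out `kubeconfig_path`, as this sketch does, matches the in-cluster case described in the table, where the server relies on in-cluster credentials.
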

| | **BatchSandbox** (default provider) | **agent-sandbox** ([kubernetes-sigs/agent-sandbox](https://github.com/kubernetes-sigs/agent-sandbox)) |
|--|--------------------------------------|--------------------------------------------------------------------------------------------------------|
| `kubernetes.workload_provider` | `"batchsandbox"` or **omit** (factory default is `batchsandbox`) | `"agent-sandbox"` |
| Template file | **`kubernetes.batchsandbox_template_file`** — path to **BatchSandbox** CR YAML | **`agent_sandbox.template_file`** in [`[agent_sandbox]`](#agent_sandbox--only-with-kubernetes--agent-sandbox) |
| Extra TOML table | None | **`[agent_sandbox]`** is required (see below) |

**BatchSandbox-only config key in `config.py`:** `batchsandbox_template_file` on `KubernetesRuntimeConfig`. Everything else in the `[kubernetes]` table (namespace, kubeconfig, informer, API QPS, `sandbox_create_*`, `execd_init_resources`, …) applies to **whichever** provider you select.

### `kubernetes.execd_init_resources`

| Key | Type | Description |
|-----|------|-------------|
| `limits` | map string → string | e.g. `{ cpu = "100m", memory = "128Mi" }` |
| `requests` | map string → string | e.g. `{ cpu = "50m", memory = "64Mi" }` |

---

## `[agent_sandbox]` — only with `kubernetes.workload_provider = "agent-sandbox"`

Used with the **kubernetes-sigs/agent-sandbox** Sandbox CRD provider.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `template_file` | string \| omitted | `null` | Path to **Sandbox CR** YAML template. |
| `shutdown_policy` | string | `"Delete"` | **`Delete`** or **`Retain`** when the sandbox expires. |
| `ingress_enabled` | boolean | `true` | Whether ingress routing to agent-sandbox pods is expected. |

---

## `[ingress]`

Controls how **ingress exposure** is described for sandbox endpoints (especially behind gateways). **When `runtime.type = "docker"`, only `mode = "direct"` is allowed.**
`secureAccess` is currently supported only for **Kubernetes** sandboxes when **`ingress.mode = "gateway"`**.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `mode` | string | `"direct"` | **`direct`** — clients reach sandboxes without an L7 gateway configured here. **`gateway`** — use `[ingress.gateway]` for address and routing mode (Kubernetes-oriented deployments). |

### When `mode = "gateway"`

You must set **`[ingress.gateway]`** when `mode = "gateway"`, and omit it when `mode = "direct"`.

| Key | Type | Description |
|-----|------|-------------|
| `address` | string | Gateway host (**no `http://` or `https://`**). For `route.mode = "wildcard"`, must be a **wildcard domain** (e.g. `*.example.com`). Otherwise a normal domain, IP, or `IP:port`. |
| `route.mode` | string | **`wildcard`** — host-based routing; **`uri`** — path-prefix routing; **`header`** — header-based routing. |

Response URL shapes depend on `route.mode` (see server README / ingress component docs).

---

## `[egress]`

Configures the **egress sidecar** image and enforcement mode. The server only attaches the sidecar when a sandbox is created **with** a `networkPolicy` in the API request.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `image` | string \| omitted | `null` | OCI image for the egress sidecar. **Required in config** when clients send **`networkPolicy`** (create request). |
| `mode` | string | `"dns"` | Passed to the sidecar as `OPENSANDBOX_EGRESS_MODE`. Values: **`dns`** — DNS-proxy-based enforcement (CIDR/static IP rules **not** enforced); **`dns+nft`** — adds nftables where available so **CIDR/IP** rules can be enforced. |
| `disable_ipv6` | boolean | `true` | IPv6 egress is incomplete (especially on Kubernetes). **Default on**; set `false` only when you want IPv6 left up in the netns. Details in [IPv6 and egress](#ipv6-and-egress) below. |

### IPv6 and egress

OpenSandbox egress does **not** treat IPv6 as a first-class, fully covered path—gaps show up most often under **`runtime.type = "kubernetes"`** (pod networking, CNI). The default **`disable_ipv6 = true`** matches the usual need on **dual-stack** CNI: do not rely on incomplete IPv6 egress. Set **`false`** when the cluster is effectively **IPv4-only** and you deliberately want IPv6 enabled in the sandbox network namespace, or when you accept those gaps for experiments.

**Docker notes:**

- `egress.image` must be set when using `networkPolicy`.
- Outbound policy requires **`docker.network_mode = "bridge"`**; `networkPolicy` is rejected for incompatible network modes.

**Kubernetes notes:**

- When `networkPolicy` is set, the workload includes an egress sidecar built from `egress.image`.

See [`components/egress/README.md`](../components/egress/README.md) for sidecar behavior and limits.

---

## `[storage]`

Host-side storage related to **volume mounts** (host bind allowlist and OSSFS mount layout).

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `allowed_host_paths` | list of strings | `[]` | Absolute path **prefixes** allowed for **host** bind mounts. If **empty**, all host bind mounts are rejected (secure-by-default). |
| `ossfs_mount_root` | string | `"/mnt/ossfs"` | Host directory under which OSSFS-backed mounts are resolved (`<root>/<bucket>/...`). |
| `volume_default_size` | string | `"1Gi"` | Default storage size for auto-created Kubernetes PVCs when the caller does not specify a size in the PVC provisioning hints. |

Sandbox **volume** models (`host`, `pvc`, `ossfs`) in API requests are documented in the OpenAPI specs and OSEPs; this table only covers **server** storage settings.

---

## `[secure_runtime]`

Optional **strong isolation** runtimes (gVisor, Kata, Firecracker).

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `type` | string | `""` | **`""`** — default OCI runtime (runc). **`gvisor`**, **`kata`**, **`firecracker`**. **`firecracker`** is **Kubernetes-only**. |
| `docker_runtime` | string \| omitted | `null` | Docker **OCI runtime name** (e.g. `runsc` for gVisor, `kata-runtime` for Kata). |
| `k8s_runtime_class` | string \| omitted | `null` | Kubernetes **RuntimeClass** name (e.g. `gvisor`, `kata-qemu`, `kata-fc`). |

**Validation (summary):**

- If `type` is empty, **`docker_runtime`** and **`k8s_runtime_class`** must be omitted.
- If `type` is **`firecracker`**, **`k8s_runtime_class`** is **required** (`docker` runtime cannot use Firecracker).
- If `type` is **`gvisor`** or **`kata`**, at least one of **`docker_runtime`** or **`k8s_runtime_class`** must be set.

See [`docs/secure-container.md`](../docs/secure-container.md) for installation and node requirements.

---

## `[renew_intent]` — **experimental**

**🧪 Experimental:** auto-renew sandbox expiration when access is observed (lifecycle proxy and/or Redis queue). Off by default. Full design: [OSEP-0009](../oseps/0009-auto-renew-sandbox-on-ingress-access.md).

Use **dotted keys** under the same table for Redis (valid in TOML):

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| `enabled` | boolean | `false` | Master switch for renew-on-access. |
| `min_interval_seconds` | integer | `60` | Minimum seconds between renewals for the same sandbox (cooldown). ≥ 1. |
| `redis.enabled` | boolean | `false` | Enable Redis list consumer for ingress-gateway renew intents. |
| `redis.dsn` | string \| omitted | `null` | Redis URL, e.g. `redis://127.0.0.1:6379/0`. **Required** when `redis.enabled = true`. |
| `redis.queue_key` | string | `"opensandbox:renew:intent"` | Redis list key for renew-intent payloads. |
| `redis.consumer_concurrency` | integer | `8` | Concurrent BRPOP workers (≥ 1). |

Per-sandbox enablement uses create request extensions (see OSEP-0009 and `example.config.toml` comments).

---

## Environment variables (outside TOML)

These are read by the server or runtime code in addition to the TOML file:

| Variable | Where used | Description |
|----------|------------|-------------|
| `SANDBOX_CONFIG_PATH` | `config.py`, CLI | Path to the TOML file. Overrides the default `~/.sandbox.toml` when set. |
| `DOCKER_HOST` | Docker service | Standard Docker daemon address (e.g. `unix:///var/run/docker.sock`). |
| `PENDING_FAILURE_TTL` | Docker service | Seconds to retain **failed Pending** sandboxes before cleanup; default **`3600`**. |

---

## Cross-field validation rules

Rules enforced when the full `AppConfig` is parsed (see `AppConfig.validate_runtime_blocks` in `config.py`):

1. **`runtime.type = "docker"`**
   - Must **not** include `[kubernetes]` or `[agent_sandbox]`.
   - If `[ingress]` is present, **`ingress.mode` must be `"direct"`**.
   - **`secure_runtime.type = "firecracker"`** is not allowed.

2. **`runtime.type = "kubernetes"`**
   - `[kubernetes]` is created with defaults if missing.
   - `[agent_sandbox]` is **only** allowed when **`kubernetes.workload_provider`** (case-insensitive) is **`agent-sandbox`**.

3. **`ingress.mode = "gateway"`**
   - `[ingress.gateway]` is **required**; address and `route.mode` must satisfy the validators (wildcard domain for `wildcard` route mode, no URL scheme in `address`, etc.).

4. **`secure_runtime`**
   - See [Secure runtime](#secure_runtime) above.

---

## Source of truth

If this document and the running server disagree, prefer:

1. **`opensandbox_server/config.py`** — authoritative Pydantic schema and defaults.
2. **Example TOML files** in the `server/` directory — reviewed snapshots for Docker/K8s.
3. **Release notes** — for experimental flags and breaking changes.

For API request fields (create sandbox, `networkPolicy`, volumes, etc.), see the OpenAPI specs under [`specs/`](../specs/) and the main [Server README](README.md).
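
To show how the sections and cross-field rules in this reference fit together, here is a rough sketch of a Docker-runtime configuration that accepts `networkPolicy` on create. It uses only keys documented in this file; the API key, image references, and host path are hypothetical placeholders, and the example files listed at the top of this document remain the reviewed starting points.

```toml
# Sketch only: Docker runtime with egress enforcement enabled.
[server]
host = "0.0.0.0"
port = 8080
api_key = "replace-with-a-real-secret"   # placeholder; clients send it in OPEN-SANDBOX-API-KEY
max_sandbox_timeout_seconds = 3600       # illustrative cap; must be >= 60 if set

[runtime]
type = "docker"
execd_image = "registry.example.com/opensandbox/execd:latest"   # hypothetical image reference

[docker]
network_mode = "bridge"   # egress sidecar + networkPolicy require bridge

[ingress]
mode = "direct"           # the only mode allowed with the Docker runtime

[egress]
image = "registry.example.com/opensandbox/egress:latest"        # hypothetical image reference
mode = "dns"              # documented default; CIDR/static IP rules are not enforced in this mode
disable_ipv6 = true       # documented default

[storage]
allowed_host_paths = ["/data/sandboxes"]   # hypothetical prefix; an empty list rejects all host binds
```

The sketch deliberately omits `[kubernetes]` and `[agent_sandbox]`, since the validation rules reject both when `runtime.type = "docker"`.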