# OpenSandbox Architecture

OpenSandbox is a universal sandbox platform designed for AI application scenarios, providing a complete solution with multi-language SDKs, standardized sandbox protocols, and flexible runtime implementations. This document describes the overall architecture and design philosophy of OpenSandbox.

## Architecture Overview

![OpenSandbox Architecture](assets/architecture.svg)

The OpenSandbox architecture consists of four main layers:

1. **SDKs Layer** - Client libraries for interacting with sandboxes
2. **Specs Layer** - OpenAPI specifications defining the protocols
3. **Runtime Layer** - Server implementations managing sandbox lifecycle
4. **Sandbox Instances Layer** - Running sandbox containers with injected execution daemons

## 1. OpenSandbox SDKs

The SDK layer provides high-level abstractions for developers to interact with sandboxes. It handles communication with both the Sandbox Lifecycle API and the Sandbox Execution API.

### Core SDK Components

#### 1.1 Sandbox

The `Sandbox` class is the primary entry point for managing sandbox lifecycle:

- **Create**: Provision new sandbox instances from container images
- **Manage**: Monitor sandbox state, renew expiration, retrieve endpoints
- **Destroy**: Terminate sandbox instances when no longer needed

**Key Features:**
- Async/await support for non-blocking operations
- Automatic state polling for provisioning progress
- Resource quota management (CPU, memory, GPU)
- Metadata and environment variable injection
- TTL-based automatic expiration with renewal
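
The state polling mentioned above can be pictured with a small helper. This is a minimal sketch, not the SDK's implementation: `wait_until_running`, its parameters, and every state name other than `Running` are hypothetical, and `get_state` stands in for whatever call retrieves the sandbox status.

```python
import time
from typing import Callable


def wait_until_running(
    get_state: Callable[[], str],
    timeout_s: float = 60.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_state() until it reports "Running" or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        state = get_state()
        if state == "Running":
            return state
        # "Failed" and "Terminated" are assumed terminal states for illustration.
        if state in ("Failed", "Terminated"):
            raise RuntimeError(f"sandbox entered terminal state {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"sandbox still {state!r} after {timeout_s}s")
        sleep(interval_s)


# Usage with a fake status source that becomes ready on the third poll.
fake_states = iter(["Pending", "Pending", "Running"])
final_state = wait_until_running(lambda: next(fake_states), sleep=lambda _: None)
print(final_state)
```

Passing `sleep` as a parameter keeps the helper testable without real delays.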

#### 1.2 Filesystem

The `Filesystem` component provides comprehensive file operations within sandboxes:

- **CRUD Operations**: Create, read, update, and delete files and directories
- **Bulk Operations**: Upload/download multiple files efficiently
- **Search**: Glob-based file searching with pattern matching
- **Permissions**: Manage file ownership, group, and mode (chmod)
- **Metadata**: Retrieve file info including size, timestamps, permissions

**Use Cases:**
- Uploading code files and dependencies
- Downloading execution results and artifacts
- Managing workspace directories
- Searching for files by pattern

#### 1.3 Commands

The `Commands` component enables shell command execution within sandboxes:

- **Foreground Execution**: Run commands synchronously with real-time output streaming
- **Background Execution**: Launch long-running processes in detached mode
- **Stream Support**: Capture stdout/stderr via Server-Sent Events (SSE)
- **Process Control**: Interrupt running commands via context cancellation
- **Working Directory**: Specify custom working directory for command execution

**Use Cases:**
- Running build commands (e.g., `npm install`, `pip install`)
- Executing system utilities (e.g., `git`, `docker`)
- Starting web servers or services
- Running test suites
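
To make the SSE-based streaming above concrete, here is a minimal sketch of how a client could split an SSE stream into events. It follows the generic Server-Sent Events wire format (`event:` and `data:` fields, blank line as delimiter); the event name `stdout` is an assumption for illustration, not a name taken from the Execution Spec.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from the lines of an SSE stream."""
    event = "message"  # default event name in the SSE format
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Usage: "stdout" is a hypothetical event name.
stream = ["event: stdout\n", "data: hello\n", "\n", ": keep-alive\n", "data: a\n", "data: b\n", "\n"]
for name, payload in parse_sse(stream):
    print(name, repr(payload))
```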

#### 1.4 CodeInterpreter

The `CodeInterpreter` component provides stateful code execution across multiple programming languages:

- **Multi-Language Support**: Python, Java, JavaScript, TypeScript, Go, Bash
- **Session Management**: Maintain execution state across multiple code blocks
- **Jupyter Integration**: Built on Jupyter kernel protocol for robust execution
- **Result Streaming**: Real-time output via SSE with execution counts
- **Error Handling**: Structured error responses with tracebacks

**Key Features:**
- Variable persistence across executions within same session
- Display data in multiple MIME types (text, HTML, images)
- Execution interruption support
- Execution timing and performance metrics

**Use Cases:**
- Interactive coding environments (e.g., Jupyter notebooks)
- AI code generation and execution
- Data analysis and visualization
- Educational coding platforms
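
The idea of variable persistence within a session can be illustrated without a Jupyter kernel. The sketch below is only an analogy: the `Session` class is hypothetical and uses Python's built-in `exec` with a shared namespace, whereas OpenSandbox delegates execution to Jupyter kernels.

```python
import contextlib
import io


class Session:
    """Toy stand-in for a kernel session: one namespace shared by all runs."""

    def __init__(self) -> None:
        self.namespace: dict = {}
        self.execution_count = 0

    def run(self, code: str) -> dict:
        """Execute code in the shared namespace and capture stdout and errors."""
        self.execution_count += 1
        stdout = io.StringIO()
        error = None
        try:
            with contextlib.redirect_stdout(stdout):
                exec(code, self.namespace)
        except Exception as exc:
            # Structured error, loosely modelled on a kernel error reply.
            error = {"ename": type(exc).__name__, "evalue": str(exc)}
        return {
            "execution_count": self.execution_count,
            "stdout": stdout.getvalue(),
            "error": error,
        }


session = Session()
session.run("x = 20")                 # defines x in the session namespace
result = session.run("print(x + 1)")  # a later block still sees x
print(result)
```

A fresh `Session()` would not see `x`, which mirrors the isolation between separate execution contexts.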

### SDK Language Support

OpenSandbox provides SDKs in multiple languages:

- **Python SDK** (`sdks/sandbox/python`, `sdks/code-interpreter/python`)
- **Java/Kotlin SDK** (`sdks/sandbox/kotlin`, `sdks/code-interpreter/kotlin`)
- **TypeScript SDK** (Roadmap)

All SDKs follow the same design patterns and provide consistent APIs across languages.

## 2. OpenSandbox Specs

The Specs layer defines two core OpenAPI specifications that establish the contract between SDKs and runtime implementations.

### 2.1 Sandbox Lifecycle Spec

**File**: `specs/sandbox-lifecycle.yml`

The Lifecycle Spec defines the API for managing sandbox instances throughout their lifecycle.

#### Core Operations

| Operation | Endpoint | Description |
|-----------|----------|-------------|
| **Create** | `POST /sandboxes` | Create a new sandbox from a container image |
| **List** | `GET /sandboxes` | List sandboxes with filtering and pagination |
| **Get** | `GET /sandboxes/{id}` | Retrieve sandbox details and status |
| **Delete** | `DELETE /sandboxes/{id}` | Terminate a sandbox |
| **Pause** | `POST /sandboxes/{id}/pause` | Pause a running sandbox |
| **Resume** | `POST /sandboxes/{id}/resume` | Resume a paused sandbox |
| **Renew** | `POST /sandboxes/{id}/renew-expiration` | Extend sandbox TTL |
| **Endpoint** | `GET /sandboxes/{id}/endpoints/{port}` | Get public URL for a port |
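
As a rough illustration of how the operations in the table map onto HTTP requests, the sketch below builds (but does not send) a few requests with the Python standard library. The base URL, the `X-API-Key` header name, the sandbox ID, and the JSON field names in the create body are all assumptions; the authoritative schema is `specs/sandbox-lifecycle.yml`.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # hypothetical server address
API_KEY_HEADER = "X-API-Key"        # hypothetical header name; check the spec


def lifecycle_request(method, path, api_key, body=None):
    """Build (but do not send) a request for a Lifecycle API operation."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header(API_KEY_HEADER, api_key)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


sandbox_id = "sbx-123"  # placeholder ID for illustration
create = lifecycle_request(
    "POST",
    "/sandboxes",
    "my-key",
    # Field names are assumed, not taken from the spec.
    {"image": "python:3.11", "entrypoint": ["python", "-m", "http.server"]},
)
pause = lifecycle_request("POST", f"/sandboxes/{sandbox_id}/pause", "my-key")
endpoint = lifecycle_request("GET", f"/sandboxes/{sandbox_id}/endpoints/8000", "my-key")

for request in (create, pause, endpoint):
    print(request.get_method(), request.full_url)
```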

### 2.2 Sandbox Execution Spec

**File**: `specs/execd-api.yaml`

The Execution Spec defines the API for interacting with running sandbox instances. This API is implemented by the `execd` daemon injected into each sandbox.

#### API Categories

**Health**
- `GET /ping` - Health check

**Code Interpreting**
- `POST /code/context` - Create execution context
- `POST /code` - Execute code with streaming output
- `DELETE /code` - Interrupt code execution

**Command Execution**
- `POST /command` - Execute shell command
- `DELETE /command` - Interrupt command

**Filesystem**
- `GET /files/info` - Get file metadata
- `DELETE /files` - Remove files
- `POST /files/permissions` - Change permissions
- `POST /files/mv` - Rename/move files
- `GET /files/search` - Search files by glob pattern
- `POST /files/replace` - Replace file content
- `POST /files/upload` - Upload files
- `GET /files/download` - Download files
- `POST /directories` - Create directories
- `DELETE /directories` - Remove directories

**Metrics**
- `GET /metrics` - Get system metrics snapshot
- `GET /metrics/watch` - Stream metrics via SSE

## 3. OpenSandbox Runtime

The Runtime layer implements the Sandbox Lifecycle Spec and manages the orchestration of sandbox containers.

### 3.1 Server Architecture

**Location**: `server/`

The OpenSandbox server is a FastAPI-based service providing:

- **Lifecycle Management**: Create, monitor, pause, resume, and terminate sandboxes
- **Pluggable Runtimes**: Docker (production-ready), Kubernetes (production-ready)
- **Async Provisioning**: Background creation to reduce latency
- **Automatic Expiration**: Configurable TTL with renewal support
- **Access Control**: API key authentication
- **Observability**: Unified status tracking with transition logging

### 3.2 Runtime Implementations

#### Docker Runtime (Ready)

**Features:**
- Direct Docker API integration
- Two networking modes:
  - **Host Mode**: Containers share host network (single instance)
  - **Bridge Mode**: Isolated networking with HTTP routing
- Container lifecycle management
- Resource quota enforcement
- Private registry authentication
- Volume mounting for execd injection
- Automatic cleanup on expiration

**Key Responsibilities:**
1. Pull container images (with auth support)
2. Create containers with resource limits
3. Inject execd binary and start script
4. Monitor container state
5. Handle pause/resume operations
6. Clean up terminated containers

#### Kubernetes Runtime (Ready)

**Features:**
- Built-in **[BatchSandbox](https://github.com/alibaba/OpenSandbox/tree/main/kubernetes)** runtime with sandbox pooling, high-throughput batch creation, and heterogeneous task orchestration; also compatible with **[SIG agent-sandbox](https://github.com/kubernetes-sigs/agent-sandbox)** as an alternative runtime
- Support for different secure container runtimes (e.g., kata-containers, gVisor)
- Helm-based deployment for controller and server, see [documentation](https://github.com/alibaba/OpenSandbox/blob/main/kubernetes/charts/opensandbox/README.md)

**Planned Features:**
- Unified network storage mounting (ossfs, NAS, custom PVC) in both pooled and non-pooled modes
- Pause/resume support

#### Custom Runtime

The pluggable architecture allows implementing custom runtimes by:
1. Implementing the Lifecycle Spec APIs
2. Managing sandbox provisioning and cleanup
3. Injecting execd into sandbox instances
4. Reporting sandbox state transitions
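
The four responsibilities above suggest the shape of a runtime plug-in. The sketch below is purely illustrative: the `Runtime` interface, the `InMemoryRuntime` class, and the state names `Pending` and `Terminated` are hypothetical and are not taken from the server code.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple


class Runtime(ABC):
    """Hypothetical interface mirroring the four responsibilities above."""

    @abstractmethod
    def create(self, image: str, entrypoint: List[str]) -> str:
        """Provision a sandbox, inject execd, and return its ID."""

    @abstractmethod
    def state(self, sandbox_id: str) -> str:
        """Return the current state of a sandbox."""

    @abstractmethod
    def delete(self, sandbox_id: str) -> None:
        """Terminate a sandbox and release its resources."""


class InMemoryRuntime(Runtime):
    """Toy runtime that tracks sandboxes in a dict instead of containers."""

    def __init__(self) -> None:
        self._sandboxes: Dict[str, dict] = {}
        self.transitions: List[Tuple[str, Optional[str], str]] = []

    def _transition(self, sandbox_id: str, new_state: str) -> None:
        # Record (id, old_state, new_state) so transitions can be reported.
        record = self._sandboxes[sandbox_id]
        self.transitions.append((sandbox_id, record["state"], new_state))
        record["state"] = new_state

    def create(self, image: str, entrypoint: List[str]) -> str:
        sandbox_id = f"sbx-{len(self._sandboxes) + 1}"
        self._sandboxes[sandbox_id] = {
            "image": image,
            "entrypoint": entrypoint,
            "state": None,
            "execd_injected": False,
        }
        self._transition(sandbox_id, "Pending")
        self._sandboxes[sandbox_id]["execd_injected"] = True  # stand-in for real injection
        self._transition(sandbox_id, "Running")
        return sandbox_id

    def state(self, sandbox_id: str) -> str:
        return self._sandboxes[sandbox_id]["state"]

    def delete(self, sandbox_id: str) -> None:
        self._transition(sandbox_id, "Terminated")


runtime = InMemoryRuntime()
sid = runtime.create("python:3.11", ["python", "app.py"])
runtime.delete(sid)
print(runtime.transitions)
```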

### 3.3 Networking and Routing

#### Sandbox Router

**Purpose**: Provides HTTP/HTTPS load balancing to sandbox instance ports.

**Features:**
- Dynamic endpoint generation based on sandbox ID and port
- Supports both domain-based and wildcard routing
- Reverse proxy to sandbox container ports
- Automatic cleanup when sandbox terminates

**Endpoint Format**: `{domain}/sandboxes/{sandboxId}/port/{port}`

**Use Cases:**
- Accessing web applications running in sandboxes
- Connecting to development servers (e.g., VS Code Server)
- Exposing APIs and services
- VNC and remote desktop access
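
The endpoint format `{domain}/sandboxes/{sandboxId}/port/{port}` can be generated and parsed with a few lines of code. This is a sketch under assumptions: the function names are hypothetical, and forwarding the remainder of the path to the sandbox is an assumed router behavior rather than something this document specifies.

```python
import re

# Matches /sandboxes/{sandboxId}/port/{port} with an optional trailing path.
_ENDPOINT_PATH = re.compile(
    r"^/sandboxes/(?P<sandbox_id>[^/]+)/port/(?P<port>\d+)(?P<rest>/.*)?$"
)


def sandbox_endpoint(domain: str, sandbox_id: str, port: int) -> str:
    """Format an endpoint following the documented pattern."""
    if not 1 <= port <= 65535:
        raise ValueError(f"invalid port: {port}")
    return f"{domain}/sandboxes/{sandbox_id}/port/{port}"


def parse_endpoint_path(path: str):
    """Split a request path into (sandbox_id, port, remaining_path)."""
    match = _ENDPOINT_PATH.match(path)
    if match is None:
        raise ValueError(f"not a sandbox endpoint path: {path!r}")
    return match["sandbox_id"], int(match["port"]), match["rest"] or "/"


print(sandbox_endpoint("sandbox.example.com", "abc123", 8080))
print(parse_endpoint_path("/sandboxes/abc123/port/8080/api/health"))
```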

## 4. Sandbox Instances

Sandbox instances are running containers that host user workloads with an injected execution daemon.

### 4.1 Container Structure

Each sandbox instance consists of:

1. **Base Container**: User-specified image (e.g., `ubuntu:22.04`, `python:3.11`)
2. **execd Daemon**: Injected execution agent implementing the Execution Spec
3. **Entrypoint Process**: User-defined main process

### 4.2 execd - Execution Daemon

**Location**: `components/execd/`

execd is a Go-based HTTP daemon built on the Beego framework.

#### Core Responsibilities

1. **Code Execution**: Manage Jupyter kernel sessions for multi-language code execution
2. **Command Execution**: Run shell commands with output streaming
3. **File Operations**: Provide filesystem API for remote file management
4. **Metrics Collection**: Monitor and report CPU and memory usage

#### Architecture

**Technology Stack:**
- **Language**: Go 1.24+
- **Web Framework**: Beego
- **Jupyter Integration**: WebSocket-based Jupyter protocol client
- **Streaming**: Server-Sent Events (SSE)

**Package Structure:**
- `pkg/flag/` - Configuration and CLI flags
- `pkg/web/` - HTTP layer (controllers, models, router)
- `pkg/runtime/` - Execution dispatcher
- `pkg/jupyter/` - Jupyter kernel client
- `pkg/util/` - Utilities and helpers

#### Jupyter Integration

execd integrates with the Jupyter Server running inside the container:

1. **Session Management**: Create and maintain kernel sessions
2. **WebSocket Communication**: Real-time bidirectional communication
3. **Message Protocol**: Jupyter message spec implementation
4. **Stream Parsing**: Parse execution results, outputs, errors

**Supported Kernels:**
- Python (IPython)
- Java (IJava)
- JavaScript (IJavaScript)
- TypeScript (ITypeScript)
- Go (gophernotes)
- Bash
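
For readers unfamiliar with the Jupyter message protocol referenced above, the following sketch builds an `execute_request` message of the kind a client sends to a kernel. The field layout follows the public Jupyter messaging specification; the `username` value, the session ID, and the use of a `channel` key for WebSocket transport are assumptions here, and execd's actual Go client may construct messages differently.

```python
import json
import uuid
from datetime import datetime, timezone


def execute_request(code: str, session_id: str) -> dict:
    """Build a Jupyter execute_request message as a plain dict."""
    return {
        "header": {
            "msg_id": uuid.uuid4().hex,  # unique per message
            "session": session_id,
            "username": "execd",         # arbitrary label, assumed here
            "date": datetime.now(timezone.utc).isoformat(),
            "msg_type": "execute_request",
            "version": "5.3",
        },
        "parent_header": {},
        "metadata": {},
        "content": {
            "code": code,
            "silent": False,
            "store_history": True,
            "user_expressions": {},
            "allow_stdin": False,
            "stop_on_error": True,
        },
        "channel": "shell",  # execution requests travel on the shell channel
    }


message = execute_request("print('hello')", session_id="demo-session")
print(json.dumps(message, indent=2))
```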

### 4.3 Injection Mechanism

The execd daemon is injected into sandbox containers during creation:

**Docker Runtime Injection Process:**

1. **Pull execd Image**: Retrieve the execd container image
2. **Extract Binary**: Copy execd binary from image to temporary location
3. **Volume Mount**: Mount execd binary and startup script into target container
4. **Entrypoint Override**: Modify container entrypoint to start execd first
5. **User Process Launch**: execd forks and executes the user's entrypoint

**Startup Sequence:**

```bash
# The container starts with the modified entrypoint, /opt/opensandbox/start.sh.
# The commands below sketch what that script runs, in order.

# Start Jupyter Server in the background
jupyter notebook --port=54321 --no-browser --ip=0.0.0.0 &

# Start the execd daemon in the background, pointing it at Jupyter
/opt/opensandbox/execd --jupyter-host=http://127.0.0.1:54321 --port=44772 &

# Replace the shell with the user entrypoint
exec "${USER_ENTRYPOINT[@]}"
```

**Benefits:**
- Transparent to user code
- No image modification required
- Dynamic injection at runtime
- Works with any base image
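
One way to picture the volume mount and entrypoint override is as a container-create payload. The sketch below uses field names from the Docker Engine API (`Image`, `Entrypoint`, `Cmd`, `HostConfig.Binds`); the helper name, the host directory, and passing the user entrypoint through `Cmd` are assumptions for illustration, not the server's actual behavior.

```python
from typing import Dict, List


def injected_container_config(image: str, user_entrypoint: List[str], execd_dir: str) -> Dict:
    """Build a container-create payload that starts via the injected script."""
    return {
        "Image": image,
        # Override the entrypoint so the startup script runs first.
        "Entrypoint": ["/opt/opensandbox/start.sh"],
        # Assumption: the user's entrypoint is handed to the script as arguments.
        "Cmd": list(user_entrypoint),
        "HostConfig": {
            # Read-only bind mount of the extracted execd binary and script.
            "Binds": [f"{execd_dir}:/opt/opensandbox:ro"],
        },
    }


config = injected_container_config(
    "python:3.11",
    ["python", "app.py"],
    "/var/lib/opensandbox/execd",  # hypothetical host path
)
print(config)
```

Because the image itself is never rebuilt, the same payload shape applies to any base image.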

## 5. Communication Flow

### 5.1 Sandbox Creation Flow

```
User/SDK
   │
   │ 1. POST /sandboxes (image, entrypoint, resources)
   ▼
Server (Lifecycle API)
   │
   │ 2. Pull container image
   │ 3. Inject execd binary
   │ 4. Create container with entrypoint override
   │ 5. Start container
   ▼
Sandbox Instance
   │
   │ 6. Start execd daemon
   │ 7. Start Jupyter Server
   │ 8. Execute user entrypoint
   ▼
Running (State)
```

### 5.2 Code Execution Flow

```
User/SDK
   │
   │ 1. Create sandbox
   │ 2. Get execd endpoint
   ▼
CodeInterpreter SDK
   │
   │ 3. POST /code/context (create session)
   │ 4. POST /code (execute code)
   ▼
execd (Execution API)
   │
   │ 5. Route to Jupyter runtime
   ▼
Jupyter Runtime
   │
   │ 6. WebSocket to Jupyter Server
   │ 7. Send execute_request
   ▼
Jupyter Kernel (Python/Java/etc.)
   │
   │ 8. Execute code
   │ 9. Stream output events
   ▼
execd
   │
   │ 10. Convert to SSE events
   │ 11. Stream to client
   ▼
CodeInterpreter SDK
   │
   │ 12. Parse events
   │ 13. Return result to user
   ▼
User/Application
```
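
Step 10 of the flow above, converting kernel output into SSE events, can be sketched as follows. The Jupyter message types (`stream`, `execute_result`, `error`) and their content fields come from the Jupyter messaging specification; the SSE event names and payload shapes are assumptions, since the real ones are defined by `specs/execd-api.yaml`.

```python
import json
from typing import Optional


def iopub_to_sse(msg: dict) -> Optional[str]:
    """Convert one Jupyter IOPub message into an SSE frame, or None to skip it."""
    msg_type = msg["header"]["msg_type"]
    content = msg["content"]
    if msg_type == "stream":
        # content["name"] is "stdout" or "stderr" in the Jupyter protocol.
        event, payload = content["name"], {"text": content["text"]}
    elif msg_type == "execute_result":
        event = "result"  # hypothetical event name
        payload = {"execution_count": content["execution_count"], "data": content["data"]}
    elif msg_type == "error":
        event = "error"  # hypothetical event name
        payload = {
            "ename": content["ename"],
            "evalue": content["evalue"],
            "traceback": content["traceback"],
        }
    else:
        return None  # e.g. status messages are not forwarded in this sketch
    # json.dumps emits a single line, so one data: field is enough.
    return f"event: {event}\ndata: {json.dumps(payload)}\n\n"


frame = iopub_to_sse(
    {"header": {"msg_type": "stream"}, "content": {"name": "stdout", "text": "hi\n"}}
)
print(frame, end="")
```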

### 5.3 File Operations Flow

```
User/SDK
   │
   │ 1. Upload files
   ▼
Filesystem SDK
   │
   │ 2. POST /files/upload (multipart)
   ▼
execd (Execution API)
   │
   │ 3. Write to filesystem
   │ 4. Set permissions
   ▼
Sandbox Container Filesystem
```
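
The multipart upload in step 2 can be illustrated by encoding a `multipart/form-data` body by hand. This is a minimal sketch using only the standard library; the form field names `path` and `file` are hypothetical, and the real field names are defined by `specs/execd-api.yaml`.

```python
import uuid
from typing import Dict, Tuple


def multipart_body(fields: Dict[str, str], files: Dict[str, Tuple[str, bytes]]) -> Tuple[bytes, str]:
    """Encode text fields and files; return (body, Content-Type header value)."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (
                f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                f"{value}\r\n"
            ).encode("utf-8")
        )
    for name, (filename, content) in files.items():
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing delimiter
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


# Field names "path" and "file" are assumptions for illustration.
body, content_type = multipart_body(
    {"path": "/workspace/main.py"},
    {"file": ("main.py", b"print('hi')\n")},
)
print(content_type)
print(len(body), "bytes")
```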

## 6. Design Principles

### 6.1 Protocol-First Design

- All interactions defined by OpenAPI specifications
- Clear contracts between components
- Enables polyglot implementations
- Supports custom runtime implementations

### 6.2 Separation of Concerns

- **SDK**: Client-side abstraction and convenience
- **Specs**: Protocol definition and documentation
- **Runtime**: Sandbox orchestration and lifecycle
- **execd**: In-sandbox execution and operations

### 6.3 Extensibility

- Pluggable runtime implementations
- Custom sandbox images
- Multiple SDK languages
- Additional Jupyter kernels

### 6.4 Security

- API key authentication for lifecycle operations
- Token-based authentication for execution operations
- Isolated sandbox environments
- Resource quota enforcement
- Network isolation options

### 6.5 Observability

- Structured state transitions
- Real-time metrics streaming
- Comprehensive logging
- Health check endpoints
452  ## 7. Use Cases
453  
454  ### 7.1 AI Code Generation and Execution
455  
456  AI models (like Claude, GPT-4, Gemini) generate code that needs to be executed safely:
457  
458  - **Isolation**: Run untrusted AI-generated code in sandboxes
459  - **Multi-Language**: Support various programming languages
460  - **Iteration**: Maintain state across multiple code generations
461  - **Feedback**: Capture execution results and errors for AI refinement
462  
463  **Examples**: [claude-code](../examples/claude-code/), [gemini-cli](../examples/gemini-cli/), [codex-cli](../examples/codex-cli/)
464  
465  ### 7.2 Interactive Coding Environments
466  
467  Build web-based coding platforms and notebooks:
468  
469  - **Code Execution**: Run code in isolated environments
470  - **File Management**: Upload/download project files
471  - **Terminal Access**: Execute shell commands
472  - **Collaboration**: Share sandbox instances
473  
474  **Examples**: [code-interpreter](../examples/code-interpreter/)
475  
476  ### 7.3 Browser Automation and Testing
477  
478  Automate web browsers for testing and scraping:
479  
480  - **Headless Browsers**: Chrome, Playwright
481  - **Remote Debugging**: DevTools protocol
482  - **VNC Access**: Visual debugging
483  - **Network Isolation**: Controlled environment
484  
485  **Examples**: [chrome](../examples/chrome/), [playwright](../examples/playwright/)
486  
487  ### 7.4 Remote Development Environments
488  
489  Provide cloud-based development workspaces:
490  
491  - **VS Code Server**: Full IDE in browser
492  - **Desktop Environments**: VNC-based desktops
493  - **Tool Pre-installation**: Language runtimes, build tools
494  - **Port Forwarding**: Access development servers
495  
496  **Examples**: [vscode](../examples/vscode/), [desktop](../examples/desktop/)
497  
498  ### 7.5 Continuous Integration and Testing
499  
500  Run build and test pipelines in isolated environments:
501  
502  - **Reproducible Builds**: Consistent container images
503  - **Parallel Execution**: Multiple sandbox instances
504  - **Artifact Collection**: Download build outputs
505  - **Resource Limits**: Prevent resource exhaustion
506  
507  ## 8. Conclusion
508  
509  OpenSandbox provides a complete, production-ready platform for building AI-powered applications that require safe code execution, file management, and command execution in isolated environments. The architecture is designed to be:
510  
511  - **Universal**: Works with any container image
512  - **Extensible**: Pluggable runtimes and custom implementations
513  - **Developer-Friendly**: Multi-language SDKs with consistent APIs
514  - **Production-Ready**: Robust lifecycle management and observability
515  - **Secure**: Isolated environments with access control
516  
517  The protocol-first design ensures that all components can evolve independently while maintaining compatibility. Whether you're building AI coding assistants, interactive notebooks, or remote development environments, OpenSandbox provides the foundation you need.
518  
519  ## 9. References
520  
521  - [Contributing Guide](contributing.md)
522  - [Sandbox Lifecycle Spec](../specs/sandbox-lifecycle.yml)
523  - [Sandbox Execution Spec](../specs/execd-api.yaml)
524  - [Server Documentation](../server/README.md)
525  - [execd Documentation](../components/execd/README.md)
526  - [Python SDK](../sdks/sandbox/python/README.md)
527  - [Java/Kotlin SDK](../sdks/sandbox/kotlin/README.md)
528  - [Examples](../examples/README.md)