High-level overview of our main strategic priorities for 2026. This roadmap is categorized by key themes and highlights both completed and upcoming initiatives for the Open-Source (OSS) Kubernetes Agent Sandbox.
Core platform capabilities, controllers, scheduling engines, and backend interfaces.
- Decouple API from Runtime (Portable Backend)
⏳ In Progress
- SandboxTemplate & SandboxWarmPool Rolling Updates
⏳ In Progress- Support rolling updates on WarmPools and Templates to update sandbox pods without causing downtime or service disruption. [#323]
- 1st Class Router
📅 Planned- Support the sandbox-router as a first-class citizen within the project (written in Go, built with the rest of the project, and shipping with out-of-the-box images).
- Auto Suspend/Resume
📅 Planned- Automatically suspend inactive sandboxes and resume them upon traffic or API invocation.
- Smart Warmpool Selection
⏳ In Progress- Support intelligent warmpool matching and routing based on claim requirements. [#491]
- API Support for Multi-Sandbox per Pod
📅 Planned- Extend API models to support running and managing multiple isolated sandboxes inside a single Pod.
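To make the Smart Warmpool Selection item above more concrete, here is a minimal sketch of matching a claim's requirements against candidate warm pools. Every name in it (`WarmPool`, `Claim`, `select_pool`, and their fields) is hypothetical and chosen for illustration only; it is not the project's API, and the matching rules actually under discussion are tracked in #491.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class WarmPool:
    """Hypothetical view of a warm pool: not the real CRD schema."""
    name: str
    template: str                      # template the pool was warmed from
    ready: int                         # pre-warmed sandboxes available right now
    labels: Dict[str, str] = field(default_factory=dict)


@dataclass
class Claim:
    """Hypothetical view of a claim's requirements."""
    template: str
    required_labels: Dict[str, str] = field(default_factory=dict)


def select_pool(claim: Claim, pools: List[WarmPool]) -> Optional[WarmPool]:
    """Return the matching pool with the most ready sandboxes, or None.

    A pool matches when it has capacity, was warmed from the requested
    template, and carries every label the claim requires.
    """
    candidates = [
        pool for pool in pools
        if pool.ready > 0
        and pool.template == claim.template
        and all(pool.labels.get(k) == v for k, v in claim.required_labels.items())
    ]
    # Prefer the pool with the most headroom so bursts drain pools evenly.
    return max(candidates, key=lambda pool: pool.ready, default=None)
```

Returning `None` when nothing matches leaves the fallback decision (for example, a cold start) to the caller; that split is an assumption of this sketch, not a stated design.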
Developer interfaces, programming language SDKs, and application-level tooling.
- Expand Python SDK Functionality
⏳ In Progress- Natively support high-level convenience methods such as reading/writing files, executing commands (run_code), and interactive tools.
- TypeScript SDK Support
⏳ In Progress- Implement high-level TypeScript SDK support for modern web application frontends.
- Client Interface for SDK Language Alignment
📅 Planned- Establish robust mechanisms and interfaces to minimize divergence across the Python, Go, and TypeScript SDKs.
- Agent Sandbox MCP (Model Context Protocol) Server
📅 Planned- Integrate an MCP server endpoint via the router or SDK, making Agent Sandbox a native tool for MCP-enabled LLM runtimes.
Optimizing the operational footprint, reducing latencies, and lowering cloud/infrastructure costs.
- Extended Benchmarking & Better Performance
📅 Planned- Benchmark large-scale workloads to identify performance bottlenecks, publish guidelines, and optimize controller throughput. Improve the controller to handle 1000+ claims per second.
- Improve Claim Latency (200ms ➔ 100ms ➔ 50ms)
📅 Planned- Analyze critical paths in the controller to reduce end-to-end sandbox assignment latencies down to sub-100ms.
- Scale to Zero
📅 Planned- Suspend inactive sandboxes to conserve underlying resources while maintaining rapid resume paths.
- Measure & Improve TTFI (Time to First Instruction) Latency
📅 Planned- Define benchmarks and optimize the time required from invoking a sandbox to successfully executing the first code instruction.
- Support OpenClaw Price-Performance Targets
⏳ In Progress- Optimize base-image size, runtime overhead, and cold start times to support microVM environments targeting extremely low cost limits.
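As an illustration of how the claim latency targets above (200ms, then 100ms, then 50ms) could be checked against benchmark data, the sketch below summarizes a list of measured latencies and compares the tail against a target. The function name `latency_report` and the choice of p99 as the percentile to compare are assumptions made for this example; the roadmap does not say which percentile the targets refer to.

```python
import statistics
from typing import Dict, List, Union


def latency_report(samples_ms: List[float], target_ms: float) -> Dict[str, Union[float, bool]]:
    """Summarize claim latencies and check the p99 against a target.

    samples_ms: end-to-end claim latencies in milliseconds (at least 2 values).
    target_ms:  latency budget, e.g. 200, 100, or 50.
    """
    # quantiles(n=100) returns 99 cut points: index 49 is p50, index 98 is p99.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    p50, p99 = cuts[49], cuts[98]
    return {
        "p50_ms": p50,
        "p99_ms": p99,
        "meets_target": p99 <= target_ms,  # assumption: targets apply to p99
    }
```

Comparing a tail percentile rather than the mean keeps a handful of slow claims from being hidden by many fast ones, which is usually what matters for interactive agent workloads.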
Advanced ingress/egress isolation, lifecycle state retention, and security controls.
- Sandbox / Pod Identity Association
📅 Planned- Enable dynamic sandbox/pod identity allocation at claim time (especially when provisioning from pre-warmed pools), associating the pod's security principal/identity based on the user/system making the SandboxClaim.
- Network Policy "Attach" at Claim Time
📅 Planned- Dynamically attach L4 and L7 egress/ingress NetworkPolicies at claim time to restrict internet access or allowlist specific FQDNs.
- Storage Customization at Claim Time
📅 Planned
- Strict Sandbox-to-Pod Mapping
⏳ In Progress- Provide bulletproof, deterministic 1-to-1 mappings between a Sandbox claim and its backing Pod. [#127]
- Startup Actions
📅 Planned- Provide options for declarative startup routines, such as immediately pausing or scheduling suspension post-creation. [#58]
- Auto-Deletion of Bursty Sandboxes
📅 Planned- Support automatic time-based or inactivity-based cleanup (TTL) for highly dynamic workloads like RL training.
- Detailed Logs Falco Configuration Extension
📅 Planned- Propagate deep-level container security configurations (e.g., Falco) to enable robust gVisor auditing.
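The time-based and inactivity-based cleanup described under Auto-Deletion of Bursty Sandboxes can be sketched as a single decision function. This is only an illustration under stated assumptions: `should_delete`, its parameters, and the idea of tracking a last-activity timestamp are hypothetical and do not reflect a committed API or field name.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_delete(
    created_at: datetime,
    last_active_at: datetime,
    now: datetime,
    ttl: Optional[timedelta] = None,
    idle_timeout: Optional[timedelta] = None,
) -> bool:
    """Decide whether a sandbox is due for automatic cleanup.

    ttl:          maximum total lifetime since creation (None disables it).
    idle_timeout: maximum time since last activity (None disables it).
    """
    # Time-based rule: the sandbox has outlived its total lifetime budget.
    if ttl is not None and now - created_at >= ttl:
        return True
    # Inactivity-based rule: nothing has touched the sandbox for too long.
    if idle_timeout is not None and now - last_active_at >= idle_timeout:
        return True
    return False
```

Passing `now` explicitly, instead of reading the clock inside the function, keeps the rule deterministic and easy to unit test.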
Audit trails, custom telemetry, reliability, and automated regression testing.
- Alpha to Beta API Versioning
⏳ In Progress- Evolve the existing API schemas from alpha status toward robust beta APIs with deprecation safety.
- Security Fixes
⏳ In Progress- Maintain active patching cycles for third-party dependencies and container base image security.
- CI for PodSnapshot & AgentSandbox Regression Prevention
⏳ In Progress- Introduce robust, isolated continuous integration tests to prevent regression.
- Controller Custom Metrics
⏳ In Progress- Track and expose standard metrics like sandbox creation latencies inside the controller. [#125]
- Additional Prometheus Telemetry
📅 Planned- Expose granular Prometheus counters to monitor API call frequencies, SDK usage, and overall controller performance.
Plugging into the broader AI Agent, reinforcement learning, and LLM framework ecosystem.
- Integration with Ray (RLlib)
⏳ In Progress- Seamless, high-performance container sandboxing for Ray training tasks.
- Integration with Agentic Frameworks
⏳ In Progress- Provide native runtime execution environment plugins for LangChain, CrewAI, OpenEnv, kAgent and other tool-calling systems.
- Expand Sandbox Use Cases
📅 Planned- Add curated base images and setups tailored for interactive browser use-cases, computer-use actions, and terminal shells.
Lowering the barrier to entry with polished guides, interactive tools, and UI dashboards.
- UI Support in OSS
📅 Planned- Build a lightweight open-source web dashboard/UI to visually inspect active sandboxes, warmpools, and templates.
- Publish Benchmarking Methodology & Guides
⏳ In Progress- Share systematic methodologies, configs, and reference results of running large-scale workloads.
- Reference Architectures
📅 Planned- Document production-ready reference designs for multi-user cloud environments.
- Golang SDK Support
✅ Completed- Deliver high-level Go client libraries to programmatically manage sandboxes and route connections. [#227]
- PyPI Distribution (k8s-agent-sandbox)
✅ Completed- Publish the client library to PyPI for seamless installation and usage. [#146]
- Runtime API OTEL/Tracing Instrumentation
✅ Completed- Fully instrument the sandbox runtime APIs using OpenTelemetry/Tracing to aid debugging.
- Metadata Propagation
✅ Completed- Ensure proper transmission of claim-level labels and annotations to underlying sandbox pods. [#173]
- Status Updates
✅ Completed- Properly reflect actual sandbox lifecycle phases (Pending, Ready, Suspended, etc.) within status structures. [#121]
- Integration with HPA & Cold Standby Nodes (CSN)
✅ Completed- Optimize the combination of warmpools with horizontal pod autoscaling and cold standby nodes to drastically reduce idle infrastructure costs.
- Controller Optimization for High-Throughput Claims
✅ Completed- Optimize the controller to handle extreme claim burst throughput (up to 300 sandboxes/second) without resource degradation.
- Suspend / Resume (PVC-based)
✅ Completed- Enable full state suspension preserving PVC storage: when scaled to 0, the PVC is persisted and cleanly attached back when resumed.
- Headless Service Port Handling
✅ Completed- Ensure headless services map configured containerPorts accurately for multi-port routing. [#156]
- Overhaul Documentation
✅ Completed- Restructure the documentation and write comprehensive guides in a clear, professional, developer-oriented style.
- Website Refresh
✅ Completed- Ensure that the public site reflects current API changes, usage examples, and best-practice architectures. [#166]