docs/craft/features/docker-sandbox-backend.md
Main goal: make Onyx Craft package cleanly for docker compose deployments with a containerized sandbox backend, while keeping the implementation as close as practical to the current Kubernetes Craft architecture.
Decision update: Docker authority will live in api_server. The api server will mount the Docker socket and use a DockerSandboxManager directly, analogous to how KubernetesSandboxManager directly talks to the Kubernetes API today.
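A minimal sketch of how the backend choice could be resolved from the environment, assuming the three SANDBOX_BACKEND values this document names (local, kubernetes, docker) and the code default of local. The helper name resolve_sandbox_backend is hypothetical; the real get_sandbox_manager() may be structured differently.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class SandboxBackend(str, Enum):
    LOCAL = "local"            # dev only, no isolation
    KUBERNETES = "kubernetes"  # api_server talks to the Kubernetes API
    DOCKER = "docker"          # api_server talks to the Docker Engine API


def resolve_sandbox_backend(env: Optional[Mapping[str, str]] = None) -> SandboxBackend:
    """Hypothetical helper: map SANDBOX_BACKEND to an enum member.

    Falls back to LOCAL when the variable is unset and raises ValueError
    for unknown values so misconfiguration fails loudly.
    """
    source = os.environ if env is None else env
    raw = source.get("SANDBOX_BACKEND", "local").strip().lower()
    return SandboxBackend(raw)
```

A factory such as get_sandbox_manager() could then branch on the returned member and construct DockerSandboxManager when it is SandboxBackend.DOCKER.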
Research checked:
docker-compose.yml + install.sh for Craft wiring:
- code-interpreter already uses Docker-out-of-Docker by mounting ${DOCKER_SOCK_PATH:-/var/run/docker.sock}.
- ENABLE_CRAFT, but does not set SANDBOX_BACKEND; the code default is local.
- --include-craft selects craft-latest and sets ENABLE_CRAFT=true, but does not provision an isolated Craft sandbox backend.

KubernetesSandboxManager (backend/onyx/server/features/build/sandbox/kubernetes/):
- sandbox container for the agent and a sidecar container for push/snapshot HTTP on port 8731.

- local, so the agent runs inside the api_server container/process boundary.
- --include-craft.

Direct Docker manager in api_server:
api_server
- SandboxBackend.DOCKER
- DockerSandboxManager
- mounts /var/run/docker.sock
|
| Docker Engine API
v
sandbox-<id8>
- image: onyxdotapp/sandbox:<tag>
- one container per user/sandbox
- named volume mounted at /workspace/sessions
- no Docker socket
- no S3/MinIO credentials
- K8s-equivalent resource defaults for now
This is the closest docker-compose equivalent to Kubernetes:
Kubernetes:
api_server -> Kubernetes API -> sandbox pod
Docker compose:
api_server -> Docker Engine API -> sandbox container
The initial Docker implementation should use one sandbox container per user, not one container per session. This matches the K8s model, where one sandbox pod contains /workspace/sessions/<session_id> directories for multiple sessions.
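As an illustration of the one-container-per-user model, the sketch below maps a sandbox ID to a container name and a session ID to its directory inside that container. It assumes that `<id8>` means the first eight hex characters of the sandbox UUID; both helper names are hypothetical.

```python
from pathlib import PurePosixPath
from uuid import UUID

# Mount point shared by every session that belongs to one sandbox.
SESSIONS_ROOT = PurePosixPath("/workspace/sessions")


def sandbox_container_name(sandbox_id: UUID) -> str:
    # Assumption: <id8> is the first 8 hex characters of the sandbox UUID.
    return f"sandbox-{str(sandbox_id)[:8]}"


def session_workspace_path(session_id: UUID) -> PurePosixPath:
    # Each session gets its own directory inside the single sandbox container.
    return SESSIONS_ROOT / str(session_id)
```

Several sessions for the same user therefore resolve to the same container name but different workspace paths.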
For Docker V1, do not require a per-sandbox sidecar unless implementation proves it is materially simpler.
Preferred V1:
api_server -> Docker exec / Docker API -> sandbox container
Why:
Keep the design compatible with adding sidecars later:
- sandbox-<id8>-sidecar.

api_server:
- ${DOCKER_SOCK_PATH:-/var/run/docker.sock}:/var/run/docker.sock when Craft is enabled.
- SANDBOX_BACKEND=docker.
- SANDBOX_CONTAINER_IMAGE=onyxdotapp/sandbox:<tag>.

background:
- SANDBOX_BACKEND=docker config if celery tasks instantiate get_sandbox_manager() for idle cleanup/snapshotting.

code-interpreter:

install.sh:
- --include-craft selects craft-latest, sets ENABLE_CRAFT=true, sets SANDBOX_BACKEND=docker, and ensures Docker socket env guidance exists.
- CRAFT_ALLOW_UNBLOCKED_IMDS=true is explicitly set.

Keep current K8s-style sandbox expectations for now:
- /workspace/sessions

Recommended envs for future tuning, even if defaults match K8s:
- SANDBOX_DOCKER_CPU_LIMIT
- SANDBOX_DOCKER_MEMORY_LIMIT
- SANDBOX_DOCKER_NETWORK
- SANDBOX_DOCKER_VOLUME_PREFIX
- SANDBOX_DOCKER_BLOCK_IMDS

V1 should prioritize a clear separation between Onyx services and agent containers.
Recommended Docker network shape:
- onyx_craft_sandbox.

Important security note:
- Sandbox containers may be able to reach the instance metadata endpoint (169.254.169.254) unless blocked at the host or Docker daemon/network layer.
- If relying on DOCKER-USER rules, they must cover every Docker bridge that sandbox traffic can use, and verification must run from inside an actual sandbox container.

Use api_server-owned FileStore streaming for Docker V1:
DockerSandboxManager.create_snapshot(...)
- docker exec runs tar inside the sandbox container.
- FileStore.save_file(...).

DockerSandboxManager.restore_snapshot(...)
- tar -x inside the sandbox container.

Do not use aws s3 cp inside the sandbox agent for Docker V1. That would require storage credentials near the untrusted workload and would only support S3-like stores.
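The streaming flow can be sketched without a Docker daemon: a tar stream is produced from the workspace, handed to a file store by the trusted side, and later streamed back for extraction. In this sketch InMemoryFileStore is a hypothetical stand-in for FileStore (not the Onyx API), tar_directory and untar_stream stand in for the docker exec tar calls, and the storage path layout is an assumption.

```python
import io
import shutil
import tarfile
from pathlib import Path
from typing import BinaryIO, Dict


class InMemoryFileStore:
    """Hypothetical stand-in for FileStore; holds blobs in a dict."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def save_file(self, storage_path: str, content: BinaryIO) -> None:
        self._blobs[storage_path] = content.read()

    def read_file(self, storage_path: str) -> BinaryIO:
        return io.BytesIO(self._blobs[storage_path])


def tar_directory(root: Path) -> BinaryIO:
    """Stand-in for running tar inside the sandbox: gzip-tar `root` into a stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as tar:
        tar.add(str(root), arcname=".")
    buf.seek(0)
    return buf


def untar_stream(stream: BinaryIO, dest: Path) -> None:
    """Stand-in for running tar -x inside the sandbox."""
    with tarfile.open(fileobj=stream, mode="r:gz") as tar:
        tar.extractall(str(dest))


def create_snapshot_from_stream(
    store: InMemoryFileStore, stream: BinaryIO, sandbox_id: str, tenant_id: str
) -> str:
    # Assumed storage path layout; the real SnapshotManager may differ.
    storage_path = f"{tenant_id}/{sandbox_id}/snapshot.tar.gz"
    store.save_file(storage_path, stream)
    return storage_path


def restore_snapshot_to_stream(
    store: InMemoryFileStore, storage_path: str, write_stream: BinaryIO
) -> None:
    # Copy stored bytes to the caller's stream; storage credentials stay on this side.
    shutil.copyfileobj(store.read_file(storage_path), write_stream)
```

The point of this shape is that only archive bytes cross into the sandbox; the store and its credentials never leave the trusted process.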
The existing SnapshotManager needs stream helpers:
- create_snapshot_from_stream(stream, sandbox_id, tenant_id, size_hint=None)
- restore_snapshot_to_stream(storage_path, write_stream)

- Add SandboxBackend.DOCKER.
- Add list_session_workspaces(sandbox_id) -> list[UUID] to SandboxManager.
- Move _list_session_directories logic out of sandbox/tasks/tasks.py and onto KubernetesSandboxManager.
- Update cleanup_idle_sandboxes_task so it works for K8s and Docker; local remains cleanup-disabled.
- Add SnapshotManager stream helpers.

New module:
- backend/onyx/server/features/build/sandbox/docker/docker_sandbox_manager.py
- backend/onyx/server/features/build/sandbox/docker/internal/acp_exec_client.py
- backend/onyx/server/features/build/sandbox/docker/internal/exec_helpers.py

Manager responsibilities:
- docker.from_env()
- onyx.app/component=craft-sandbox
- onyx.app/sandbox-id=<uuid>
- onyx.app/tenant-id=<tenant>
- onyx.app/user-id=<uuid>
- --security-opt no-new-privileges
- --cap-drop ALL
- --user 1000:1000

deployment/docker_compose/docker-compose.yml:
- api_server when Craft is enabled
- background if idle cleanup runs there
- SANDBOX_BACKEND=${SANDBOX_BACKEND:-docker} when Craft is enabled
- SANDBOX_CONTAINER_IMAGE

deployment/docker_compose/env.template:
- SANDBOX_BACKEND=docker

deployment/docker_compose/install.sh:
- --include-craft writes SANDBOX_BACKEND=docker

- local: dev only, no isolation
- kubernetes: Helm/cloud, api_server talks to K8s
- docker: docker-compose, api_server talks to Docker Engine

Testing is a first-class part of this project. The Docker backend should not be merged as "manual smoke only"; it needs unit coverage for selection/refactor behavior and external-dependency-unit coverage against a real Docker daemon.
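One way to make the manager's container configuration unit-testable without a daemon is to build the run arguments in a pure function. The sketch below assembles the labels and hardening options listed under the manager responsibilities into keyword arguments for the Docker SDK for Python. The function name, the volume prefix, the `<id8>` derivation, and the CPU/memory defaults are assumptions for illustration only.

```python
from typing import Any, Dict
from uuid import UUID

SESSIONS_MOUNT = "/workspace/sessions"


def build_sandbox_container_kwargs(
    sandbox_id: UUID,
    tenant_id: str,
    user_id: UUID,
    image: str,
    network: str = "onyx_craft_sandbox",
    cpu_limit: float = 1.0,                     # assumed placeholder default
    memory_limit: str = "2g",                   # assumed placeholder default
    volume_prefix: str = "onyx-craft-sandbox",  # hypothetical prefix
) -> Dict[str, Any]:
    # Assumption: <id8> is the first 8 hex characters of the sandbox UUID.
    id8 = str(sandbox_id)[:8]
    return {
        "image": image,
        "name": f"sandbox-{id8}",
        "detach": True,
        "user": "1000:1000",
        "cap_drop": ["ALL"],
        "security_opt": ["no-new-privileges"],
        "nano_cpus": int(cpu_limit * 1_000_000_000),
        "mem_limit": memory_limit,
        "network": network,
        "labels": {
            "onyx.app/component": "craft-sandbox",
            "onyx.app/sandbox-id": str(sandbox_id),
            "onyx.app/tenant-id": tenant_id,
            "onyx.app/user-id": str(user_id),
        },
        # Only the named workspace volume is mounted: no Docker socket,
        # and no environment block carrying storage credentials.
        "volumes": {
            f"{volume_prefix}-{id8}": {"bind": SESSIONS_MOUNT, "mode": "rw"}
        },
    }
```

With the Docker SDK for Python, the result could be passed as `docker.from_env().containers.run(**kwargs)`; a unit test can assert on the dictionary alone.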
Target location:
- backend/tests/unit/onyx/server/features/build/sandbox/

Tests to add/update:
- test_sandbox_backend_selection.py
  - SANDBOX_BACKEND=SandboxBackend.DOCKER
  - get_sandbox_manager() returns DockerSandboxManager
- test_idle_cleanup_backend_abstraction.py
  - SandboxManager
  - cleanup_idle_sandboxes_task calls list_session_workspaces() rather than the deleted K8s-only helper
- test_snapshot_manager_streams.py
  - FileStore
  - create_snapshot_from_stream(...) saves bytes with FileOrigin.SANDBOX_SNAPSHOT
  - restore_snapshot_to_stream(...) writes the stored bytes to the provided stream/writer
- test_docker_manager_config.py
  - cap_drop, no-new-privileges, user=1000:1000, memory/CPU settings, labels, and no Docker socket mount
- test_docker_acp_exec_client.py
Target location:
- backend/tests/external_dependency_unit/craft/test_docker_sandbox.py
- backend/tests/external_dependency_unit/craft/docker/

Skip policy:
- SANDBOX_BACKEND=docker or an explicit test env like RUN_DOCKER_SANDBOX_TESTS=true is set.

Required fixtures:
- LLMProviderConfig fixture.

Lifecycle tests:
- test_provision_creates_container_volume_and_network
  - provision()
  - /workspace/sessions
- test_provision_is_idempotent
  - provision() twice
- test_terminate_removes_container_and_volume
Workspace and file tests:
- test_setup_session_workspace_creates_expected_layout
  - find /workspace/sessions/<session_id> or use manager read/list APIs
  - outputs, attachments, .opencode/skills, AGENTS.md, and opencode.json exist
- test_file_operations_round_trip
- test_list_session_workspaces_filters_uuid_dirs
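The UUID filtering that test_list_session_workspaces_filters_uuid_dirs is meant to cover might look like the sketch below. It assumes session directories are named with the canonical hyphenated UUID string; the helper name parse_session_workspaces is hypothetical, and the real list_session_workspaces(sandbox_id) would first obtain the directory names from the sandbox.

```python
from typing import Iterable, List
from uuid import UUID


def parse_session_workspaces(dir_names: Iterable[str]) -> List[UUID]:
    """Keep only directory names that are canonical UUID strings."""
    sessions: List[UUID] = []
    for name in dir_names:
        try:
            parsed = UUID(name)
        except ValueError:
            continue  # not a UUID at all, e.g. "node_modules"
        # UUID() also accepts braces and un-hyphenated hex; require the
        # canonical form so stray directories are not treated as sessions.
        if str(parsed) == name.lower():
            sessions.append(parsed)
    return sessions
```

Entries such as caches or tool directories under /workspace/sessions are skipped rather than raising, so idle cleanup only iterates real sessions.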
ACP and preview tests:
- test_acp_exec_smoke
  - DockerACPExecClient
- test_nextjs_preview_path
Snapshot tests:
- test_snapshot_round_trip
  - outputs and attachments
  - create_snapshot()
  - restore_snapshot()
- test_snapshot_does_not_include_generated_runtime_state
Security/isolation tests:
- test_sandbox_container_has_no_docker_socket_or_storage_env
  - /var/run/docker.sock
  - S3_AWS_ACCESS_KEY_ID, S3_AWS_SECRET_ACCESS_KEY, MINIO_ROOT_PASSWORD, or equivalent FileStore secrets
- test_sandbox_container_security_options
- test_network_isolation_from_compose_services
  - curl compose service names such as api_server, relational_db, cache, and minio
- test_imds_blocking_when_enabled
  - 169.254.169.254 is unreachable from inside the sandbox when SANDBOX_DOCKER_BLOCK_IMDS=true

Manual EC2 smoke:
- curl -fsSL https://onyx.app/install_onyx.sh | bash -s -- --include-craft.
- api_server has Docker socket mount.
- /api/build/sessions/{id}/webapp.

- DockerSandboxManager and Docker external-dependency tests.
- --include-craft.

| PR / commit | Title | Status |
|---|---|---|
| #11218 (1fce3ba78b) | feat(craft): backend-agnostic sandbox cleanup + snapshot stream helpers | merged to main |
| d08a5ee078 | feat(craft): docker-compose sandbox backend (manager + compose/install/env wiring) | open on docker-compose-2 (#11222) |
| d881a81a03 | refactor(craft): simplify exec helpers and sandbox manager | open in same stack |
| 1465bd7a61 | chore(craft): drop CRAFT_ALLOW_UNBLOCKED_IMDS opt-out | open in same stack |
| e1d5bd22db | chore(craft): remove SANDBOX_DOCKER_BLOCK_IMDS | open in same stack |
| d94321e924 | refactor(craft): shared ACPExecClient base across K8s + Docker | open on docker-compose-3 (#11225); tracked separately in shared-acp-exec-client.md |
| 2c49919b10 | docs(craft): document docker sandbox backend | open on docker-compose-2b |
PR 3 (compose/install/env wiring) shipped inside the same PR as PR 2 (d08a5ee078) rather than as a separate PR. PR 4 split into the IMDS-guard simplification commits (1465bd7a61 + e1d5bd22db) and the docs commit (2c49919b10).
- SANDBOX_DOCKER_BLOCK_IMDS env was removed. The plan listed it under "Docker Resource Model" as a runtime knob. In practice an app-level Docker bridge address block is unreliable (Docker manages user-defined network routing), so the env var was removed (e1d5bd22db). The only IMDS defense is the host-level DOCKER-USER iptables rule that install.sh --include-craft best-effort installs on EC2.
- CRAFT_ALLOW_UNBLOCKED_IMDS opt-out was dropped (1465bd7a61). Install no longer hard-fails on EC2 when iptables is unavailable; it logs a clear warning with the manual command and continues. This matches the plan's Open Decision 4 recommendation ("guard first") but lands the guard as best-effort rather than blocking.
- 1 CPU / 2 GiB, matching K8s requests (not the K8s 2 CPU / 10 GiB limits), because single-VM compose deployments cannot over-commit every sandbox.
- background gets the same socket mount and onyx_craft_sandbox network attachment as api_server.
- api_server and background attach to default plus onyx_craft_sandbox, so the Next.js preview reaches sandboxes by container DNS / IP on the bridge.
- SANDBOX_API_SERVER_URL is required to be the public HTTPS URL, not a compose hostname; this is documented in env.template and sandbox/README.md. The agent cannot resolve api_server by DNS from inside the sandbox bridge.
- test_sandbox_backend_selection.py / test_idle_cleanup_backend_abstraction.py / test_snapshot_manager_streams.py / test_docker_manager_config.py / test_docker_acp_exec_client.py were implemented as part of PR #11218 and d08a5ee078.
- ## Tests → Manual EC2 smoke block remains the checklist.
- docker-compose-2 and docker-compose-3 PR stack.