docs/craft/features/shared-acp-exec-client.md
backend/onyx/server/features/build/sandbox/kubernetes/internal/acp_exec_client.py
(783 lines) and
backend/onyx/server/features/build/sandbox/docker/internal/acp_exec_client.py
(603 lines) are ~95% structurally identical. The ACP JSON-RPC protocol code,
state management, send_message loop, session lifecycle, and event
dispatch are byte-for-byte the same between them. The only real
divergence is the transport: K8s uses the kubernetes-client
WebSocket exec stream; Docker uses a raw multiplexed Docker exec socket.
Today, every fix or behavior change to the ACP protocol layer has to be
duplicated across two files. The current PR (docker-compose-2, #11222)
already had to land an SSEKeepalive-move fix and a _recv_exact
timeout fix; both would have been single-file changes against a shared
base. Extracting the shared protocol code into an ACPExecClientBase
abstract class reduces the long-term cost of every future ACP change
and shrinks the codebase by ~700 lines.
This is the natural follow-up to docker-compose-2 and lives on its own
PR (docker-compose-3) so the K8s code motion is reviewed
independently of the Docker functional work in the prior PR.
Duplicated ACP protocol code across K8s and Docker exec clients.
The protocol-level methods (_send_request, _send_notification,
_wait_for_response, _initialize, _create_session,
_list_sessions, _resume_session, _try_resume_existing_session,
resume_or_create_session, send_message,
_process_session_update, cancel, __enter__, __exit__) are
identical between the two clients except for transport details.
Duplicated state management. Both clients independently define
ACPSession, ACPClientState, the ACPEvent union, the
DEFAULT_CLIENT_INFO dict, ACP_PROTOCOL_VERSION, and the
reader-thread + response-queue lifecycle.
Future drift risk. When K8s and Docker fix the ACP packet-loss issue logged in the previous PR (or any future protocol fix), it would have to land in two files without a base class.
backend/onyx/server/features/build/sandbox/acp/base.py.
This is a new top-level subdirectory under sandbox/.SSEKeepalive is already shared in sandbox/base.py from
docker-compose-2's earlier fix. Leave it there; the new
ACPExecClientBase imports it. (Don't move it again — both K8s and
Docker exec clients already re-export it for back-compat with any
external imports.)health_check() method (runs echo ok via a
fresh exec) is K8s-specific and stays on the subclass. Docker has no
equivalent and doesn't need one._get_k8s_client (K8s-only) and _recv_exact / frame parser
(Docker-only) stay on their respective subclasses.logger prefixes differ ([ACP] vs [DOCKER-ACP]) and the
packet_logger context= argument differs ("k8s" vs "docker").
The base class takes a log_prefix and log_context (or wraps them
into a single transport_name field that derives both) to preserve
identical log output on each backend.backend/onyx/server/features/build/sandbox/acp/Two files:
__init__.py (empty).base.py:
ACP_PROTOCOL_VERSION = 1, DEFAULT_CLIENT_INFO (parametrize "name" via subclass override since K8s and Docker today report different client names — verify whether opencode actually uses these; if it ignores them, unify to a single value).ACPSession, ACPClientState.ACPEvent type alias (the schema union).ACPExecClientBase(ABC) with the shared protocol implementation.Five abstract methods (minimum needed to cover the transport divergence):
class ACPExecClientBase(ABC):
transport_name: ClassVar[str] # "k8s" or "docker" — drives log_prefix + packet_logger context
@abstractmethod
def _open_transport(self, cwd: str) -> None: ...
@abstractmethod
def _close_transport(self) -> None: ...
@abstractmethod
def _is_transport_open(self) -> bool: ...
@abstractmethod
def _write_line(self, line: str) -> None:
"""Write one already-newline-terminated JSON-RPC line to the transport."""
@abstractmethod
def _read_responses_loop(self) -> None:
"""Long-running reader. Pulls from the transport, parses JSON lines,
calls self._enqueue_message(msg) for each. Respects self._stop_reader."""
start(cwd, timeout):
self._open_transport(cwd) (subclass-specific).self._stop_reader.self._read_responses_loop.time.sleep(0.5) to let opencode boot.self._initialize(timeout).self.stop() then re-raise.stop():
self._stop_reader.self._close_transport().self._state = ACPClientState()._enqueue_message(msg) (helper for subclass readers):
packet_logger.log_jsonrpc_raw_message("IN", msg, context=self.transport_name).self._response_queue.__enter__ / __exit__ shared.
These move verbatim from either current implementation, with two mechanical substitutions:
self._ws_client.write_stdin(...) and self._socket.sendall(...)
→ self._write_line(...).self._ws_client.is_open() and self._socket is None checks
→ self._is_transport_open().Methods:
_get_next_id_send_request_send_notification_wait_for_response_initialize_create_session_list_sessions_resume_session_try_resume_existing_sessionresume_or_create_sessionsend_message_process_session_update_send_error_responsecancelis_running (returns self._is_transport_open())Log prefixes use self.transport_name: [%s-ACP] uppercased, so K8s
sees [K8S-ACP] and Docker sees [DOCKER-ACP]. (Small visual change
from [ACP] → [K8S-ACP] — worth flagging in the PR description.)
kubernetes/internal/acp_exec_client.py)Drops to ~150 lines. Keeps:
__init__ taking pod_name, namespace, container,
client_info, client_capabilities. Calls super().__init__(...)._get_k8s_client (lazy K8s API client)._open_transport: builds the XDG_DATA_HOME=... exec opencode acp --cwd ... command, calls k8s_stream(connect_get_namespaced_pod_exec, ...),
stores self._ws_client._close_transport: self._ws_client.close(), null it._is_transport_open: self._ws_client is not None and self._ws_client.is_open()._write_line: self._ws_client.write_stdin(line)._read_responses_loop: the existing K8s reader (uses
ws_client.update/read_stdout/read_stderr), with the body of the
inner try block replaced by self._enqueue_message(message).health_check (K8s-only, kept).transport_name = "k8s".docker/internal/acp_exec_client.py)Drops to ~120 lines. Keeps:
__init__ taking docker_client, container_name, user,
client_info, client_capabilities. Calls super().__init__(...)._open_transport: exec_create + exec_start(socket=True) →
_unwrap_socket → set 0.5s socket timeout. Store self._socket._close_transport: shutdown(RDWR) + close, null the socket._is_transport_open: self._socket is not None._write_line: self._socket.sendall(line.encode("utf-8")) with the
existing _socket_lock._read_responses_loop: the existing Docker reader (frame parser via
_recv_exact), with the body replaced by self._enqueue_message(message)._recv_exact (kept — Docker-specific).transport_name = "docker".The currently-shared _FRAME_HEADER_BYTES, _FRAME_STDOUT, _FRAME_STDERR
constants are already re-exported from exec_helpers.py; nothing to
move.
backend/onyx/server/features/build/sandbox/acp/__init__.py
backend/onyx/server/features/build/sandbox/acp/base.pybackend/onyx/server/features/build/sandbox/kubernetes/internal/acp_exec_client.py
backend/onyx/server/features/build/sandbox/docker/internal/acp_exec_client.pybackend/onyx/server/features/build/session/manager.py (imports
SSEKeepalive from sandbox.base — unchanged).
backend/onyx/server/features/build/sandbox/docker/docker_sandbox_manager.py
(imports DockerACPExecClient, ACPEvent — both still exported from
the same module).
backend/onyx/server/features/build/sandbox/kubernetes/kubernetes_sandbox_manager.py
(imports ACPExecClient, ACPEvent — both still exported).This is pure code motion — no new behavior to test. Verification is:
backend/tests/unit/onyx/server/features/build/sandbox/test_docker_acp_exec_client.py
(which exercises Docker's start + initialize round-trip via a fake
framed socket and asserts is_running flips correctly on stop).backend/tests/unit/onyx/server/features/build/sandbox/ to confirm
no K8s-specific assertions regress.backend/scripts/manual_test_docker_sandbox.py to confirm start,
initialize, and stop lifecycle still work end-to-end against a
real Docker daemon. We don't have an equivalent K8s smoke script —
rely on code review for the K8s side.No new tests required. If desired, a small unit test for the base
class's _send_request / _wait_for_response against a fake
transport could be added, but it would duplicate the framing test that
test_docker_acp_exec_client.py already provides.
docker-compose-2 via
ez create docker-compose-3 --from docker-compose-2.refactor(craft): shared ACPExecClient base across K8s + Docker.Landed as d94321e924 refactor(craft): shared ACPExecClient base across K8s + Docker on docker-compose-3 (PR #11225, open).
| File | Plan estimate | Actual |
|---|---|---|
sandbox/acp/base.py (new) | — | 639 |
kubernetes/internal/acp_exec_client.py | ~150 | 220 |
docker/internal/acp_exec_client.py | ~120 | 237 |
| Total across 3 files | ~870 | 1096 |
| Total before refactor (2 files) | 1386 | 1386 |
| Net reduction | ~510 | ~290 |
The reduction was smaller than the plan estimated because more transport-adjacent helpers (frame parsing context, packet logging, reader-loop scaffolding) stayed on the subclasses than expected. Still a meaningful win — every future ACP protocol fix is now a one-file change.
SSEKeepalive stayed in sandbox/base.py as the plan required — no second migration.[ACP] → [K8S-ACP] is live on the K8s path. Worth flagging in any review of K8s log diffs.test_docker_acp_exec_client.py + ty/ruff + manual Docker smoke. Manual K8s smoke was not run; review-only verification.docker-compose-3 (#11225) once docker-compose-2 (#11222) lands.