plugins/_a0_connector/skills/host-computer-use/SKILL.md
This skill unlocks the beta computer_use_remote tool for connected local desktop control through A0 CLI.
Load this skill before using computer_use_remote for local desktop and native UI tasks on the connected machine. Use it for the user's real host screen, not the internal Agent Zero Desktop.
If the task is browser-only and the user is flexible, prefer direct browser tooling because it is usually more reliable and token-efficient than screenshot-driven desktop control.
If the task needs shell execution on the CLI host, load host-code-execution separately rather than treating desktop control and shell execution as one affordance.
This skill controls the user's connected host/local computer through A0 CLI. It is not the built-in Linux Desktop/Xpra skill.
Never switch to linux-desktop, the Agent Zero Desktop/Xpra surface, desktopctl.sh, code_execution_tool, or Docker/server shell commands as a fallback for host screen actions such as screenshots, clicking, typing, desktop state changes, or checking visible host UI. Those paths only see the internal Agent Zero runtime. If computer_use_remote is unavailable, disabled, or needs re-arming, stop and ask the user to run /computer-use on in the A0 CLI and approve the platform permission prompt.
- For web-page tasks, use the browser tool. The Browser plugin chooses the Docker or A0 CLI host-browser runtime from Browser settings and can surface Chrome remote-debugging setup.
- Do not use computer_use_remote for web-page navigation just because the phrase "host browser" appears. Use this skill only for desktop/browser-chrome tasks that the browser tool cannot express.
- If host-browser setup is needed, open chrome://inspect/#remote-debugging, enable "Allow remote debugging for this browser instance", run /browser host on, and retry.
- Do not use code_execution_remote, xdg-open, sensible-browser, or Python webbrowser.open for host-browser control. Those can launch pages without giving Agent Zero browser control or setup diagnostics.

Use:
{
  "tool_name": "computer_use_remote",
  "tool_args": {
    "action": "start_session"
  }
}
Arguments:
- action: start_session, status, capture, move, click, scroll, key, type, stop_session
- session_id: optional after start_session
- move: x, y normalized to [0,1]
- click: optional x, y, optional button (left, right, middle), optional count
- scroll: dx, dy
- key: key or keys
- type: text, optional submit boolean

Availability, backend support, and trust mode are checked when the tool runs. If no CLI is connected or local computer use is disabled, tell the user what to enable instead of using the server environment.
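
As an illustration of the argument shapes listed above, a follow-up call after start_session might look like the sketch below. The session_id value is a placeholder for whatever start_session returned, and the coordinates are arbitrary values in the normalized [0,1] range. This is an assumed example, not output from a real session.

```json
{
  "tool_name": "computer_use_remote",
  "tool_args": {
    "action": "move",
    "session_id": "SESSION_ID_FROM_START_SESSION",
    "x": 0.5,
    "y": 0.25
  }
}
```

session_id is listed as optional, so the same call may also be sent without it.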
If any tool result contains COMPUTER_USE_REARM_REQUIRED or status=rearm required, stop the computer-use sequence immediately. Do not retry start_session, do not call capture, and do not use shell, vision, or screenshot fallbacks to bypass it. Tell the user that the A0 CLI has Computer Use configured but the installed desktop-control backend is not armed; they should run /computer-use on in the A0 CLI and approve the platform permission prompt if shown.
- Call start_session first.
- Read backend_id, backend_family, and features from the result; load a backend-specific Computer Use skill when the task needs backend-only affordances.
- Use status for state without starting a session.
- Use capture only when you need another screenshot without taking an action.
- If features include atspi-tree-snapshot / atspi-structural-targeting, load host-computer-use-linux before using Linux AT-SPI structural actions.
- If features include accessibility-tree-snapshot / accessibility-structural-targeting, load host-computer-use-macos before using macOS structural Accessibility actions.
- If features include uia-tree-snapshot / uia-structural-targeting, load host-computer-use-windows before using Windows UI Automation structural actions.
- On Linux backends, follow host-computer-use-linux; do not apply macOS AX-specific assumptions unless the backend is macOS.
- Prefer key and type over pointer actions whenever a reliable keyboard path exists.
- A type tool result only confirms keystrokes were sent. It is not evidence that the text landed in the intended application.
- Use capture to verify before repeating the same action.
- Use type(..., submit=true) only for URL or navigation-style entry where Enter should fire immediately after typing.
- Do not use submit=true for ordinary text fields. Type first, then send enter separately if needed.
- For scrolling, prefer keyboard keys such as page_down, page_up, space, shift+space, arrows, home, or end.
- Use scroll when the desired pane is already active or keyboard scrolling cannot target it.
- Treat move and click as last-resort actions for controls that cannot be reached through backend-specific structural targeting, keyboard, browser, or app-native tooling.
- If the user says stop, pause, abort, hold, don't continue, or equivalent, halt immediately and do not use computer-use tools again until the user explicitly resumes.
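
The guidance on ordinary text fields can be sketched as two separate calls: type the text without submit, then send enter on its own. The text value is a placeholder, and the key name enter is assumed to follow the same naming as the keys listed above.

```json
{
  "tool_name": "computer_use_remote",
  "tool_args": {
    "action": "type",
    "text": "example text"
  }
}
```

```json
{
  "tool_name": "computer_use_remote",
  "tool_args": {
    "action": "key",
    "key": "enter"
  }
}
```

Since a type result only confirms that keystrokes were sent, a capture call afterwards is one way to check where the text landed before repeating anything.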