plugins/_a0_connector/skills/host-computer-use-linux/SKILL.md
Use this after host-computer-use when the connected A0 CLI reports the Linux/Wayland computer-use backend.
Do not use this skill for macOS, Windows, Xpra, Docker, browser-only tasks, or the internal Agent Zero Desktop. If the backend is not Linux or does not advertise AT-SPI support, skip Linux structural actions and follow the generic host computer-use rules.
Linux backends can advertise structural AT-SPI features:
atspi-tree-snapshotatspi-structural-targetingatspi-element-actionatspi-set-valueWhen these features are present, prefer structural targeting over pixel clicks for named buttons, menu items, text fields, dialogs, toolbar items, tab strips, and application windows.
Use ax_snapshot to inspect the Linux AT-SPI tree:
{
"tool_name": "computer_use_remote",
"tool_args": {
"action": "ax_snapshot",
"max_depth": 4,
"max_nodes": 200
}
}
The snapshot returns paths, roles, names/titles, descriptions, frames, states, actions, text previews, values, and child nodes. Use it to choose a target, not as final visual proof.
Use ax_action for structural actions:
{
"tool_name": "computer_use_remote",
"tool_args": {
"action": "ax_action",
"target": {
"role": "push button",
"title": "OK"
},
"operation": "press"
}
}
Supported operations are:
press: activate a button, menu item, tab, checkbox, or similar action-bearing nodefocus: focus a focusable node before typing or keyboard inputset_value: set text/value on editable nodes; pass value or textTargeting options:
target when a node has a stable role plus title/name/description/text/state/action.path returned by the latest ax_snapshot only while the UI is unchanged.Use screenshots for proof after every state-changing action. AT-SPI actions and keyboard events are attempts, not proof, and Wayland focus can reject or redirect input when the active window changes.
On GNOME/Wayland, useful shortcuts include:
Super+H: hide the active windowAlt+Tab: switch applicationsCtrl+L: focus a browser address bar when the browser is already focusedCtrl+T: open a new browser tab when the browser is already focusedTreat every shortcut as an attempt. Inspect the fresh screenshot before saying it worked. If text lands in the wrong app, stop and reassess from capture or ax_snapshot; do not continue typing from assumed focus.
Some apps expose shallow AT-SPI trees unless their own accessibility support is enabled. If the AT-SPI tree is too shallow for a task, fall back in this order: app-native/browser tooling, reliable keyboard paths, then normalized coordinate clicks from a fresh screenshot.
If computer_use_remote returns COMPUTER_USE_AX_UNAVAILABLE, COMPUTER_USE_REARM_REQUIRED, COMPUTER_USE_APPROVAL_REQUIRED, or status=rearm required, stop immediately and ask the user to re-arm or fix the Linux desktop accessibility/session state.
Do not bypass a permission or host-visibility failure with server screenshots, Docker commands, the built-in Linux Desktop/Xpra skill, or code_execution_tool.