plugins/_a0_connector/skills/host-computer-use-windows/SKILL.md
Use this after host-computer-use when the connected A0 CLI reports the Windows computer-use backend.
Do not use this skill for Linux, macOS, Xpra, Docker, or browser-only tasks. If the backend is not Windows or does not advertise UI Automation support, skip UIA actions and follow the generic host computer-use rules.
Windows backends can advertise structural UI Automation features:
- uia-tree-snapshot
- uia-structural-targeting
- uia-element-action
- uia-window-management

When these features are present, prefer structural targeting over pixel clicks for named controls such as buttons, menu items, text fields, dialogs, toolbar items, browser address bars, composer fields, and list rows.
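The feature check above can be sketched as a small gate. This is a minimal sketch, not the connector's own code: the feature names come from the list above, but the function names and the idea that the backend exposes its features as a plain iterable of strings are assumptions. Requiring all four features is a deliberately conservative choice.

```python
# Feature names are taken from the skill text. The shape of the advertised
# feature list (an iterable of strings) is an assumption for illustration.
UIA_FEATURES = {
    "uia-tree-snapshot",
    "uia-structural-targeting",
    "uia-element-action",
    "uia-window-management",
}


def missing_uia_features(advertised):
    """Return the UIA features the backend did not advertise, sorted."""
    return sorted(UIA_FEATURES - set(advertised))


def can_use_structural_targeting(advertised):
    """True only when every structural UIA feature is advertised."""
    return not missing_uia_features(advertised)
```

If `can_use_structural_targeting` returns `False`, skip UIA actions and follow the generic host computer-use rules.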
Use uia_snapshot to inspect the bounded Windows UI Automation tree:
{
  "tool_name": "computer_use_remote",
  "tool_args": {
    "action": "uia_snapshot",
    "max_depth": 4,
    "max_nodes": 200
  }
}
The snapshot returns element paths, roles, names/titles, automation IDs, class names, optional Terminator-style selectors, frames, enabled/focused state, actions, and child nodes. Use it to choose an element, not as final visual proof.
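Choosing an element from a snapshot amounts to a tree search. The sketch below assumes the snapshot is a nested dictionary with hypothetical keys `role`, `name`, `path`, and `children`; the real key names and path format may differ, so adapt them to what `uia_snapshot` actually returns.

```python
def find_elements(node, role=None, name=None):
    """Depth-first search for nodes matching the given role and/or name.

    Key names ("role", "name", "children") are assumed, not confirmed.
    """
    matches = []
    stack = [node]
    while stack:
        current = stack.pop()
        role_ok = role is None or current.get("role") == role
        name_ok = name is None or current.get("name") == name
        if role_ok and name_ok:
            matches.append(current)
        # Push children reversed so they are visited in their listed order.
        stack.extend(reversed(current.get("children", [])))
    return matches


# Hypothetical snapshot used only to illustrate the search.
snapshot = {
    "role": "Window", "name": "Save As", "path": "0",
    "children": [
        {"role": "Edit", "name": "File name:", "path": "0/0", "children": []},
        {"role": "Button", "name": "OK", "path": "0/1", "children": []},
        {"role": "Button", "name": "Cancel", "path": "0/2", "children": []},
    ],
}

ok_buttons = find_elements(snapshot, role="Button", name="OK")
```

Act only when the search yields exactly one match; several matches mean the target is ambiguous and needs a narrower role, name, or automation ID.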
Use uia_action for a structural action:
{
  "tool_name": "computer_use_remote",
  "tool_args": {
    "action": "uia_action",
    "target": {
      "role": "Button",
      "title": "OK"
    },
    "operation": "invoke"
  }
}
Supported operations are:
- invoke: activate a button, menu item, checkbox, or similar control
- focus_window: restore and bring the owning top-level window to the foreground
- minimize, restore, maximize: change the owning top-level window state without clicking titlebar buttons
- focus: focus a text field or focusable element after activating its window
- set_value: set text/value; pass value or text
- click: click the element through the Windows UIA wrapper only when the snapshot says click is the available action and no structural operation fits
- close: close the owning top-level window only when the user explicitly asked to close that app/window

Targeting options:
- Use target when the control has a stable role plus title/name, automation ID, class name, handle, process ID, framework ID, or selector.
- Use the path returned by the latest uia_snapshot only while the UI is unchanged.
- Use selector or target.selector from the snapshot, such as role:Button && name:OK.

Action selection:

- When the snapshot lists invoke, use invoke, not click.
- For window state changes, use focus_window, minimize, restore, or maximize; do not click titlebar buttons.
- For text entry, use set_value on the target field. A global type result only proves keys were sent, not that they landed in the intended control.
- After the UI changes, take a fresh uia_snapshot before reusing a path.
- Use click only after structural UIA, keyboard, browser, and app-native options do not fit, and only from a fresh screenshot with an unambiguous target.

UIA actions are attempts, not proof. They attach a fresh screenshot after state-changing actions; inspect that image before saying the requested outcome happened.
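The selection rules can be sketched as two small helpers. Both function names are hypothetical. The sketch assumes that the action names reported in a snapshot match the operation names, and that `value` sits inside `tool_args` next to `target` and `operation`; the source only says to "pass value or text".

```python
def choose_operation(available_actions):
    """Prefer invoke over click; return None when neither is offered."""
    if "invoke" in available_actions:
        return "invoke"
    if "click" in available_actions:
        return "click"
    return None


def build_uia_action(target, operation, value=None):
    """Build a computer_use_remote payload for a structural UIA action."""
    tool_args = {"action": "uia_action", "target": target, "operation": operation}
    if operation == "set_value":
        if value is None:
            raise ValueError("set_value requires a value")
        # Assumed placement of the value field; adjust if the tool differs.
        tool_args["value"] = value
    return {"tool_name": "computer_use_remote", "tool_args": tool_args}


payload = build_uia_action(
    {"role": "Edit", "title": "File name:"}, "set_value", value="report.txt"
)
```

Whatever payload is sent, the attached screenshot still has to be inspected before reporting success.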
Windows desktop capture and UI Automation depend on the interactive desktop session where A0 CLI is running. Remote Desktop, VM consoles, UAC prompts, elevated apps, locked screens, minimized/disconnected RDP sessions, and services can prevent capture or UIA access.
If computer_use_remote returns COMPUTER_USE_REARM_REQUIRED, COMPUTER_USE_APPROVAL_REQUIRED, COMPUTER_USE_CAPTURE_UNAVAILABLE, COMPUTER_USE_UIA_UNAVAILABLE, or status=rearm required, stop immediately and ask the user to re-arm or fix the Windows desktop session. Do not bypass a permission/session failure with server screenshots, Docker commands, linux-desktop, or browser fallbacks.
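A stop check along these lines keeps the failure handling explicit. It is a sketch that assumes the tool result is available as text; the function name is hypothetical, and the markers are the ones listed above.

```python
STOP_CODES = (
    "COMPUTER_USE_REARM_REQUIRED",
    "COMPUTER_USE_APPROVAL_REQUIRED",
    "COMPUTER_USE_CAPTURE_UNAVAILABLE",
    "COMPUTER_USE_UIA_UNAVAILABLE",
)
REARM_STATUS = "status=rearm required"


def find_stop_marker(result_text):
    """Return the stop marker found in the result text, or None."""
    for code in STOP_CODES:
        if code in result_text:
            return code
    if REARM_STATUS in result_text:
        return REARM_STATUS
    return None
```

When `find_stop_marker` returns a marker, stop and ask the user to re-arm or fix the desktop session rather than trying a fallback.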
Windows captures can use a virtual desktop that includes multiple monitors and negative origins. Use the capture/session origin_x, origin_y, width, and height as the coordinate space, and use normalized [0,1] coordinates only relative to that virtual screen.
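The mapping between normalized and virtual-screen coordinates can be sketched as follows. The function names are hypothetical, and the formula (origin plus fraction times extent, rounded to the nearest pixel) is an assumption about how the backend interprets normalized values.

```python
def normalized_to_virtual(nx, ny, origin_x, origin_y, width, height):
    """Map normalized [0,1] coordinates to virtual-screen pixels."""
    if not (0.0 <= nx <= 1.0 and 0.0 <= ny <= 1.0):
        raise ValueError("normalized coordinates must lie in [0, 1]")
    return (origin_x + round(nx * width), origin_y + round(ny * height))


def virtual_to_normalized(x, y, origin_x, origin_y, width, height):
    """Map virtual-screen pixels back to normalized coordinates."""
    return ((x - origin_x) / width, (y - origin_y) / height)


# Two 1920x1080 monitors, with the secondary one to the left of the primary,
# so the virtual screen starts at a negative x origin.
point = normalized_to_virtual(0.25, 0.5, -1920, 0, 3840, 1080)
```

In this layout `point` lies on the left-hand monitor, which is why its x coordinate is negative.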