packages/builtin-skills/src/verify/references/computer-use.md
agent-browser drives Chromium surfaces (web, Electron). For everything it can't
reach, this is the escape hatch: macOS Computer Use via osascript
(AppleScript) and screencapture. Use it to drive native apps, handle OS-level
chrome, and read the screen when no CDP target exists.
This is the native/OS counterpart of agent-browser.md. It is macOS-only and not cloud-portable — prefer CDP automation when both can reach the target; reach here only when CDP can't.
osascript -e 'tell application "AppName" to activate'
osascript -e 'tell application "System Events" to keystroke "Hello world"'
osascript -e 'tell application "System Events" to key code 36' # Enter (hardware code, layout-independent)
osascript -e 'tell application "System Events" to key code 48' # Tab
osascript -e 'tell application "System Events" to key code 53' # Escape
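The three key codes above can be wrapped in a small lookup so scripts refer to keys by name. This is a minimal sketch: `key_code` and `press_key` are hypothetical helper names, and only the codes listed above are included.

```shell
# key_code NAME: print the hardware key code for a named key.
# Only the codes documented above are mapped; extend as needed.
key_code() {
  case "$1" in
    enter|return) echo 36 ;;
    tab)          echo 48 ;;
    escape|esc)   echo 53 ;;
    *) echo "unknown key: $1" >&2; return 1 ;;
  esac
}

# press_key NAME: send the key through System Events (macOS only).
press_key() {
  local code
  code=$(key_code "$1") || return 1
  osascript -e "tell application \"System Events\" to key code $code"
}
```

Usage: `press_key enter` sends Enter to the frontmost app; an unmapped name returns non-zero without sending anything.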
osascript -e 'set the clipboard to "Your long or 中文 / emoji message"'
osascript -e 'tell application "System Events" to keystroke "v" using command down'
osascript -e 'tell application "System Events" to keystroke "f" using command down' # Cmd+F
osascript -e 'tell application "System Events" to keystroke "k" using {command down, shift down}' # Cmd+Shift+K
osascript -e 'tell application "System Events" to click at {500, 300}'
osascript -e '
tell application "System Events" to tell process "AppName"
get {position, size} of window 1
end tell'
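The position/size query and `click at` can be combined to click the middle of a window instead of a hard-coded point. The sketch below assumes `osascript` prints the nested list flattened as `x, y, w, h` (its default human-readable output) and that the app name contains no double quotes; `window_center` and `click_window_center` are hypothetical helper names.

```shell
# window_center "X, Y, W, H": print the centre point as "CX, CY".
# Input is the flattened {position, size} output, e.g. "100, 200, 800, 600".
window_center() {
  local x y w h
  IFS=', ' read -r x y w h <<EOF
$1
EOF
  printf '%s, %s\n' "$((x + w / 2))" "$((y + h / 2))"
}

# click_window_center APP: click the centre of APP's first window (macOS only).
click_window_center() {
  local geom
  geom=$(osascript -e "tell application \"System Events\" to tell process \"$1\" to get {position, size} of window 1") || return 1
  osascript -e "tell application \"System Events\" to click at {$(window_center "$geom")}"
}
```

Keeping the arithmetic in `window_center` separate from the `osascript` calls makes it easy to check the computed point before sending a real click.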
screencapture /tmp/shot.png # full screen
screencapture -i /tmp/shot.png # interactive region select
screencapture -l "$WINDOW_ID" /tmp/shot.png # specific window
Get a window id:
osascript -e 'tell application "System Events" to tell process "AppName" to get id of window 1'
osascript -e '
tell application "System Events" to tell process "AppName"
get value of text field 1 of window 1
end tell'
`entire contents of window 1` dumps the whole UI tree but is extremely slow on complex apps. Prefer a screenshot + visual read, or the clipboard read below.
osascript -e '
tell application "System Events"
keystroke "a" using command down
keystroke "c" using command down
end tell'
sleep 0.5
pbpaste
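The fixed `sleep 0.5` can be replaced by polling until the clipboard actually changes. This is a sketch under stated assumptions: `wait_for_change` and `copy_all_text` are hypothetical helpers, and the retry count and interval are arbitrary starting values.

```shell
# wait_for_change OLD TRIES DELAY CMD...
# Run CMD until its output differs from OLD, then print the new output.
# Returns 1 if the output never changes within TRIES attempts.
wait_for_change() {
  local old="$1" tries="$2" delay="$3" out
  shift 3
  while [ "$tries" -gt 0 ]; do
    out=$("$@")
    if [ "$out" != "$old" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    tries=$((tries - 1))
    sleep "$delay"
  done
  return 1
}

# copy_all_text: select all, copy, and print the new clipboard (macOS only).
copy_all_text() {
  local before
  before=$(pbpaste)
  osascript -e '
tell application "System Events"
keystroke "a" using command down
keystroke "c" using command down
end tell' || return 1
  wait_for_change "$before" 10 0.2 pbpaste
}
```

One limitation of this approach: if the copied text is identical to what was already on the clipboard, the poll times out and returns 1 even though the copy succeeded.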
- screencapture PNG → `--type screenshot --by cli`.
- `pbpaste` → `--type text --content "$(pbpaste)"`.

- System Events call silently fails.
- `keystroke` is slow and mangles non-ASCII. Use clipboard paste (Cmd+V) for anything long or for Chinese / emoji / special characters.
- `key code 36` is Enter by hardware code, so it works regardless of keyboard layout. Some apps send on Enter; use Shift+Enter for a newline within input.
- Add `delay`/`sleep` between actions: native apps need time to process UI events, and back-to-back commands get dropped.
- App names can be localized (微信 vs WeChat); handle the name the running OS uses.
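The choice between `keystroke` and clipboard paste can be automated. The sketch below is one possible policy, not a fixed rule: the 40-character cutoff is an assumption, `pbcopy` is assumed to run in a UTF-8 locale, and `is_plain_ascii`, `input_method`, `as_string`, and `type_text` are hypothetical helper names.

```shell
# is_plain_ascii TEXT: succeed when TEXT contains only printable ASCII.
is_plain_ascii() {
  [ -z "$(printf '%s' "$1" | LC_ALL=C tr -d ' -~')" ]
}

# input_method TEXT: print "keystroke" for short ASCII text, else "paste".
# The 40-character threshold is an arbitrary assumption; tune it per app.
input_method() {
  if is_plain_ascii "$1" && [ "${#1}" -le 40 ]; then
    echo keystroke
  else
    echo paste
  fi
}

# as_string TEXT: print TEXT as an AppleScript string literal,
# escaping backslashes and double quotes.
as_string() {
  printf '"%s"' "$(printf '%s' "$1" | sed 's/[\\"]/\\&/g')"
}

# type_text TEXT: enter TEXT into the frontmost app (macOS only).
type_text() {
  case "$(input_method "$1")" in
    keystroke)
      osascript -e "tell application \"System Events\" to keystroke $(as_string "$1")"
      ;;
    paste)
      printf '%s' "$1" | pbcopy
      sleep 0.2   # give the pasteboard time to settle before Cmd+V
      osascript -e 'tell application "System Events" to keystroke "v" using command down'
      ;;
  esac
}
```

Note that the paste path overwrites whatever was on the clipboard; save it with `pbpaste` first if it needs to be restored afterwards.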