packages/desktop/apps/electron/resources/docs/browser-tools.md
Use browser_tool to control built-in browser windows (Chromium) inside Qwen Code.
Quick start: Run
browser_tool --helpto see all available commands and usage examples.
browser_toolbun run browser-tool --help for command discovery/templates, and bun run browser-tool parse-url <url> for safe URL diagnostics outside agent turnsUse browser workflows when creating a source would add unnecessary overhead for the current task.
Good fit for browser-first:
Still prefer sources when:
If you're unsure which window to use, run:
browser_tool({ command: "windows" })
Recommended flow:
open — ensure browser window exists (background by default)navigate <url> — load a URLsnapshot — inspect accessible elements and get refs (@e1, @e2, ...)find <query> — quickly narrow to matching refs by keywordclick / fill / select — interact using refsscreenshot --annotated (or screenshot-region) — visual verification when neededbrowser_tool command examplesbrowser_tool({ command: "--help" })
browser_tool({ command: "open" })
browser_tool({ command: "open --foreground" })
browser_tool({ command: "navigate https://example.com" })
browser_tool({ command: "snapshot" })
browser_tool({ command: "find login button" })
browser_tool({ command: "click @e12" })
browser_tool({ command: "click-at 350 200" })
browser_tool({ command: "drag 100 200 300 400" })
browser_tool({ command: "fill @e5 [email protected]" })
browser_tool({ command: "type Hello World" })
browser_tool({ command: "select @e3 optionValue" })
browser_tool({ command: "select @e75 CNAME --assert-text Target --timeout 3000" })
browser_tool({ command: "upload @e3 /absolute/path/to/file.pdf" })
browser_tool({ command: "set-clipboard Name\tAge\nAlice\t30" })
browser_tool({ command: "get-clipboard" })
browser_tool({ command: "paste Name\tAge\nAlice\t30" })
browser_tool({ command: "scroll down 800" })
browser_tool({ command: "evaluate document.title" })
browser_tool({ command: "console 50 warn" })
browser_tool({ command: "screenshot" })
browser_tool({ command: "screenshot --annotated" })
browser_tool({ command: "screenshot-region --ref @e12 --padding 8" })
browser_tool({ command: "window-resize 1280 720" })
browser_tool({ command: "network 50 failed" })
browser_tool({ command: "wait network-idle 8000" })
browser_tool({ command: "key Enter" })
browser_tool({ command: "downloads wait 15000" })
browser_tool({ command: "focus" })
browser_tool({ command: "windows" })
browser_tool({ command: "release" })
browser_tool({ command: "hide" })
browser_tool({ command: "close" })
The wrapper validates commands and returns actionable errors when arguments are missing or invalid.
It also returns rich execution feedback for most commands, including before/after state where available (scroll positions, active element, URL/title transitions, resize clamping, request/error summaries, and window ownership/visibility details).
You can batch commands with semicolons, for example:
fill @e1 [email protected]; fill @e2 password123; click @e3
Batches run left-to-right and stop automatically after navigation commands (navigate, click, back, forward) so refs don’t go stale silently.
browser_tool supports quoted arguments:
fill @e5 "Hello world"wait text 'welcome back' 5000Semicolons inside quotes are treated as literal text (not batch separators):
fill @e1 "a;b;c"; click @e2screenshot-region --selector "div[data-x='a;b']" --padding 8Use backslash escaping when needed:
\; for a literal semicolon outside quotes\" for a literal " inside double-quoted textopen [--foreground|-f]Create or reuse the session browser window.
--foreground / -f: focuses in foregroundsnapshotReturns an accessibility tree with refs and element metadata.
find <query>Performs keyword search over the snapshot accessibility nodes (role, name, value, description) and returns matching refs.
click <ref> [waitFor] [timeoutMs]Click an element ref from snapshot. Optional wait modes: none, navigation, network-idle.
click-at <x> <y>Click at raw pixel coordinates. Use this for canvas-based UIs (e.g., Google Sheets cells, map elements, chart data points) where snapshot can't produce element refs. Get coordinates from screenshot or screenshot-region.
drag <x1> <y1> <x2> <y2>Drag from pixel coordinates (x1, y1) to (x2, y2). Performs mousedown, interpolated mousemove events, and mouseup. Use this for:
Get coordinates from screenshot or screenshot --annotated.
fill <ref> <value> / select <ref> <value> [--assert-text <text>] [--assert-value <value>] [--timeout <ms>]Fill text inputs or select dropdown values. Requires an element ref from snapshot.
For modern React/portal combobox UIs, select now performs additional verification and may return a warning when interaction succeeds but form state does not appear to mutate.
Useful flags:
--assert-text <text>: verify downstream UI mutation (for example field label changes to Target)--assert-value <value>: verify selected control reflects expected value--timeout <ms>: verification timeout (default 2000ms)upload <ref> <path> [path2...]Attach local file(s) to a file input (<input type="file">) using a ref from snapshot.
Notes:
upload @e3 /path/a.pdf /path/b.jpgtype <text>Type text character-by-character into the currently focused element without needing a ref. Use this when:
click or click-atDifference from fill: fill focuses a ref and replaces its value. type sends keystrokes to whatever is currently focused.
set-clipboard <text> / get-clipboardRead or write the page clipboard programmatically.
set-clipboard writes text and interprets common escape sequences:
\t → tab\n → newline\r → carriage return\\ → literal backslash\\x stays \\x)get-clipboard reads the current clipboard text content as raw text (tabs/newlines are returned as actual characters)paste <text>Convenience command: writes text to clipboard then triggers Ctrl+V (or Cmd+V on Mac). Equivalent to set-clipboard <text> followed by key v meta/key v control. Escape handling is identical to set-clipboard, which makes TSV-style bulk data entry reliable.
screenshot / screenshot --annotated / screenshot-region ...Capture full-window or targeted screenshots. --annotated overlays @eN labels on interactive elements for easier ref debugging.
console, network, wait, downloadsDebug runtime issues, requests, synchronization points, and download progress.
downloads output includes the resolved local savePath when available so you can reference the downloaded file directly.
focus [windowId] / windowsManage and inspect browser window ownership and visibility.
release — dismiss agent overlay, keep window visible for userhide — hide window but preserve session stateclose — close and destroy windowMissing command... → pass a command string (try --help)Unknown browser_tool command ... → typo/unsupported verb; check help...requires ... → required argument is missing for that command...must be numbers → numeric argument parse failedbrowser-tool parse-urlUse this for safe URL debugging in Explore mode without running a generic interpreter snippet:
bun run browser-tool parse-url https://example.com/path?q=1#hash
bun run browser-tool parse-url file:///Users/me/Desktop/report.html
Output is deterministic JSON (href, protocol, host, hostname, pathname, search, hash, origin, plus basename for file:// URLs).
~/.craft-agent/docs/browser-tools.md).browser_tool close for explicit teardown.Canvas-based web apps (Google Sheets, Google Docs, some map/chart UIs) render content as pixels on <canvas> — individual cells or elements are not DOM nodes and won't appear in snapshot. Use these patterns instead:
# 1. Navigate and wait for load
navigate https://docs.google.com/spreadsheets/d/{id}/edit
wait selector [aria-label="Name Box"] 10000
# 2. Navigate to a cell via Name Box (a DOM element — snapshot finds it)
snapshot
click @nameBoxRef
type A1
key Enter
# 3. Edit a cell
key F2
type Hello World
key Enter
# 4. Bulk write via TSV clipboard paste
snapshot
click @nameBoxRef
type A1
key Enter
paste Name\tAge\tCity\nAlice\t30\tNYC\nBob\t25\tLA
# 5. Read data via clipboard
key a meta # Select all (Cmd+A)
key c meta # Copy (Cmd+C)
get-clipboard # Returns TSV string
# 6. Click a canvas cell by coordinates (from screenshot)
click-at 350 200
# 7. Move a chart by dragging (coordinates from screenshot)
drag 400 300 100 50
# 8. Read data via export URL (no editing needed)
navigate https://docs.google.com/spreadsheets/d/{id}/export?format=csv&gid=0
snapshot can find themclick-at or keyboard navigation, not clickdrag to reposition elements on the canvaskey for navigationpaste with tab-separated valuesThe desktop browser manager isn’t wired for this runtime/session. Ensure you’re in the Electron desktop app and session is initialized.
Refs are stale. Re-run snapshot and use fresh refs.
Wait for page readiness and retry using:
open → snapshot → interaction