Most AI agents live in the terminal. They can read files, run commands, and write code — but ask them to click a button in a web app or fill out a form, and they're stuck. They can't see what's on screen, and they certainly can't interact with it.
With v1.26, goose broke out of the terminal. The Computer Controller extension was rebuilt from the ground up with Peekaboo, a macOS CLI tool for screen capture and GUI automation. This gives goose eyes and hands for the desktop — it can see annotated screenshots, identify UI elements, click buttons, type text, scroll, drag, and navigate menus across any application on your Mac.
<!-- truncate -->
The core workflow is dead simple and surprisingly reliable:
B1, B2, T1.
Because element IDs are tied to actual UI components (not screen positions), this approach adapts to different window sizes and positions. goose doesn't need to know where a button is on screen, just what it is.
Here's what that looks like in practice:
goose: Let me see what's on screen...
→ see --app Safari --annotate
goose: I can see the form. Clicking on the email field...
→ click --on T3
→ type "[email protected]" --return
The see command returns both an annotated screenshot (held temporarily in memory) and structured JSON data with element IDs, labels, and types. This combination of vision and structure is what makes the interaction reliable.
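To make the pairing of vision and structure concrete, here is a minimal Python sketch of how structured output like this could be searched for a target element. The JSON shape (an `elements` array with `id`, `label`, and `type` fields) and the helper `find_element` are assumptions made for illustration, not Peekaboo's documented schema.

```python
import json
from typing import Optional

# Hypothetical payload: the field names are assumed for illustration only
# and may not match what the real see command emits.
SAMPLE_SEE_OUTPUT = """
{
  "elements": [
    {"id": "B1", "type": "button", "label": "Submit"},
    {"id": "B2", "type": "button", "label": "Cancel"},
    {"id": "T3", "type": "textField", "label": "Email"}
  ]
}
"""


def find_element(see_output: str, label: str,
                 element_type: Optional[str] = None) -> Optional[str]:
    """Return the ID of the first element whose label matches, or None."""
    data = json.loads(see_output)
    for element in data.get("elements", []):
        # Compare labels case-insensitively so "email" matches "Email".
        if element.get("label", "").lower() != label.lower():
            continue
        # Optionally narrow the match to one element type.
        if element_type is not None and element.get("type") != element_type:
            continue
        return element["id"]
    return None


print(find_element(SAMPLE_SEE_OUTPUT, "Email", "textField"))  # prints T3
```

Because the lookup keys on the label rather than on pixel coordinates, the same call keeps working if the window is moved or resized.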
Once goose can see and interact with your screen, the range of tasks it can handle expands dramatically. Here are some real examples:
"Go to the HR portal and submit my time off request for next Friday"
goose opens the browser, navigates to the page, identifies the form fields, fills them in, and clicks submit — all by seeing the UI and interacting with it step by step.
"Open Figma, find the design called 'Homepage Redesign', and export it as PNG"
Multi-step workflows across apps that don't have APIs become possible. goose can click through menus, search within apps, and follow multi-screen flows.
"Turn on Do Not Disturb and set my display brightness to 50%"
System preferences, menu bar items, and macOS settings are all accessible through Peekaboo's menu, menubar, and dialog commands.
"For each PDF in my Downloads folder, open it in Preview and print it"
goose can combine its existing file system and shell capabilities with GUI automation — reading a directory listing, then opening and interacting with each file visually.
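The file-system half of a request like this is ordinary scripting. As a rough sketch, the Python below collects the PDFs in a folder and pairs each one with the GUI steps that would follow. The function names and the wording of the steps are hypothetical, and the GUI actions are left as plain descriptions rather than real Peekaboo invocations.

```python
from pathlib import Path
from typing import List, Tuple

# Hypothetical step descriptions: plain text, not real Peekaboo commands.
GUI_STEPS = [
    "open {name} in Preview",
    "capture the window to confirm the file loaded",
    "choose Print from the File menu",
    "confirm the print dialog",
]


def list_pdfs(folder: Path) -> List[Path]:
    """Return the PDF files directly inside folder, sorted by name."""
    if not folder.is_dir():
        return []
    pdfs = (p for p in folder.iterdir()
            if p.is_file() and p.suffix.lower() == ".pdf")
    return sorted(pdfs, key=lambda p: p.name.lower())


def plan_print_jobs(folder: Path) -> List[Tuple[str, List[str]]]:
    """Pair each PDF name with the GUI steps to run for it."""
    return [
        (pdf.name, [step.format(name=pdf.name) for step in GUI_STEPS])
        for pdf in list_pdfs(folder)
    ]


# Print the plan for the current user's Downloads folder (empty if absent).
for name, steps in plan_print_jobs(Path.home() / "Downloads"):
    print(f"{name}: " + "; ".join(steps))
```

The directory listing is deterministic and cheap, so it makes sense to resolve it up front and reserve the visual loop for the parts that only exist in the GUI.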
Peekaboo gives goose a comprehensive set of commands for interacting with macOS across vision, interaction, and system control. Check out the Peekaboo docs for a full list. Some highlights:
- see: capture an annotated screenshot and get structured UI data
- click: click on an element by its ID
- type: type text or press keys
- scroll: scroll within an element
- drag: click and drag from one element to another
- menu: interact with menu bar items
- dialog: interact with system dialogs and notifications
After using the Computer Controller extensively, here are a few things that help:
- Use the see command to understand the current UI state before taking action. If you're debugging an interaction, ask goose to take a fresh screenshot.
- Use paste instead of type when entering longer content. This avoids issues with special characters and typing speed.
The Computer Controller is powerful, but it's worth knowing the boundaries:
see command, though you can target specific screens in multi-monitor setups with --screen-index.
The Computer Controller extension is built into goose: just enable it in your extensions and start asking goose to do visual tasks. If you're on macOS, Peekaboo will handle the rest.
In goose Desktop, go to Extensions and toggle on Computer Controller. In the CLI:
goose configure
# → Toggle Extensions → enable computercontroller
Then try something simple:
Take a screenshot of my current screen and describe what you see.
Or something more ambitious:
Open System Settings, go to Displays, and set the resolution to "More Space."
goose will figure out the rest — seeing the UI, identifying the right elements, and clicking through to get it done.
<head> <meta property="og:title" content="Beyond the Terminal: goose Controls Your Desktop with Peekaboo" /> <meta property="og:type" content="article" /> <meta property="og:url" content="https://block.github.io/goose/blog/2026/04/29/computer-controller-peekaboo" /> <meta property="og:description" content="The Computer Controller extension was rebuilt with Peekaboo, giving goose the ability to see, click, type, and interact with any application on your Mac." /> <meta name="twitter:card" content="summary_large_image" /> <meta property="twitter:domain" content="block.github.io/goose" /> <meta name="twitter:title" content="Beyond the Terminal: goose Controls Your Desktop with Peekaboo" /> <meta name="twitter:description" content="The Computer Controller extension was rebuilt with Peekaboo, giving goose the ability to see, click, type, and interact with any application on your Mac." /> </head>