This guide covers setting up browser automation capabilities in ZeroClaw, including both headless automation and GUI access via VNC.
ZeroClaw supports multiple browser access methods:
| Method | Use Case | Requirements |
|---|---|---|
| agent-browser CLI | Headless automation, AI agents | npm, Chrome |
| VNC + noVNC | GUI access, debugging | Xvfb, x11vnc, noVNC |
| Chrome Remote Desktop | Remote GUI via Google | XFCE, Google account |
# Install CLI
npm install -g agent-browser
# Download Chrome for Testing
agent-browser install --with-deps # Linux (includes system deps)
agent-browser install # macOS/Windows
The browser tool is enabled by default with allowed_domains = ["*"], which permits every domain. To restrict the allowed domains or disable the tool entirely, use zeroclaw config set:
zeroclaw config set browser.allowed-domains '["example.com", "docs.example.com"]'
zeroclaw config set browser.enabled false
See the Config reference for all browser fields and defaults.
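The allowlist value above is passed as a JSON array inside a single shell argument, which is easy to misquote by hand. The following is a minimal sketch of a helper that builds that string from plain arguments; `domains_json` is a hypothetical name, not part of ZeroClaw, and it assumes the domain names contain no quotes or backslashes.

```bash
# domains_json DOMAIN...: print the arguments as a JSON array of strings.
# Hypothetical helper; performs no escaping, so pass plain hostnames only.
domains_json() {
  local out="" d
  for d in "$@"; do
    # Append a comma separator only when the list already has an entry.
    out="${out:+$out, }\"$d\""
  done
  printf '[%s]\n' "$out"
}

# Example (commented out so the snippet has no side effects):
# zeroclaw config set browser.allowed-domains "$(domains_json example.com docs.example.com)"
```

Called with `example.com docs.example.com`, the helper prints the same array shown in the command above.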
echo "Open https://example.com and tell me what it says" | zeroclaw agent
For debugging or when you need visual browser access:
# Ubuntu/Debian
apt-get install -y xvfb x11vnc fluxbox novnc websockify
# Optional: Desktop environment for Chrome Remote Desktop
apt-get install -y xfce4 xfce4-goodies
#!/bin/bash
# Start virtual display with VNC access
DISPLAY_NUM=99
VNC_PORT=5900
NOVNC_PORT=6080
RESOLUTION=1920x1080x24
# Start Xvfb
Xvfb :$DISPLAY_NUM -screen 0 $RESOLUTION -ac &
sleep 1
# Start window manager
fluxbox -display :$DISPLAY_NUM &
sleep 1
# Start x11vnc
x11vnc -display :$DISPLAY_NUM -rfbport $VNC_PORT -forever -shared -nopw -bg
sleep 1
# Start noVNC (web-based VNC)
websockify --web=/usr/share/novnc $NOVNC_PORT localhost:$VNC_PORT &
echo "VNC available at:"
echo " VNC Client: localhost:$VNC_PORT"
echo " Web Browser: http://localhost:$NOVNC_PORT/vnc.html"
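The fixed `sleep 1` calls in the script above assume each service is up within a second. One possible alternative is to poll the ports until they accept connections. This sketch assumes bash (it relies on bash's `/dev/tcp` redirection), and `wait_for_port` is a hypothetical helper name.

```bash
#!/bin/bash
# wait_for_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 once a TCP connection to HOST:PORT succeeds, or 1 after the timeout.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-10}" waited=0
  while [ "$waited" -lt "$timeout" ]; do
    # The subshell opens the connection on fd 3 and closes it when it exits.
    if (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; then
      return 0
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 1
}

# Example (commented out so the snippet has no side effects):
# wait_for_port localhost 5900 && wait_for_port localhost 6080
```

Calling it for the VNC and noVNC ports at the end of the startup script would confirm that x11vnc and websockify are listening before the URLs are printed.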
- VNC Client: localhost:5900
- Web Browser: http://localhost:6080/vnc.html

# Launch Chrome on the virtual display
DISPLAY=:99 google-chrome --no-sandbox https://example.com &
# Download and install
wget https://dl.google.com/linux/direct/chrome-remote-desktop_current_amd64.deb
apt-get install -y ./chrome-remote-desktop_current_amd64.deb
# Configure session
echo "xfce4-session" > ~/.chrome-remote-desktop-session
chmod +x ~/.chrome-remote-desktop-session
systemctl --user start chrome-remote-desktop

Go to https://remotedesktop.google.com/access from any device.
# Basic open and close
agent-browser open https://example.com
agent-browser get title
agent-browser close
# Snapshot with refs
agent-browser open https://example.com
agent-browser snapshot -i
agent-browser close
# Screenshot
agent-browser open https://example.com
agent-browser screenshot /tmp/test.png
agent-browser close
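The open, query, close sequence can be wrapped so that the browser is closed even when the query step fails. This is a rough sketch using only the subcommands shown above. The function name `page_title` and the `AGENT_BROWSER_BIN` variable are hypothetical, and the sketch assumes agent-browser exits non-zero on failure.

```bash
# CLI binary to call; overridable so the function can be tried with a stub.
AGENT_BROWSER_BIN="${AGENT_BROWSER_BIN:-agent-browser}"

# page_title URL: open URL, print the page title, then close the browser.
page_title() {
  local url="$1" status=0
  # Discard any output from open so that only the title reaches stdout.
  "$AGENT_BROWSER_BIN" open "$url" >/dev/null || return 1
  # Remember a failure here, but still fall through to close.
  "$AGENT_BROWSER_BIN" get title || status=$?
  "$AGENT_BROWSER_BIN" close >/dev/null
  return "$status"
}

# Example (commented out so the snippet has no side effects):
# page_title https://example.com
```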
# Content extraction
echo "Open https://example.com and summarize it" | zeroclaw agent
# Navigation
echo "Go to https://github.com/trending and list the top 3 repos" | zeroclaw agent
# Form interaction
echo "Go to Wikipedia, search for 'Rust programming language', and summarize" | zeroclaw agent
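When running several of these prompts as an integration check, it can help to save each reply to a file for later inspection. The sketch below relies only on the stdin interface of `zeroclaw agent` shown above. `run_prompts`, `ZEROCLAW_BIN`, and the `prompt-N.txt` file naming are hypothetical choices for illustration.

```bash
# CLI binary to call; overridable so the function can be tried with a stub.
ZEROCLAW_BIN="${ZEROCLAW_BIN:-zeroclaw}"

# run_prompts OUT_DIR PROMPT...: pipe each prompt to the agent on stdin and
# save each reply to OUT_DIR/prompt-N.txt, where N is the prompt's position.
run_prompts() {
  local out_dir="$1" n=0 prompt
  shift
  mkdir -p "$out_dir" || return 1
  for prompt in "$@"; do
    n=$((n + 1))
    printf '%s\n' "$prompt" | "$ZEROCLAW_BIN" agent > "$out_dir/prompt-$n.txt" \
      || echo "prompt $n failed" >&2
  done
}

# Example (commented out so the snippet has no side effects):
# run_prompts /tmp/zeroclaw-checks \
#   "Open https://example.com and summarize it" \
#   "Go to https://github.com/trending and list the top 3 repos"
```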
The page may not be fully loaded. Add a wait:
agent-browser open https://slow-site.com
agent-browser wait --load networkidle
agent-browser snapshot -i
Handle cookie consent first:
agent-browser open https://site-with-cookies.com
agent-browser snapshot -i
agent-browser click @accept_cookies # Click the accept button
agent-browser snapshot -i # Now get the actual content
If web_fetch fails inside a Docker sandbox, use agent-browser instead:
# Instead of web_fetch, use:
agent-browser open https://example.com
agent-browser get text body
- agent-browser runs Chrome in headless mode with sandboxing
- Use --session-name to persist auth state
- The --allowed-domains config restricts navigation to specific domains