Back to Agent Browser

Installation

docs/src/app/installation/page.mdx

0.26.06.0 KB
Original Source

Installation

Installs the native Rust binary for maximum performance:

bash
npm install -g agent-browser
agent-browser install  # Download Chrome from Chrome for Testing (first time)

This is the fastest option -- commands run through the native Rust CLI directly with sub-millisecond parsing overhead.

Quick start (no install)

bash
npx agent-browser install   # Download Chrome (first time only)
npx agent-browser open example.com

Project installation (local dependency)

For projects that want to pin the version in package.json:

bash
npm install agent-browser
npx agent-browser install  # Download Chrome (first time)

Then use via npx or package.json scripts.

Homebrew (macOS)

bash
brew install agent-browser
agent-browser install  # Download Chrome (first time)

Cargo (Rust)

bash
cargo install agent-browser
agent-browser install  # Download Chrome (first time)

Compiles from source (~2-3 min). Requires the Rust toolchain (rustup.rs).

From source

bash
git clone https://github.com/vercel-labs/agent-browser
cd agent-browser
pnpm install
pnpm build
pnpm build:native
./bin/agent-browser install
pnpm link --global

Linux dependencies

On Linux, install system dependencies:

bash
agent-browser install --with-deps

Updating

Upgrade to the latest version:

bash
agent-browser upgrade

Detects your installation method (npm, Homebrew, or Cargo) and runs the appropriate update command automatically. Displays the version change on success, or informs you if you are already on the latest version.

Doctor

doctor diagnoses your install and auto-cleans stale daemon files. Run it whenever something stops working unexpectedly, or after upgrades:

bash
agent-browser doctor                     # Full diagnosis
agent-browser doctor --offline --quick   # Local-only, fastest (~<1s)
agent-browser doctor --fix               # Also run destructive repairs
agent-browser doctor --json              # Structured output

It checks:

<table> <thead> <tr><th>Category</th><th>What it checks</th></tr> </thead> <tbody> <tr><td>Environment</td><td>CLI version, platform, home directory, state and socket dirs, free disk space</td></tr> <tr><td>Chrome</td><td>Chrome install path and version, cache dir, Puppeteer fallback, user-data dir and profile count, optional <code>lightpanda</code> engine</td></tr> <tr><td>Daemons</td><td>Running daemons per session, stale <code>.sock</code> / <code>.pid</code> / <code>.version</code> / <code>.stream</code> files (auto-cleaned), version mismatch with the CLI, dashboard process liveness</td></tr> <tr><td>Config</td><td><code>~/.agent-browser/config.json</code>, <code>./agent-browser.json</code>, and any file at <code>AGENT_BROWSER_CONFIG</code> parse as valid JSON</td></tr> <tr><td>Security</td><td>Encryption key env var or <code>~/.agent-browser/.encryption-key</code> (with 0600 permissions on unix), state file count and age vs <code>AGENT_BROWSER_STATE_EXPIRE_DAYS</code>, action policy file</td></tr> <tr><td>Providers</td><td>Env vars for Browserless, Browserbase, Browser Use, Kernel, AgentCore (AWS creds), Appium (for <code>--provider ios</code>), and <code>AI_GATEWAY_API_KEY</code> for chat</td></tr> <tr><td>Network</td><td>Reachability of the Chrome for Testing CDN, AI Gateway (if configured), and any currently selected provider endpoint (skipped under <code>--offline</code>)</td></tr> <tr><td>Launch test</td><td>Spawns a scratch session, launches headless Chrome, navigates to <code>about:blank</code>, then closes. Measures wall time (skipped under <code>--quick</code>)</td></tr> </tbody> </table>

Stale sidecar files are always cleaned. Destructive actions are opt-in via --fix:

<table> <thead> <tr><th>Check</th><th>What <code>--fix</code> does</th></tr> </thead> <tbody> <tr><td>Chrome missing</td><td>Runs <code>agent-browser install</code></td></tr> <tr><td>Version-mismatched daemons</td><td>Sends <code>close</code> to each and cleans files</td></tr> <tr><td>Old state files</td><td>Deletes state files older than <code>AGENT_BROWSER_STATE_EXPIRE_DAYS</code> (default 30)</td></tr> <tr><td>Missing encryption key</td><td>Generates a new key at <code>~/.agent-browser/.encryption-key</code> (0600, unix); never overwrites an existing key</td></tr> </tbody> </table>

Exit code is 0 if all checks pass (warnings are fine), 1 if any fail.

Custom browser

Use a custom browser executable instead of bundled Chromium:

  • Serverless - Use @sparticuz/chromium (~50MB vs ~684MB)
  • System browser - Use existing Chrome installation
  • Custom builds - Use modified browser builds
bash
# Via flag
agent-browser --executable-path /path/to/chromium open example.com

# Via environment variable
AGENT_BROWSER_EXECUTABLE_PATH=/path/to/chromium agent-browser open example.com

Serverless example

Use @sparticuz/chromium or similar to obtain a Chromium executable path, then pass it via --executable-path or AGENT_BROWSER_EXECUTABLE_PATH.

AI agent setup

agent-browser works with any AI agent out of the box. For richer context:

Install the skill for your AI coding assistant:

bash
npx skills add vercel-labs/agent-browser

This works with Claude Code, Codex, Cursor, Gemini CLI, GitHub Copilot, Goose, OpenCode, and Windsurf. The skill is fetched from the repository and stays up to date automatically.

Do not copy SKILL.md from node_modules -- it will become stale as new features are added. Always use npx skills add or reference the repository version.

AGENTS.md / CLAUDE.md

Add to your instructions file:

markdown
## Browser Automation

Use `agent-browser` for web automation. Run `agent-browser --help` for all commands.

Core workflow:
1. `agent-browser open <url>` - Navigate to page
2. `agent-browser snapshot -i` - Get interactive elements with refs (@e1, @e2)
3. `agent-browser click @e1` / `fill @e2 "text"` - Interact using refs
4. Re-snapshot after page changes