docs/src/app/security/page.mdx
agent-browser includes security features to protect against credential exposure, prompt injection via untrusted page content, and unauthorized browser actions.
All security features are opt-in. By default, agent-browser imposes no restrictions on navigation, actions, or output. Enable these features as needed for your deployment -- existing workflows are unaffected until you explicitly activate a feature.
These features are designed to mitigate the following threats when an LLM-based agent drives a browser:
- Content boundaries (--content-boundaries) let the orchestrator distinguish trusted tool output from untrusted page content.
- The domain allowlist (--allowed-domains) blocks navigations, sub-resource requests, WebSocket connections, EventSource streams, and sendBeacon calls to non-allowed domains.
- Action policy (--action-policy) and confirmation gating (--confirm-actions) prevent the agent from performing dangerous operations (eval, downloads, uploads) without explicit approval.
- The output limit (--max-output) caps the size of page-sourced content.

Limitations to keep in mind:

- WebSocket and EventSource blocking is best-effort. If the eval action category is allowed, page scripts could theoretically restore the original constructors. Deny eval via --action-policy for maximum protection.
- ... about:blank after the filter is active, but resources loaded before that point are not retroactively blocked.
- When --confirm-interactive is set but stdin is not a terminal (e.g., piped input), actions are automatically denied to prevent accidental approval in non-interactive contexts.

Store credentials locally and reference them by name. The LLM never sees passwords.
# Save credentials (always encrypted; uses AGENT_BROWSER_ENCRYPTION_KEY if set, otherwise an auto-generated key)
# Recommended: pipe password via stdin to avoid shell history / process listing exposure
echo "pass" | agent-browser auth save github --url https://github.com/login --username user --password-stdin
# Or pass directly (a warning will be shown)
agent-browser auth save github --url https://github.com/login --username user --password pass
# Login using saved credentials
agent-browser auth login github
# List saved profiles (names and URLs only, no secrets)
agent-browser auth list
# Show profile metadata
agent-browser auth show github
# Delete a profile
agent-browser auth delete github
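A literal password inside echo "pass" can still end up in shell history. The following is a minimal sketch of prompting for the password without echo and piping it to --password-stdin. The helper name save_profile_from_prompt and the AGENT_BROWSER_BIN override are hypothetical, introduced here only for illustration and testing. The auth save flags are the ones shown above.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prompt for a password without echoing it, then pass it
# to `agent-browser auth save` on stdin so it never appears in argv or history.
# AGENT_BROWSER_BIN is an assumed override for this sketch, not an agent-browser setting.
save_profile_from_prompt() {
  local name="$1" url="$2" user="$3" password
  # -s disables terminal echo; the -p prompt is shown only when stdin is a terminal.
  IFS= read -r -s -p "Password for ${user}: " password || return 1
  echo >&2  # move to a new line after the hidden input
  printf '%s\n' "$password" |
    "${AGENT_BROWSER_BIN:-agent-browser}" auth save "$name" \
      --url "$url" --username "$user" --password-stdin
}
```

Usage would look like: save_profile_from_prompt github https://github.com/login user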
auth login navigates with the load lifecycle event and then waits for form selectors to appear before filling/clicking. This makes delayed SPA login pages more reliable while avoiding networkidle hangs on pages with long-lived background requests.
Custom selectors can be specified if auto-detection fails:
agent-browser auth save myapp \
--url https://app.example.com/login \
--username user --password pass \
--username-selector "#email" \
--password-selector "#password" \
--submit-selector "button.login"
Profiles are stored in ~/.agent-browser/auth/ and always encrypted with AES-256-GCM. If AGENT_BROWSER_ENCRYPTION_KEY is not set, a key is auto-generated at ~/.agent-browser/.encryption-key on first use. Back up this file or set the environment variable explicitly for portability.
File permissions are enforced on both Unix (chmod 600/700) and Windows (icacls restricted to the current user) to prevent other users from reading encryption keys or auth profiles.
Plugins run out-of-process over the agent-browser.plugin.v1 stdio JSON protocol. See Plugins for the plugin author protocol and implementation examples.
Use agent-browser plugin add <ref> to create plugin config automatically, or configure plugins by hand in agent-browser.json:
{
"plugins": [
{
"name": "vault",
"command": "agent-browser-plugin-vault",
"capabilities": ["credential.read"]
},
{
"name": "cloud-browser",
"command": "agent-browser-plugin-cloud-browser",
"capabilities": ["browser.provider"]
},
{
"name": "stealth",
"command": "agent-browser-plugin-stealth",
"capabilities": ["launch.mutate"]
},
{
"name": "captcha",
"command": "agent-browser-plugin-captcha",
"capabilities": ["command.run", "captcha.solve"]
}
]
}
Inspect configured plugins:
agent-browser plugin list
agent-browser plugin show vault
Use the plugin for login:
agent-browser auth login my-app --credential-provider vault --item "My App"
Use a plugin as a browser provider:
agent-browser --provider cloud-browser open https://example.com
Use a generic plugin command:
agent-browser plugin run captcha captcha.solve --payload '{"siteKey":"...","url":"https://example.com"}'
Credential plugins receive credential.resolve and return username, password, and optionally URL or selector metadata. Browser provider plugins receive browser.launch and return a CDP WebSocket URL. Launch mutator plugins receive launch.mutate and can append local launch args, extensions, or init script source before Chrome starts. Generic command plugins receive the request type passed to plugin run.
plugin run is for command.run and custom capabilities. Core capabilities and protocol request types use their dedicated command paths so credential, browser-provider, and launch-mutator access stays inside the normal policy gates.
agent-browser keeps browser automation, redaction-sensitive output, and policy enforcement in core. Credential plugin secrets do not appear in command arguments, dashboard events, or normal command output.
Gate plugin access with capability actions:
agent-browser --confirm-actions plugin:vault:credential.read auth login my-app --credential-provider vault --item "My App"
agent-browser --confirm-actions plugin:cloud-browser:browser.provider --provider cloud-browser open https://example.com
agent-browser --confirm-actions plugin:stealth:launch.mutate open https://example.com
When --content-boundaries is enabled, all page-sourced output is wrapped in structural markers so LLMs can distinguish tool output from untrusted page content:
--- AGENT_BROWSER_PAGE_CONTENT nonce=a1b2c3d4 origin=https://example.com ---
[snapshot / text / html / eval output here]
--- END_AGENT_BROWSER_PAGE_CONTENT nonce=a1b2c3d4 ---
The nonce is a random value generated per CLI process invocation, making it unpredictable to page content that might attempt to spoof the boundary.
Enable via flag or environment variable:
agent-browser --content-boundaries snapshot
# or
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
Affected output types: snapshot, get text, get html, eval, console.
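On the orchestrator side, the nonce can be used to separate page content from tool output. The sketch below assumes the markers occupy whole lines exactly as in the example above. extract_page_content is a hypothetical helper, not part of agent-browser.

```bash
#!/usr/bin/env bash
# Hypothetical orchestrator-side helper (not part of agent-browser): read
# boundary-wrapped output on stdin and print only the untrusted page content.
extract_page_content() {
  local line nonce="" inside=0
  local open_re='^--- AGENT_BROWSER_PAGE_CONTENT nonce=([^ ]+) origin=[^ ]+ ---$'
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "$inside" -eq 0 ]; then
      # Outside a block: only an opening marker changes state.
      if [[ $line =~ $open_re ]]; then
        nonce="${BASH_REMATCH[1]}"
        inside=1
      fi
    elif [ "$line" = "--- END_AGENT_BROWSER_PAGE_CONTENT nonce=${nonce} ---" ]; then
      # Only an end marker carrying the same nonce closes the block.
      inside=0
    else
      # Everything else, including spoofed markers, is treated as page content.
      printf '%s\n' "$line"
    fi
  done
}
```

Because the closing marker must carry the nonce from the opening marker, a page that prints its own END line with a guessed nonce does not terminate the block early.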
In --json mode, boundary metadata is injected into the JSON response as a _boundary object containing nonce and origin fields, allowing orchestrators to verify provenance programmatically:
{
"success": true,
"data": { "snapshot": "...", "origin": "https://example.com" },
"_boundary": { "nonce": "a1b2c3d4e5f6...", "origin": "https://example.com" }
}
Restrict which domains the browser can interact with, preventing redirect-based attacks and data exfiltration:
agent-browser --allowed-domains "example.com,*.example.com,github.com" open https://example.com
# or
export AGENT_BROWSER_ALLOWED_DOMAINS="example.com,*.example.com"
Patterns support exact match (github.com) and wildcard prefix (*.example.com, which also matches the bare domain example.com). Both page navigations and sub-resource requests (scripts, images, fetch, XHR, etc.) to non-allowed domains are blocked, and the command returns an error when a request is blocked. WebSocket and EventSource connections are also blocked via constructor-level patching. Non-http(s) sub-resources (data URIs, blobs) are still allowed.
Note: The WebSocket/EventSource blocking is best-effort: it works by overriding the browser constructors via an init script. If the eval action category is allowed, page scripts could theoretically restore the original constructors. For maximum protection, deny the eval category via --action-policy when using --allowed-domains.
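The matching rules described above (exact match, plus a wildcard prefix that also covers the bare domain) can be sketched as follows. domain_allowed is a hypothetical helper written for illustration. It is not agent-browser's implementation and it assumes a comma-separated list without spaces.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the allowlist matching rules; not agent-browser's code.
# Usage: domain_allowed HOST "pattern1,pattern2,..."  (exit status 0 = allowed)
domain_allowed() {
  local host list="$2" pattern base
  local -a patterns
  host="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')"
  IFS=',' read -r -a patterns <<< "$list"
  for pattern in "${patterns[@]}"; do
    if [[ $pattern == '*.'* ]]; then
      base="${pattern#'*.'}"
      # Wildcard prefix: matches the bare domain and any subdomain of it.
      if [[ $host == "$base" || $host == *".$base" ]]; then
        return 0
      fi
    elif [[ $host == "$pattern" ]]; then
      # Exact match only; subdomains are not implied.
      return 0
    fi
  done
  return 1
}
```

Note that the subdomain check requires a leading dot, so notexample.com does not match *.example.com.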
Config file:
{
"allowedDomains": ["example.com", "*.example.com", "github.com"]
}
CDN and third-party resources: The domain filter blocks all sub-resource requests (scripts, stylesheets, images, fonts, fetch/XHR) to non-allowed domains. Most websites load assets from CDN domains. Include these in your allowlist or pages will break. For example:
--allowed-domains "myapp.com,*.myapp.com,cdn.jsdelivr.net,fonts.googleapis.com,fonts.gstatic.com"
Gate actions using a static policy file. The policy is enforced by the daemon -- denied actions fail immediately.
agent-browser --action-policy ./policy.json open https://example.com
# or
export AGENT_BROWSER_ACTION_POLICY=./policy.json
Example policy (permissive with specific denials):
{
"default": "allow",
"deny": ["eval", "download", "upload"]
}
Example policy (restrictive):
{
"default": "deny",
"allow": ["navigate", "snapshot", "click", "scroll", "wait", "get"]
}
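To make the two example policies concrete, here is a small sketch of how a default plus allow/deny lists could resolve to a decision. policy_decision is a hypothetical helper, and the rule that an explicit deny takes precedence over an explicit allow is an assumption of this sketch. The source does not specify precedence, and the examples above never list the same action in both.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of policy evaluation; not agent-browser's implementation.
# Usage: policy_decision ACTION DEFAULT ALLOW_CSV DENY_CSV  -> prints allow|deny
policy_decision() {
  local action="$1" default="$2" allow=",$3," deny=",$4,"
  if [[ $deny == *",$action,"* ]]; then
    echo "deny"        # assumption: explicit deny wins
  elif [[ $allow == *",$action,"* ]]; then
    echo "allow"
  else
    echo "$default"    # fall back to the policy's "default" value
  fi
}
```

With the permissive example, policy_decision eval allow "" "eval,download,upload" yields deny, while any unlisted action falls through to allow.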
Auth vault operations keep secrets out of normal command output and LLM context. Domain allowlist restrictions still apply to auth login navigations. Plugin-backed logins also expose the capability action plugin:<name>:credential.read for policy and confirmation gates.
Use --confirm-actions to list the action categories that must be explicitly approved before they run:
# Orchestrator mode: returns confirmation_required response
agent-browser --confirm-actions eval,download eval "document.title"
# Then approve or deny:
agent-browser confirm c_8f3a1234
agent-browser deny c_8f3a1234
For interactive (human-in-the-loop) confirmation:
agent-browser --confirm-actions eval,download --confirm-interactive eval "document.title"
# Prompts: Allow? [y/N]
Pending confirmations auto-deny after 60 seconds.
Non-TTY behavior: When --confirm-interactive is set but stdin is not a TTY (e.g., piped input or running inside an automated pipeline), actions are automatically denied. This prevents accidental approval in non-interactive contexts.
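The non-TTY rule can be illustrated with the standard shell test for a terminal on stdin. This is a sketch of the idea only. confirm_interactive is a hypothetical function, and agent-browser's own check may be implemented differently.

```bash
#!/usr/bin/env bash
# Hypothetical illustration of the non-TTY rule; not agent-browser's own code.
# Prints "allow" or "deny" for a pending action described by $1.
confirm_interactive() {
  local description="$1" answer
  # [ -t 0 ] is true only when stdin (fd 0) is attached to a terminal.
  if [ ! -t 0 ]; then
    echo "deny"
    return 0
  fi
  read -r -p "${description} Allow? [y/N] " answer
  case "$answer" in
    y|Y) echo "allow" ;;
    *)   echo "deny" ;;
  esac
}
```

Piping a "y" into the function (echo y | confirm_interactive "eval document.title") still prints deny, which is the accidental-approval case the rule is meant to prevent.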
Prevent context flooding by truncating large page outputs:
agent-browser --max-output 50000 get text body
# or
export AGENT_BROWSER_MAX_OUTPUT=50000
Affected output types: snapshot, get text, get html, eval, console.
For production AI agent deployments:
{
"contentBoundaries": true,
"maxOutput": 50000,
"allowedDomains": ["your-app.com", "*.your-app.com"],
"actionPolicy": "./policy.json"
}
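For deployments configured through the environment instead of a config file, the same settings can be sketched with the environment variables introduced in the sections above. This assumes the variables can be combined freely, which the source implies but does not state outright.

```bash
#!/usr/bin/env bash
# Environment-variable form of the recommended settings, using only the
# variables documented earlier on this page. Values mirror the config above.
export AGENT_BROWSER_CONTENT_BOUNDARIES=1
export AGENT_BROWSER_MAX_OUTPUT=50000
export AGENT_BROWSER_ALLOWED_DOMAINS="your-app.com,*.your-app.com"
export AGENT_BROWSER_ACTION_POLICY=./policy.json
```

Quoting the allowlist value keeps the shell from expanding the * in the wildcard pattern.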