examples/operator-browserbase/README.md
Forked from browserbase/open-operator.
> [!WARNING]
> This is simply a proof of concept. Browserbase aims not to compete with web agents, but rather to provide all the necessary tools for anybody to build their own web agent. We strongly recommend you check out both Browserbase and our open source project Stagehand to build your own web agent.
https://github.com/user-attachments/assets/354c3b8b-681f-4ad0-9ab9-365dbde894af
First, install the dependencies for this repository. This requires pnpm.
<!-- This doesn't work with NPM, haven't tested with yarn -->
pnpm install
Next, copy the example environment variables:
cp .env.example .env.local
Then update .env.local with your API keys:
- UI_TARS_BASE_URL: Your UI-TARS base URL
- UI_TARS_API_KEY: Your UI-TARS API key
- UI_TARS_MODEL: Your UI-TARS model
- BROWSERBASE_API_KEY: Your Browserbase API key
- BROWSERBASE_PROJECT_ID: Your Browserbase project ID

Then, run the development server:
<!-- This doesn't work with NPM, haven't tested with yarn -->
pnpm dev
Open http://localhost:3000 with your browser to see Open Operator in action.
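If the app does not start as expected, a missing or blank entry in .env.local is a likely cause. The following is a minimal sketch of a check for the five variables listed above. The helper name `findMissingEnvVars` and the `EnvSource` type are hypothetical and are not part of this repository; only the variable names come from this README.

```typescript
// Variable names come from this README; everything else here is a
// hypothetical helper, not code from the repository.
const REQUIRED_ENV_VARS = [
  "UI_TARS_BASE_URL",
  "UI_TARS_API_KEY",
  "UI_TARS_MODEL",
  "BROWSERBASE_API_KEY",
  "BROWSERBASE_PROJECT_ID",
] as const;

// Shape of an environment object such as process.env.
type EnvSource = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(env: EnvSource): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}

// Example use inside server-side code (assumes a Node.js runtime):
//   const missing = findMissingEnvVars(process.env);
//   if (missing.length > 0) {
//     throw new Error(`Missing environment variables: ${missing.join(", ")}`);
//   }
```

Taking the environment as a parameter, rather than reading `process.env` inside the function, keeps the check easy to test with a plain object.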
Building a web agent is a complex task. You need to understand the user's intent, convert it into headless browser operations, and execute actions, each of which can be incredibly complex on its own.
Stagehand is a tool that helps you build web agents. It allows you to convert natural language into headless browser operations, execute actions on the browser, and extract results back into structured data.
Under the hood, we have a very simple agent loop that just calls Stagehand to convert the user's intent into headless browser operations, and then calls Browserbase to execute those operations.
Stagehand uses Browserbase to execute actions on the browser, and OpenAI to understand the user's intent.
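The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this repository: `BrowserAction`, `Planner`, `Executor`, and `runAgent` are hypothetical names, and none of them are Stagehand or Browserbase APIs. The planner stands in for the step that turns intent into browser operations, and the executor stands in for the step that runs them in a remote browser.

```typescript
// Hypothetical types for illustration only; these are NOT Stagehand or
// Browserbase APIs.
type BrowserAction =
  | { kind: "goto"; url: string }
  | { kind: "act"; instruction: string }
  | { kind: "done"; summary: string };

interface Planner {
  // Picks the next action from the user's goal and the actions so far.
  nextAction(goal: string, history: BrowserAction[]): Promise<BrowserAction>;
}

interface Executor {
  // Carries out a single action in the browser session.
  execute(action: BrowserAction): Promise<void>;
}

// Plan one action, execute it, and repeat until the planner reports
// "done" or the step budget is exhausted. Returns every planned action.
async function runAgent(
  goal: string,
  planner: Planner,
  executor: Executor,
  maxSteps = 10,
): Promise<BrowserAction[]> {
  const history: BrowserAction[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = await planner.nextAction(goal, history);
    history.push(action);
    if (action.kind === "done") {
      break; // Nothing left to execute.
    }
    await executor.execute(action);
  }
  return history;
}
```

The `maxSteps` cap is a design choice in this sketch: it bounds the loop if the planner never reports completion.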
For more on this, check out the code at this commit.
We welcome contributions! Please feel free to open issues and pull requests.
Open Operator is open source software licensed under the MIT license.
This project is inspired by OpenAI's Operator feature and builds upon various open source technologies including Next.js, React, Browserbase, and Stagehand.