skills/open-source/references/actor.md
Low-level Playwright-like browser automation built on CDP. Use for precise, deterministic operations alongside the AI agent.
Browser (BrowserSession) → Page → Element
                                → Mouse
                                → AI Features (extract, find by prompt)
NOT Playwright — built on CDP with a subset of the Playwright API. Key differences:
- get_elements_by_css_selector() returns immediately (no visibility wait)
- evaluate() requires arrow function format: () => {}

from browser_use import Browser

browser = Browser()
await browser.start()
page = await browser.new_page("https://example.com") # Open new tab
pages = await browser.get_pages() # List all pages
current = await browser.get_current_page() # Active page
await browser.close_page(page) # Close tab
await browser.stop() # Cleanup
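
Because get_elements_by_css_selector() returns immediately, a query issued right after navigation can come back empty. The following is a minimal polling sketch, assuming the method is awaitable and returns an empty list when nothing matches. `wait_for_elements`, its `timeout` and `interval` parameters, and the `FakePage` stand-in are hypothetical names used for illustration; they are not part of the library.

```python
import asyncio
import time


async def wait_for_elements(page, selector, timeout=5.0, interval=0.1):
    """Poll page.get_elements_by_css_selector() until it returns matches.

    Hypothetical helper, not a library function. Returns whatever the last
    query produced, which is an empty list if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        elements = await page.get_elements_by_css_selector(selector)
        if elements or time.monotonic() >= deadline:
            return elements
        await asyncio.sleep(interval)


class FakePage:
    """Stand-in for a real Page: matches only appear on the third query."""

    def __init__(self):
        self.calls = 0

    async def get_elements_by_css_selector(self, selector):
        self.calls += 1
        return ["<button>"] if self.calls >= 3 else []


fake = FakePage()
found = asyncio.run(wait_for_elements(fake, "button.submit", interval=0.01))
```

With a real page the call would look like `await wait_for_elements(page, "button.submit")`. The helper only checks that the selector matches something; it does not check visibility.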
Page methods:
- goto(url: str): Navigate to URL
- go_back(): Back in history
- go_forward(): Forward in history
- reload(): Reload page
- get_elements_by_css_selector(selector: str) -> list[Element]: Immediate return
- get_element(backend_node_id: int) -> Element: By CDP node ID
- get_element_by_prompt(prompt: str, llm) -> Element | None: LLM-powered
- must_get_element_by_prompt(prompt: str, llm) -> Element: Raises if not found
- evaluate(page_function: str, *args) -> str: Execute JS (arrow function format)
- press(key: str): Keyboard input
- set_viewport_size(width: int, height: int)
- screenshot(format='jpeg', quality=None) -> str: Base64 screenshot
- get_url() -> str
- get_title() -> str
- mouse -> Mouse: Mouse instance
- extract_content(prompt: str, structured_output: type[T], llm) -> T: LLM-powered extraction

Element methods:
- click(button='left', click_count=1, modifiers=None)
- fill(text: str, clear=True): Clear field and type
- hover()
- focus()
- check(): Toggle checkbox/radio
- select_option(values: str | list[str]): Select dropdown
- drag_to(target: Element | Position)
- get_attribute(name: str) -> str | None
- get_bounding_box() -> BoundingBox | None
- get_basic_info() -> ElementInfo
- screenshot(format='jpeg') -> str

mouse = page.mouse
await mouse.click(x=100, y=200, button='left', click_count=1)
await mouse.move(x=500, y=600, steps=1)
await mouse.down(button='left')
await mouse.up(button='left')
await mouse.scroll(x=0, y=100, delta_x=None, delta_y=-500)
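
The primitive mouse calls above can be combined into higher-level gestures. As a rough sketch, a coordinate-based drag can be built from move, down, and up, using only the keyword arguments shown above. `drag` and `RecordingMouse` are hypothetical names introduced here so the example runs without a browser.

```python
import asyncio


async def drag(mouse, start, end, steps=10):
    """Press at `start`, glide to `end`, then release (hypothetical helper)."""
    await mouse.move(x=start[0], y=start[1], steps=1)
    await mouse.down(button="left")
    await mouse.move(x=end[0], y=end[1], steps=steps)
    await mouse.up(button="left")


class RecordingMouse:
    """Stand-in that records calls instead of driving a browser."""

    def __init__(self):
        self.events = []

    async def move(self, x, y, steps=1):
        self.events.append(("move", x, y, steps))

    async def down(self, button="left"):
        self.events.append(("down", button))

    async def up(self, button="left"):
        self.events.append(("up", button))


recorder = RecordingMouse()
asyncio.run(drag(recorder, (100, 200), (500, 600), steps=5))
```

With a live page, `await drag(page.mouse, (100, 200), (500, 600))` would issue the same sequence. When the drag starts on a known element, the element-level drag_to() listed earlier is likely the simpler choice.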
import asyncio

# Import path assumed; adjust to your installed browser-use version
from browser_use import Agent, Browser, ChatOpenAI

async def main():
    llm = ChatOpenAI(api_key="your-key")
    browser = Browser()
    await browser.start()

    # Actor: precise navigation
    page = await browser.new_page("https://github.com/login")
    email = await page.must_get_element_by_prompt("username field", llm=llm)
    await email.fill("your-username")

    # Agent: AI-driven completion
    agent = Agent(browser=browser, llm=llm)
    await agent.run("Complete login and navigate to repositories")

    await browser.stop()

asyncio.run(main())
title = await page.evaluate('() => document.title')
result = await page.evaluate('(x, y) => x + y', 10, 20)
stats = await page.evaluate('''() => ({
    url: location.href,
    links: document.querySelectorAll('a').length
})''')
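
The method list gives evaluate() a return type of str, so an object result such as `stats` presumably arrives serialized rather than as a dict. Assuming the serialization is JSON (the text does not confirm this), a small decoding helper might look like the sketch below. `parse_evaluate_result` is a hypothetical name, and the sample strings are made up for illustration.

```python
import json


def parse_evaluate_result(raw):
    """Decode an evaluate() return value, falling back to the raw value.

    Hypothetical helper. Note that any string that happens to be valid JSON
    (for example a page title of "123") is decoded as well.
    """
    try:
        return json.loads(raw)
    except (TypeError, ValueError):
        # Not JSON (plain text such as a title) or not a string at all.
        return raw


# Sample inputs standing in for real evaluate() results.
stats = parse_evaluate_result('{"url": "https://example.com/", "links": 3}')
title = parse_evaluate_result("Example Domain")
```

Here `stats` becomes a dict whose fields can be read directly, while the plain-text title passes through unchanged.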
from pydantic import BaseModel

class ProductInfo(BaseModel):
    name: str
    price: float

product = await page.extract_content("Extract product name and price", ProductInfo, llm=llm)
- Use asyncio.sleep() after navigation-triggering actions
- Call browser.stop() for cleanup
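
A fixed sleep after a navigation-triggering action can be too short or needlessly long. One possible refinement, sketched below, sleeps in short intervals and checks get_url() until the URL changes. It assumes get_url() is awaitable; `wait_for_url_change`, its parameters, and `FakePage` are hypothetical names used only for this example.

```python
import asyncio
import time


async def wait_for_url_change(page, old_url, timeout=5.0, interval=0.1):
    """Return the new URL once it differs from old_url, or None on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = await page.get_url()
        if current != old_url:
            return current
        await asyncio.sleep(interval)
    return None


class FakePage:
    """Stand-in for a real Page: the URL changes on a chosen call number."""

    def __init__(self, change_on_call):
        self.calls = 0
        self.change_on_call = change_on_call

    async def get_url(self):
        self.calls += 1
        if self.calls >= self.change_on_call:
            return "https://example.com/dashboard"
        return "https://example.com/login"


login_url = "https://example.com/login"
new_url = asyncio.run(wait_for_url_change(FakePage(3), login_url, interval=0.01))
timed_out = asyncio.run(
    wait_for_url_change(FakePage(10**6), login_url, timeout=0.05, interval=0.01)
)
```

In a real script, wrapping the whole session in try/finally with `await browser.stop()` in the finally clause keeps cleanup reliable even when a wait like this times out.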