ComputerUse

apps/docs/src/content/docs/en/java-sdk/computer-use.mdx

0.173.0 · 7.5 KB
Desktop automation operations for a Sandbox.

Provides a Java facade for computer-use features including desktop session management, screenshots, mouse and keyboard automation, display/window inspection, and screen recording.

Methods

start()

java
public ComputerUseStartResponse start()

Starts the computer-use desktop stack (VNC/noVNC and related processes).

Returns:

  • ComputerUseStartResponse - start response containing process status details

stop()

java
public ComputerUseStopResponse stop()

Stops all computer-use desktop processes.

Returns:

  • ComputerUseStopResponse - stop response containing process status details

getStatus()

java
public ComputerUseStatusResponse getStatus()

Returns the current computer-use status.

Returns:

  • ComputerUseStatusResponse - overall computer-use status

takeScreenshot()

java
public ScreenshotResponse takeScreenshot()

Captures a full-screen screenshot without the cursor.

Returns:

  • ScreenshotResponse - screenshot payload (base64 image and metadata)

takeScreenshot()

java
public ScreenshotResponse takeScreenshot(boolean showCursor)

Captures a full-screen screenshot.

Parameters:

  • showCursor boolean - whether to render the cursor in the screenshot

Returns:

  • ScreenshotResponse - screenshot payload (base64 image and metadata)
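
Because the payload is described as a base64 image, a caller will usually decode it before saving or inspecting it. The following is a minimal sketch using only the JDK. This reference does not show how the base64 string is read from ScreenshotResponse, so the sketch takes a plain String; the class and method names are hypothetical and not part of the SDK.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

// Hypothetical helper, not part of the SDK.
final class ScreenshotDecoder {
    private ScreenshotDecoder() {}

    // Decodes a base64 image payload into raw bytes.
    // Assumption: standard (not URL-safe) base64 with no data-URI prefix.
    static byte[] decode(String base64Image) {
        return Base64.getDecoder().decode(base64Image);
    }

    // Writes the decoded bytes to the target path, replacing any existing file.
    static Path save(String base64Image, Path target) throws IOException {
        return Files.write(target, decode(base64Image));
    }

    public static void main(String[] args) throws IOException {
        // "aGVsbG8=" encodes the ASCII text "hello"; a real payload would hold image bytes.
        Path out = Files.createTempFile("screenshot", ".bin");
        save("aGVsbG8=", out);
        System.out.println(Files.size(out)); // prints 5
    }
}
```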

takeRegionScreenshot()

java
public ScreenshotResponse takeRegionScreenshot(int x, int y, int width, int height)

Captures a screenshot of a rectangular region without the cursor.

Parameters:

  • x int - region top-left X coordinate
  • y int - region top-left Y coordinate
  • width int - region width in pixels
  • height int - region height in pixels

Returns:

  • ScreenshotResponse - region screenshot payload
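
This reference does not say how a region that extends past the screen is handled. A cautious caller might clip the rectangle to the display bounds before calling takeRegionScreenshot. The sketch below shows one way to do that; the helper name is hypothetical, and the screen size is assumed to be known by the caller (for example, from getDisplayInfo()).

```java
// Hypothetical helper, not part of the SDK.
final class RegionClamp {
    private RegionClamp() {}

    // Returns {x, y, width, height} clipped to a screen of the given size.
    // A region entirely outside the screen collapses to zero width or height.
    static int[] clamp(int x, int y, int width, int height, int screenWidth, int screenHeight) {
        int left = Math.max(0, Math.min(x, screenWidth));
        int top = Math.max(0, Math.min(y, screenHeight));
        int right = Math.max(left, Math.min(x + width, screenWidth));
        int bottom = Math.max(top, Math.min(y + height, screenHeight));
        return new int[] {left, top, right - left, bottom - top};
    }

    public static void main(String[] args) {
        // A 400x300 region near the bottom-right corner of a 1920x1080 screen.
        int[] r = clamp(1800, 1000, 400, 300, 1920, 1080);
        System.out.println(java.util.Arrays.toString(r)); // prints [1800, 1000, 120, 80]
    }
}
```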

takeCompressedScreenshot()

java
public ScreenshotResponse takeCompressedScreenshot(String format, int quality, double scale)

Captures a compressed full-screen screenshot.

Parameters:

  • format String - output image format (for example: png, jpeg, webp)
  • quality int - compression quality (typically 1-100, format dependent)
  • scale double - screenshot scale factor (for example: 0.5 for 50%)

Returns:

  • ScreenshotResponse - compressed screenshot payload

click()

java
public MouseClickResponse click(int x, int y)

Performs a left mouse click at the given coordinates.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate

Returns:

  • MouseClickResponse - click response with resulting cursor position

click()

java
public MouseClickResponse click(int x, int y, String button)

Performs a mouse click at the given coordinates with a specific button.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate
  • button String - button type (left, right, middle)

Returns:

  • MouseClickResponse - click response with resulting cursor position

doubleClick()

java
public MouseClickResponse doubleClick(int x, int y)

Performs a double left-click at the given coordinates.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate

Returns:

  • MouseClickResponse - click response with resulting cursor position

moveMouse()

java
public MousePositionResponse moveMouse(int x, int y)

Moves the mouse cursor to the given coordinates.

Parameters:

  • x int - target X coordinate
  • y int - target Y coordinate

Returns:

  • MousePositionResponse - new mouse position

getMousePosition()

java
public MousePositionResponse getMousePosition()

Returns the current mouse position.

Returns:

  • MousePositionResponse - current mouse cursor coordinates

drag()

java
public MouseDragResponse drag(int startX, int startY, int endX, int endY)

Drags the mouse from one point to another using the left button.

Parameters:

  • startX int - drag start X coordinate
  • startY int - drag start Y coordinate
  • endX int - drag end X coordinate
  • endY int - drag end Y coordinate

Returns:

  • MouseDragResponse - drag response with resulting cursor position

scroll()

java
public ScrollResponse scroll(int x, int y, int deltaX, int deltaY)

Scrolls at the given coordinates.

The current toolbox API supports only directional scrolling (up or down) by an amount, so this method maps deltaY to a vertical scroll direction and magnitude. If deltaY is 0, deltaX is used instead.

Parameters:

  • x int - anchor X coordinate
  • y int - anchor Y coordinate
  • deltaX int - horizontal delta (used only when deltaY == 0)
  • deltaY int - vertical delta

Returns:

  • ScrollResponse - scroll response indicating operation success
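
The delta-to-direction mapping described above can be sketched as follows. This is an illustration of the documented fallback rule, not the SDK's implementation: the class name is hypothetical, and the sign convention (negative scrolls up, positive scrolls down) is an assumption that this reference does not confirm.

```java
// Hypothetical illustration, not the SDK's implementation.
final class ScrollMapping {
    final String direction;
    final int amount;

    private ScrollMapping(String direction, int amount) {
        this.direction = direction;
        this.amount = amount;
    }

    // deltaY takes precedence; deltaX is consulted only when deltaY == 0.
    static ScrollMapping from(int deltaX, int deltaY) {
        int effective = (deltaY != 0) ? deltaY : deltaX;
        // Assumption: negative values scroll up, positive values scroll down.
        String direction = effective < 0 ? "up" : "down";
        return new ScrollMapping(direction, Math.abs(effective));
    }

    public static void main(String[] args) {
        ScrollMapping m = from(0, -3);
        System.out.println(m.direction + " " + m.amount); // prints up 3
    }
}
```

Note that with this rule a call such as scroll(x, y, 7, 2) ignores deltaX entirely, since deltaY is non-zero.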

typeText()

java
public void typeText(String text)

Types text using keyboard automation.

Parameters:

  • text String - text to type

pressKey()

java
public void pressKey(String key)

Presses a single key.

Parameters:

  • key String - key to press. Canonical names include enter, escape, tab, letters, digits, unshifted punctuation, function keys, and grammar-safe numpad names such as num_plus. Named keys are case-insensitive, and common aliases such as Return and Escape are normalized.
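
To make the normalization contract more concrete, here is a rough sketch of what case-insensitive matching with alias handling could look like. It is not the SDK's code: the class name and alias table are hypothetical, only the Return alias comes from the description above, and leaving single-character keys untouched is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration, not the SDK's implementation.
final class KeyNames {
    private static final Map<String, String> ALIASES = new HashMap<>();
    static {
        // Only this alias is taken from the description; a real table would be larger.
        ALIASES.put("return", "enter");
    }

    private KeyNames() {}

    static String normalize(String key) {
        if (key == null || key.isEmpty()) {
            throw new IllegalArgumentException("key must not be empty");
        }
        // Assumption: single characters (letters, digits, punctuation) pass through unchanged.
        if (key.length() == 1) {
            return key;
        }
        // Named keys are matched case-insensitively, then aliases are resolved.
        String lower = key.toLowerCase(Locale.ROOT);
        return ALIASES.getOrDefault(lower, lower);
    }

    public static void main(String[] args) {
        System.out.println(normalize("Return")); // prints enter
        System.out.println(normalize("Escape")); // prints escape
    }
}
```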

pressHotkey()

java
public void pressHotkey(String... keys)

Presses a key combination as a hotkey sequence.

Keys are joined with + before being sent (for example, pressHotkey("ctrl", "shift", "t") -> "ctrl+shift+t"). The resulting value is a single atomic chord and uses the same normalized key contract as pressKey(String).

Parameters:

  • keys String... - hotkey parts to combine
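
The joining step described above amounts to a single String.join call. The sketch below is illustrative only: the class name is hypothetical, and rejecting an empty argument list is an assumption rather than documented behavior.

```java
// Hypothetical illustration of the documented joining step.
final class HotkeyChord {
    private HotkeyChord() {}

    static String join(String... keys) {
        // Assumption: an empty chord is treated as a caller error.
        if (keys == null || keys.length == 0) {
            throw new IllegalArgumentException("at least one key is required");
        }
        return String.join("+", keys);
    }

    public static void main(String[] args) {
        System.out.println(join("ctrl", "shift", "t")); // prints ctrl+shift+t
    }
}
```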

getDisplayInfo()

java
public DisplayInfoResponse getDisplayInfo()

Returns display configuration information.

Returns:

  • DisplayInfoResponse - display information including available displays and their geometry

getWindows()

java
public WindowsResponse getWindows()

Returns the currently open windows.

Returns:

  • WindowsResponse - window list and metadata

startRecording()

java
public Recording startRecording()

Starts a recording with default options.

Returns:

  • Recording - newly started recording metadata

startRecording()

java
public Recording startRecording(String label)

Starts a recording with an optional label.

Parameters:

  • label String - optional recording label

Returns:

  • Recording - newly started recording metadata

stopRecording()

java
public Recording stopRecording(String id)

Stops an active recording.

Parameters:

  • id String - recording identifier

Returns:

  • Recording - finalized recording metadata
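
Since stopRecording needs the identifier of a recording started earlier, callers may want to pair the two calls so that the recording is stopped even when the automation in between fails. The sketch below shows that try/finally pattern in isolation. This reference does not show how the identifier is read from Recording, so the sketch works with plain String ids and functional stand-ins; all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical helper, not part of the SDK.
final class RecordingScope {
    private RecordingScope() {}

    // Runs the action between start and stop; stop runs even if the action throws.
    static void record(Supplier<String> start, Consumer<String> stop, Runnable action) {
        String id = start.get();
        try {
            action.run();
        } finally {
            stop.accept(id);
        }
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        record(
            () -> { log.add("start"); return "rec-1"; }, // stands in for startRecording(label)
            id -> log.add("stop " + id),                 // stands in for stopRecording(id)
            () -> log.add("automation"));                // stands in for clicks, typing, etc.
        System.out.println(log); // prints [start, automation, stop rec-1]
    }
}
```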

listRecordings()

java
public ListRecordingsResponse listRecordings()

Lists all recordings for the current sandbox session.

Returns:

  • ListRecordingsResponse - recordings list response

getRecording()

java
public Recording getRecording(String id)

Returns metadata for a specific recording.

Parameters:

  • id String - recording identifier

Returns:

  • Recording - recording details

downloadRecording()

java
public File downloadRecording(String id)

Downloads a recording file.

Parameters:

  • id String - recording identifier

Returns:

  • File - handle to the downloaded local (possibly temporary) file, as returned by the API client
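
Because the returned file may be temporary, it can be worth copying it to a location the caller controls. A minimal sketch using java.nio.file follows; the helper name, target directory, and file name are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of the SDK.
final class RecordingFiles {
    private RecordingFiles() {}

    // Copies a downloaded file into targetDir under fileName and returns the new path.
    // An existing file with the same name is overwritten.
    static Path keep(File downloaded, Path targetDir, String fileName) throws IOException {
        Files.createDirectories(targetDir);
        return Files.copy(downloaded.toPath(), targetDir.resolve(fileName),
                StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for the File returned by downloadRecording(id).
        File downloaded = File.createTempFile("recording", ".tmp");
        Path kept = keep(downloaded, Files.createTempDirectory("recordings"), "demo-recording.bin");
        System.out.println(Files.exists(kept)); // prints true
    }
}
```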

deleteRecording()

java
public void deleteRecording(String id)

Deletes a recording.

Parameters:

  • id String - recording identifier