multimodal/gui-agent/operator-adb/README.md
Adb Operator is an ADB-based Android operator for GUI Agent. It provides a set of APIs for interacting with Android devices, including taking screenshots, performing touch operations, and sending keyboard input.
```bash
npm install @gui-agent/operator-adb
```

Or with yarn:

```bash
yarn add @gui-agent/operator-adb
```

Or with pnpm:

```bash
pnpm add @gui-agent/operator-adb
```
```typescript
import { AdbOperator } from '@gui-agent/operator-adb';
import { ConsoleLogger, LogLevel } from '@agent-infra/logger';

// Create a logger
const logger = new ConsoleLogger(undefined, LogLevel.DEBUG);

// Create an operator instance
const operator = new AdbOperator(logger);

// Initialize the operator
await operator.initialize();

// Take a screenshot
const screenshot = await operator.doScreenshot();
console.log('Screenshot taken:', screenshot.status);

// Execute actions
const result = await operator.doExecute({
  actions: [
    {
      type: 'click',
      x: 500,
      y: 300
    },
    {
      type: 'type',
      text: 'Hello, World!'
    }
  ]
});
```
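Before handing an action list to the operator, it can be useful to sanity-check it locally. The following is a minimal sketch, assuming only the two action shapes shown in the usage example (`click` with `x`/`y`, and `type` with `text`). The names `ClickAction`, `TypeAction`, `Action`, and `validateActions` are hypothetical helpers defined here for illustration; they are not exported by `@gui-agent/operator-adb`.

```typescript
// Hypothetical local types mirroring the two action shapes from the usage example.
// They are NOT imported from @gui-agent/operator-adb.
type ClickAction = { type: 'click'; x: number; y: number };
type TypeAction = { type: 'type'; text: string };
type Action = ClickAction | TypeAction;

// Returns a list of human-readable problems; an empty list means nothing was flagged.
function validateActions(actions: Action[]): string[] {
  const problems: string[] = [];
  actions.forEach((action, index) => {
    if (action.type === 'click') {
      const coordsOk =
        Number.isFinite(action.x) &&
        Number.isFinite(action.y) &&
        action.x >= 0 &&
        action.y >= 0;
      if (!coordsOk) {
        problems.push(`action ${index}: click needs non-negative finite x and y`);
      }
    } else if (action.text.length === 0) {
      problems.push(`action ${index}: type needs non-empty text`);
    }
  });
  return problems;
}

// The same two actions as in the usage example.
const actions: Action[] = [
  { type: 'click', x: 500, y: 300 },
  { type: 'type', text: 'Hello, World!' },
];
console.log(validateActions(actions)); // prints []
```

A check like this only catches malformed input; whether the coordinates fall inside the device's screen bounds still depends on the connected device.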
AdbOperator

The main class that provides methods to interact with Android devices.

`constructor(logger: ConsoleLogger = defaultLogger)`

- `logger`: A ConsoleLogger instance for logging. Defaults to a ConsoleLogger with `LogLevel.DEBUG`.

`initialize(): Promise<void>`

Initializes the operator by connecting to an Android device.

`screenshot(): Promise<ScreenshotOutput>`

Takes a screenshot of the Android device screen.

Returns a ScreenshotOutput object containing:

- `base64`: The base64-encoded image data
- `status`: The status of the operation ('success' or 'error')

`execute(params: ExecuteParams): Promise<ExecuteOutput>`

Executes a list of actions on the Android device.

- `params`: An object containing:
  - `actions`: An array of action objects

Returns an ExecuteOutput object containing:

- `status`: The status of the operation ('success' or 'error')

Supported action types:

- `click`, `tap`: Perform a tap at specified coordinates
- `swipe`: Swipe from one position to another
- `long_press`: Long press at specified coordinates
- `type`: Type text
- `hotkey`: Press a key combination
- `press`: Press a key
- `release`: Release a key
- `wait`: Wait for a specified time

This project uses YADB for enhanced ADB functionality, particularly for screenshot capture in restricted apps and other advanced features. We're grateful to the YADB team for their excellent work extending ADB capabilities.
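Once a screenshot has been captured, the documented `base64` and `status` fields can be turned into raw image bytes. This is a minimal sketch under stated assumptions: the `ScreenshotOutput` interface below is a local stand-in that mirrors only the two fields described above, and `decodeScreenshot` is a hypothetical helper, not part of the package.

```typescript
import { Buffer } from 'node:buffer';

// Hypothetical local type mirroring the documented ScreenshotOutput fields;
// it is not imported from @gui-agent/operator-adb.
interface ScreenshotOutput {
  base64: string;
  status: 'success' | 'error';
}

// Decode the screenshot into raw image bytes, failing loudly on an error status.
function decodeScreenshot(output: ScreenshotOutput): Buffer {
  if (output.status !== 'success') {
    throw new Error(`Screenshot failed with status: ${output.status}`);
  }
  return Buffer.from(output.base64, 'base64');
}

// Stand-in payload for illustration; a real call would pass the operator's output.
const sample: ScreenshotOutput = { base64: 'aGVsbG8=', status: 'success' };
const bytes = decodeScreenshot(sample);
console.log(bytes.length); // prints 5
```

The resulting `Buffer` can be written to disk with Node's `fs` module. The image format is not stated here, so it is worth confirming before choosing a file extension.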
This project is licensed under the Apache-2.0 License.