packages/computeruse/crates/computeruse-cli/README.md
The ComputerUse CLI is a command-line tool for managing the ComputerUse project. It covers version management, releases, Azure VM deployment, and MCP server interaction.
Windows (Recommended - npm wrapper):
# Run directly without installation
npx @mediar-ai/cli --help
bunx @mediar-ai/cli --help
# Or install globally
npm install -g @mediar-ai/cli
macOS / Linux (Compile from Source):
⚠️ The npm package @mediar-ai/cli only includes Windows binaries. Other platforms must compile from source.
# From the workspace root
cargo build --release --bin computeruse
# Install globally (optional)
cargo install --path crates/computeruse-cli
# Bump version
computeruse patch # x.y.Z+1
computeruse minor # x.Y+1.0
computeruse major # X+1.0.0
# Sync all package versions
computeruse sync
# Show current status
computeruse status
# Tag and push
computeruse tag
# Full release (bump + tag + push)
computeruse release # patch release
computeruse release minor # minor release
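The bump rules noted in the comments above (patch, minor, major) can be sketched as a small function. This is only an illustration of the arithmetic, not the CLI's implementation; `bumpVersion` is a hypothetical name, and the sketch assumes plain `x.y.z` versions without pre-release suffixes.

```javascript
// Hypothetical helper illustrating the bump rules; not part of the CLI.
function bumpVersion(version, kind = "patch") {
  const match = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (!match) {
    throw new Error(`not an x.y.z version: ${version}`);
  }
  const [major, minor, patch] = match.slice(1).map(Number);
  switch (kind) {
    case "major":
      return `${major + 1}.0.0`; // X+1.0.0
    case "minor":
      return `${major}.${minor + 1}.0`; // x.Y+1.0
    case "patch":
      return `${major}.${minor}.${patch + 1}`; // x.y.Z+1
    default:
      throw new Error(`unknown bump kind: ${kind}`);
  }
}

console.log(bumpVersion("1.4.9", "patch")); // 1.4.10
console.log(bumpVersion("1.4.9", "minor")); // 1.5.0
console.log(bumpVersion("1.4.9", "major")); // 2.0.0
```

Note that a minor or major bump resets the lower components to zero, which is why `1.4.9` becomes `1.5.0` rather than `1.5.9`.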
Execute automation workflows from YAML or JSON files:
# Execute a workflow file
computeruse mcp run workflow.yml
# Execute with verbose logging
computeruse mcp run workflow.yml --verbose
# Dry run (validate without executing)
computeruse mcp run workflow.yml --dry-run
# Use specific MCP server command
computeruse mcp run workflow.yml --command "npx -y computeruse-mcp-agent@latest"
# Use HTTP MCP server
computeruse mcp run workflow.yml --url http://localhost:3000/mcp
# Pass input values to workflow (available as env variables in scripts)
computeruse mcp run workflow.yml --inputs '{"username":"john","api_key":"abc123"}'
# Combine inputs with other options
computeruse mcp run workflow.yml --inputs '{"count":5}' --verbose
The --inputs parameter allows you to pass initial values to your workflow that can be accessed by JavaScript/Python scripts:
# Pass inputs as JSON
computeruse mcp run workflow.yml --inputs '{"user":"alice","count":42,"enabled":true}'
These inputs are accessible in your workflow scripts:
- JavaScript: the env.inputs object, or directly as env.username, env.count, etc.
- Python: the env dictionary

Example workflow file (workflow.yml):
tool_name: execute_sequence
arguments:
  steps:
    - tool_name: navigate_browser
      arguments:
        url: "https://example.com"
    - tool_name: click_element
      arguments:
        selector: "role:Button && name:Submit"
    - tool_name: get_applications_and_windows_list
      id: get_apps
    - tool_name: run_command
      engine: javascript
      id: extract_pid
      run: |
        const apps = get_apps_result[0]?.applications || [];
        const focused = apps.find(app => app.is_focused);
        return { pid: focused?.pid || 0 };
    - tool_name: get_window_tree
      arguments:
        pid: "{{extract_pid.pid}}"
      id: capture_result
  output_parser:
    ui_tree_source_step_id: capture_result
    javascript_code: |
      // Extract all checkbox names
      const results = [];
      function findElementsRecursively(element) {
        if (element.attributes && element.attributes.role === 'CheckBox') {
          const item = {
            name: element.attributes.name || ''
          };
          results.push(item);
        }
        if (element.children) {
          for (const child of element.children) {
            findElementsRecursively(child);
          }
        }
      }
      findElementsRecursively(tree);
      return results;
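To try the output parser's traversal outside a workflow, the same logic can be wrapped in a function and run under plain Node.js. The sample tree below is hypothetical: its shape (`attributes.role`, `attributes.name`, `children`) is inferred only from the fields the parser reads, and a real UI tree will likely carry more properties.

```javascript
// Same recursive traversal as the output_parser, as a standalone function.
function extractCheckboxNames(tree) {
  const results = [];
  function visit(element) {
    if (element.attributes && element.attributes.role === "CheckBox") {
      results.push({ name: element.attributes.name || "" });
    }
    for (const child of element.children || []) {
      visit(child);
    }
  }
  visit(tree);
  return results;
}

// Hypothetical sample tree for illustration only.
const sampleTree = {
  attributes: { role: "Window", name: "Settings" },
  children: [
    { attributes: { role: "CheckBox", name: "Remember me" } },
    {
      attributes: { role: "Group" },
      children: [
        { attributes: { role: "CheckBox" } }, // unnamed checkbox
        { attributes: { role: "Button", name: "Submit" } },
      ],
    },
  ],
};

console.log(JSON.stringify(extractCheckboxNames(sampleTree)));
// [{"name":"Remember me"},{"name":""}]
```

Checkboxes without a name are kept with an empty string, matching the `|| ''` fallback in the parser.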
JavaScript execution in workflows:
tool_name: execute_sequence
arguments:
  steps:
    - tool_name: run_command
      arguments:
        engine: "javascript"
        run: |
          // Access inputs passed from CLI via --inputs parameter
          console.log(`Processing for user: ${env.username}`);
          console.log(`Count value: ${env.count}`);
          // Or access the entire inputs object
          const allInputs = env.inputs;
          console.log(`All inputs:`, JSON.stringify(allInputs));
          // Use inputs in your logic
          for (let i = 0; i < env.count; i++) {
            console.log(`Processing item ${i + 1} for ${env.username}`);
          }
          return {
            processed_by: env.username,
            items_processed: env.count
          };
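The script above reads inputs both as `env.username` and through `env.inputs`. A minimal model of that behaviour is sketched here, assuming the `--inputs` JSON object is exposed under `inputs` and also spread onto `env`. `buildScriptEnv` is a hypothetical name, and the real engine may populate `env` differently.

```javascript
// Hypothetical model of how --inputs could surface on `env`; an assumption,
// not the CLI's actual implementation.
function buildScriptEnv(inputsJson) {
  const inputs = JSON.parse(inputsJson);
  if (inputs === null || typeof inputs !== "object" || Array.isArray(inputs)) {
    throw new TypeError("--inputs must be a JSON object");
  }
  // Each key is available directly, and the whole object under `inputs`.
  return { ...inputs, inputs };
}

const env = buildScriptEnv('{"username":"john","count":5}');
console.log(env.username); // john
console.log(env.inputs.count); // 5
```

Because the value is parsed as JSON, `count` arrives as a number and `enabled` as a boolean, so the loop condition `i < env.count` in the workflow script works without conversion.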
Original example:
tool_name: execute_sequence
arguments:
  steps:
    - tool_name: run_command
      arguments:
        engine: "node"
        script: |
          // Access desktop automation APIs
          const elements = await desktop.locator('role:button').all();
          log(`Found ${elements.length} buttons`);
          // Interact with UI elements
          for (const element of elements) {
            const name = await element.name();
            if (name.includes('Submit')) {
              await element.click();
              break;
            }
          }
          return {
            buttons_found: elements.length,
            action: 'clicked_submit'
          };
Execute individual MCP tools directly:
# Execute a single tool
computeruse mcp exec get_applications
# Execute with arguments
computeruse mcp exec click_element '{"selector": "role:Button && name:OK"}'
# Use different MCP server
computeruse mcp exec --url http://localhost:3000/mcp validate_element '{"selector": "#button"}'
Chat with MCP servers interactively:
# Start chat session (uses local MCP server by default)
computeruse mcp chat
# Chat with remote MCP server
computeruse mcp chat --url https://your-server.com/mcp
# Chat with specific MCP server command
computeruse mcp chat --command "node my-mcp-server.js"
computeruse mcp chat --url https://xxx/mcp
The CLI supports multiple ways to connect to MCP servers:
# Local MCP server (default - uses @latest for compatibility)
computeruse mcp run workflow.yml
# Specific version
computeruse mcp run workflow.yml --command "npx -y computeruse-mcp-agent@<version>"
# HTTP server
computeruse mcp run workflow.yml --url http://localhost:3000/mcp
# Custom server command
computeruse mcp run workflow.yml --command "python my_mcp_server.py"
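The connection options above can be read as a simple resolution order. The sketch below is one plausible ordering, not the CLI's confirmed behaviour: it assumes explicit flags win over the `MCP_SERVER_URL` and `MCP_SERVER_COMMAND` environment variables, that a URL wins over a command, and that command-launched servers communicate over stdio. `resolveServer` and the returned shape are hypothetical.

```javascript
// Default taken from the README; everything else here is an assumed model.
const DEFAULT_COMMAND = "npx -y computeruse-mcp-agent@latest";

// flags: { url?, command? } from the command line
// environment: e.g. process.env
function resolveServer(flags = {}, environment = {}) {
  if (flags.url) return { transport: "http", target: flags.url };
  if (flags.command) return { transport: "stdio", target: flags.command };
  if (environment.MCP_SERVER_URL) {
    return { transport: "http", target: environment.MCP_SERVER_URL };
  }
  if (environment.MCP_SERVER_COMMAND) {
    return { transport: "stdio", target: environment.MCP_SERVER_COMMAND };
  }
  return { transport: "stdio", target: DEFAULT_COMMAND };
}

console.log(resolveServer({ url: "http://localhost:3000/mcp" }));
console.log(resolveServer({}, { MCP_SERVER_COMMAND: "python my_mcp_server.py" }));
console.log(resolveServer());
```

Running the CLI with `--verbose` is the reliable way to confirm which server it actually connects to.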
The CLI supports both YAML and JSON workflow files:
Direct workflow (workflow.yml):
steps:
  - tool_name: navigate_browser
    arguments:
      url: "https://example.com"
stop_on_error: true
Tool call wrapper (workflow.json):
{
  "tool_name": "execute_sequence",
  "arguments": {
    "steps": [
      {
        "tool_name": "navigate_browser",
        "arguments": {
          "url": "https://example.com"
        }
      }
    ]
  }
}
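The two formats carry the same information: the direct form is the `arguments` object of an `execute_sequence` call. A sketch of normalizing either shape into the wrapper form follows. `toToolCall` is a hypothetical helper, and treating top-level keys such as `stop_on_error` as `execute_sequence` arguments is an assumption based on the examples above.

```javascript
// Hypothetical normalizer: accepts a parsed workflow in either format
// and returns the tool-call wrapper form.
function toToolCall(workflow) {
  if (workflow === null || typeof workflow !== "object") {
    throw new TypeError("workflow must be an object");
  }
  // Already a tool call wrapper: return unchanged.
  if (workflow.tool_name === "execute_sequence" && workflow.arguments) {
    return workflow;
  }
  // Direct workflow: the whole object becomes the arguments.
  if (Array.isArray(workflow.steps)) {
    return { tool_name: "execute_sequence", arguments: workflow };
  }
  throw new Error("unrecognized workflow format");
}

const direct = {
  steps: [
    { tool_name: "navigate_browser", arguments: { url: "https://example.com" } },
  ],
  stop_on_error: true,
};

console.log(JSON.stringify(toToolCall(direct), null, 2));
```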
# Continue on errors
computeruse mcp run workflow.yml --continue-on-error
# Custom timeout
computeruse mcp run workflow.yml --timeout 30000
# Detailed error output
computeruse mcp run workflow.yml --verbose
The CLI supports executing code within workflows using the run_command tool in engine mode, providing access to desktop automation APIs:
Available Engines:
- nodejs - Full Node.js runtime with desktop APIs
- quickjs - Lightweight JavaScript engine (default)

Desktop APIs Available:
// Element discovery
const elements = await desktop.locator('role:Button && name:Submit').all();
const element = await desktop.locator('#button-id').first();
// Element interaction
await element.click();
await element.type('Hello World');
await element.setToggled(true);
// Property access
const name = await element.name();
const bounds = await element.bounds();
const isEnabled = await element.enabled();
// Utilities
log('Debug message'); // Logging
await sleep(1000); // Delay in milliseconds
Example Use Cases:
# Conditional logic based on UI state
- tool_name: run_command
  arguments:
    engine: "node"
    script: |
      const submitButton = await desktop.locator('role:Button && name:Submit').first();
      const isEnabled = await submitButton.enabled();
      if (isEnabled) {
        await submitButton.click();
        return { action: 'submitted' };
      } else {
        log('Submit button is disabled, checking form validation...');
        return { action: 'validation_needed' };
      }
# Bulk operations on multiple elements
- tool_name: run_command
  arguments:
    engine: "javascript"
    script: |
      const checkboxes = await desktop.locator('role:checkbox').all();
      let enabledCount = 0;
      for (const checkbox of checkboxes) {
        await checkbox.setToggled(true);
        enabledCount++;
        await sleep(50); // Small delay between operations
      }
      return { total_enabled: enabledCount };
# Dynamic element discovery and interaction
- tool_name: run_command
  arguments:
    engine: "javascript"
    script: |
      // Find all buttons containing specific text
      const buttons = await desktop.locator('role:button').all();
      const targets = [];
      for (const button of buttons) {
        const name = await button.name();
        if (name.toLowerCase().includes('download')) {
          targets.push(name);
          await button.click();
          await sleep(1000);
        }
      }
      return { downloaded_items: targets };
- RUST_LOG: Set logging level (e.g., debug, info, warn, error)
- MCP_SERVER_URL: Default MCP server URL
- MCP_SERVER_COMMAND: Default MCP server command

- The CLI uses npx -y computeruse-mcp-agent@latest by default
- Pass --verbose for detailed output

If you encounter "missing field" errors, ensure you're using the latest MCP server:
# Force latest version
computeruse mcp run workflow.yml --command "npx -y computeruse-mcp-agent@latest"
# Clear npm cache if needed
npm cache clean --force
# Test MCP server connectivity
computeruse mcp exec get_applications
# Use verbose logging for debugging
computeruse mcp run workflow.yml --verbose
# Test with dry run first
computeruse mcp run workflow.yml --dry-run
# Test JavaScript execution capability via run_command (engine mode)
computeruse mcp exec run_command '{"engine": "javascript", "run": "return {test: true};"}'
# Use node engine for full APIs
computeruse mcp exec run_command '{"engine": "node", "run": "const elements = await desktop.locator(\"role:button\").all(); return {count: elements.length};"}'
# Run Python with computeruse.py
computeruse mcp exec run_command '{"engine": "python", "run": "return {\"ok\": True}"}'
# Debug JavaScript errors with verbose logging
computeruse mcp run workflow.yml --verbose
For more examples and advanced usage, see the ComputerUse MCP Agent documentation.