docs/superpowers/plans/2026-05-28-computer-use-built-in.md
For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (
- [ ]) syntax for tracking.
Goal: Make open-computer-use a zero-config built-in capability in qwen-code. 9 computer-use tools appear in the deferred tool list as computer_use__click, computer_use__type_text, etc. First invocation transparently installs the upstream npm binary, walks the user through macOS Accessibility / Screen Recording permissions if needed, and forwards the call to the upstream MCP server.
Architecture: Thin shell over upstream npx -y open-computer-use mcp. We do NOT bundle the binary; upstream's npx cache + .app bundle handles distribution and macOS TCC. 9 tools are registered as parameterized ComputerUseTool instances (one per tool name) backed by a singleton ComputerUseClient that owns a long-running MCP stdio child process. Bootstrap state machine layers on top: standard qwen-code tool permission (existing) → first-time install confirm → optional macOS permission guide.
Tech Stack: TypeScript, vitest, @modelcontextprotocol/sdk (already a qwen-code dep), node:child_process, node:fs/promises.
New files:
packages/core/src/tools/computer-use/
index.ts # registerComputerUseTools(registry, config); barrel export
schemas.ts # hardcoded 9 schemas + descriptions (synced from upstream)
tool.ts # ComputerUseTool — parameterized BaseDeclarativeTool
client.ts # ComputerUseClient — singleton MCP stdio process manager
bootstrap.ts # state machine: probe → install confirm → install → perm guide
install-state.ts # ~/.qwen/computer-use/installed.json read/write
permission-detector.ts # parse upstream error strings to detect missing perms
schemas.test.ts # all 9 schemas parse, names match contract
tool.test.ts # parameterized tool wiring
client.test.ts # client lifecycle (mocked spawn)
bootstrap.test.ts # state machine transitions
install-state.test.ts # state file round-trip
permission-detector.test.ts # error pattern matching
scripts/
sync-computer-use-schemas.ts # release-time script: dump upstream tools/list → schemas.ts
Modified files:
packages/core/src/tools/tool-names.ts # add 9 COMPUTER_USE_* constants
packages/core/src/config/config.ts # add computerUseEnabled field + isComputerUseEnabled() + register call in createToolRegistry()
packages/cli/src/config/config.ts # map settings.tools.computerUse.enabled → ConfigParameters.computerUseEnabled
packages/cli/src/config/settingsSchema.ts # add tools.computerUse.enabled boolean (default true)
Decomposition rationale: Each file has one responsibility. client.ts knows MCP protocol but not UX; bootstrap.ts knows UX but doesn't touch MCP details; tool.ts is pure plumbing that wires them via execute(). Tests live next to code. Schemas are isolated so the sync script can rewrite the file without churning logic.
Files:
Modify: packages/core/src/tools/tool-names.ts
Step 1: Add the 9 name constants
Edit packages/core/src/tools/tool-names.ts — inside the ToolNames object, after EXIT_WORKTREE: 'exit_worktree',:
// Computer Use tools — built-in but backed by an upstream MCP server.
// All deferred; revealed only when the user-initiated request triggers
// a computer-use action. See packages/core/src/tools/computer-use/.
COMPUTER_USE_LIST_APPS: 'computer_use__list_apps',
COMPUTER_USE_GET_APP_STATE: 'computer_use__get_app_state',
COMPUTER_USE_CLICK: 'computer_use__click',
COMPUTER_USE_PERFORM_SECONDARY_ACTION: 'computer_use__perform_secondary_action',
COMPUTER_USE_SCROLL: 'computer_use__scroll',
COMPUTER_USE_DRAG: 'computer_use__drag',
COMPUTER_USE_TYPE_TEXT: 'computer_use__type_text',
COMPUTER_USE_PRESS_KEY: 'computer_use__press_key',
COMPUTER_USE_SET_VALUE: 'computer_use__set_value',
Mirror in ToolDisplayNames:
COMPUTER_USE_LIST_APPS: 'computer_use__list_apps',
COMPUTER_USE_GET_APP_STATE: 'computer_use__get_app_state',
COMPUTER_USE_CLICK: 'computer_use__click',
COMPUTER_USE_PERFORM_SECONDARY_ACTION: 'computer_use__perform_secondary_action',
COMPUTER_USE_SCROLL: 'computer_use__scroll',
COMPUTER_USE_DRAG: 'computer_use__drag',
COMPUTER_USE_TYPE_TEXT: 'computer_use__type_text',
COMPUTER_USE_PRESS_KEY: 'computer_use__press_key',
COMPUTER_USE_SET_VALUE: 'computer_use__set_value',
(displayName == name on purpose; we don't want capitalized display names like Click showing in the permission dialog when the tool name is computer_use__click.)
Run: npm test -- packages/core/src/tools/tool-names
Expected: PASS (if there's no test file, run npm run build -- --filter @qwen-code/qwen-code-core to typecheck)
git add packages/core/src/tools/tool-names.ts
git commit -m "feat(computer-use): add tool name constants"
Files:
packages/core/src/tools/computer-use/schemas.tspackages/core/src/tools/computer-use/schemas.test.tsThe 9 schemas mirror upstream open-computer-use mcp tools/list output. These are pinned to upstream version ^0.x.y (TODO: fill in the actual pin at the top of schemas.ts when implementing — run npx -y open-computer-use@latest --version to capture the current latest).
Create packages/core/src/tools/computer-use/schemas.test.ts:
import { describe, it, expect } from 'vitest';
import { COMPUTER_USE_SCHEMAS, COMPUTER_USE_TOOL_NAMES } from './schemas.js';
describe('computer-use schemas', () => {
it('exports exactly 9 schemas', () => {
expect(Object.keys(COMPUTER_USE_SCHEMAS)).toHaveLength(9);
});
it('each tool name matches the upstream convention (no computer_use__ prefix)', () => {
// schemas.ts uses upstream names verbatim ("click", "type_text").
// The computer_use__ prefix lives on the qwen-code-facing wrapper.
for (const name of COMPUTER_USE_TOOL_NAMES) {
expect(name).not.toContain('computer_use__');
expect(name).toMatch(/^[a-z_]+$/);
}
});
it('every schema has the standard object structure', () => {
for (const [name, schema] of Object.entries(COMPUTER_USE_SCHEMAS)) {
expect(schema.description, `${name} missing description`).toBeTruthy();
expect(
schema.parameterSchema,
`${name} missing parameterSchema`,
).toBeTruthy();
expect((schema.parameterSchema as { type: string }).type).toBe('object');
}
});
it('list_apps takes no parameters', () => {
expect(COMPUTER_USE_SCHEMAS.list_apps.parameterSchema).toEqual({
type: 'object',
properties: {},
additionalProperties: false,
});
});
it('click requires app and either element_index or x/y', () => {
const schema = COMPUTER_USE_SCHEMAS.click.parameterSchema as {
properties: Record<string, unknown>;
required: string[];
};
expect(schema.properties).toHaveProperty('app');
expect(schema.properties).toHaveProperty('element_index');
expect(schema.properties).toHaveProperty('x');
expect(schema.properties).toHaveProperty('y');
expect(schema.required).toContain('app');
});
});
Run: npm test -- packages/core/src/tools/computer-use/schemas.test.ts
Expected: FAIL with "Cannot find module './schemas.js'"
Create packages/core/src/tools/computer-use/schemas.ts. The schemas below are MVP — they reflect upstream's tool surface and parameter naming. The sync-computer-use-schemas.ts script (Task 13) will regenerate this file from a live upstream snapshot in CI before each qwen-code release.
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
/**
* Hardcoded schemas for the 9 upstream open-computer-use tools.
*
* Pinned to upstream version: <PIN_VERSION_DURING_IMPL>
*
* Regenerated by `scripts/sync-computer-use-schemas.ts` — do not hand-edit.
* The upstream tool names ("click", "type_text") appear verbatim here;
* the `computer_use__` prefix is added by the qwen-code-facing wrapper in
* `tool.ts` so the model sees `computer_use__click` without any MCP
* concept leaking through.
*/
export interface ComputerUseToolSchema {
description: string;
parameterSchema: Record<string, unknown>;
}
export const COMPUTER_USE_TOOL_NAMES = [
'list_apps',
'get_app_state',
'click',
'perform_secondary_action',
'scroll',
'drag',
'type_text',
'press_key',
'set_value',
] as const;
export type ComputerUseToolName = (typeof COMPUTER_USE_TOOL_NAMES)[number];
export const COMPUTER_USE_SCHEMAS: Record<
ComputerUseToolName,
ComputerUseToolSchema
> = {
list_apps: {
description:
'List running and recently-used desktop applications on the current machine. Returns each app with a bundle identifier and display name. Use this before get_app_state to discover what is available to interact with.',
parameterSchema: {
type: 'object',
properties: {},
additionalProperties: false,
},
},
get_app_state: {
description:
'Capture the current accessibility tree and a screenshot of the given application. Returns element_index values that subsequent actions (click, set_value, etc.) can target. Always call this before any element-targeted action; element_index values are valid only within the current snapshot.',
parameterSchema: {
type: 'object',
properties: {
app: {
type: 'string',
description:
'Application bundle identifier or display name (e.g. "TextEdit", "com.apple.Safari").',
},
},
required: ['app'],
additionalProperties: false,
},
},
click: {
description:
'Left-click a target. Prefer element_index from a recent get_app_state result. Fall back to x/y screenshot pixel coordinates only when no AX element matches the target.',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string', description: 'Target application.' },
element_index: {
type: 'integer',
description: 'Index into the latest get_app_state element list.',
},
x: {
type: 'integer',
description: 'X coordinate in screenshot pixels.',
},
y: {
type: 'integer',
description: 'Y coordinate in screenshot pixels.',
},
click_count: {
type: 'integer',
description: 'Number of clicks (1 = single, 2 = double).',
default: 1,
},
},
required: ['app'],
additionalProperties: false,
},
},
perform_secondary_action: {
description:
'Perform a non-click semantic action exposed by the target AX element (e.g. "Raise", "ShowMenu"). Returns an error if the action is not valid for the element.',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string' },
element_index: { type: 'integer' },
action: {
type: 'string',
description: 'AX action name to perform.',
},
},
required: ['app', 'element_index', 'action'],
additionalProperties: false,
},
},
scroll: {
description:
'Scroll inside the target element or at the given coordinates. `pages` is a fractional page count (positive = down, negative = up).',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string' },
element_index: { type: 'integer' },
x: { type: 'integer' },
y: { type: 'integer' },
pages: {
type: 'number',
description: 'Fractional page count to scroll (negative = up).',
},
},
required: ['app', 'pages'],
additionalProperties: false,
},
},
drag: {
description:
'Drag from one coordinate pair to another inside the target application window. Coordinates are in screenshot pixels.',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string' },
from_x: { type: 'integer' },
from_y: { type: 'integer' },
to_x: { type: 'integer' },
to_y: { type: 'integer' },
},
required: ['app', 'from_x', 'from_y', 'to_x', 'to_y'],
additionalProperties: false,
},
},
type_text: {
description:
'Type text into the currently-focused text input of the target application. Click the input area first if it is not focused. For unfocused text fields, prefer set_value instead.',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string' },
text: {
type: 'string',
description: 'Text to type. Supports Unicode.',
},
},
required: ['app', 'text'],
additionalProperties: false,
},
},
press_key: {
description:
'Press a keyboard key or combo against the target application. Key names follow xdotool conventions (e.g. "Return", "BackSpace", "cmd+c", "Page_Up").',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string' },
key: { type: 'string' },
},
required: ['app', 'key'],
additionalProperties: false,
},
},
set_value: {
description:
'Directly set the value of a settable AX element (text fields, sliders, etc.). Returns an error if the target is not settable.',
parameterSchema: {
type: 'object',
properties: {
app: { type: 'string' },
element_index: { type: 'integer' },
value: { type: 'string' },
},
required: ['app', 'element_index', 'value'],
additionalProperties: false,
},
},
};
Run: npm test -- packages/core/src/tools/computer-use/schemas.test.ts
Expected: PASS, 5 tests
git add packages/core/src/tools/computer-use/schemas.ts packages/core/src/tools/computer-use/schemas.test.ts
git commit -m "feat(computer-use): hardcode upstream tool schemas"
Files:
Modify: packages/cli/src/config/settingsSchema.ts
Modify: packages/cli/src/config/config.ts
Modify: packages/core/src/config/config.ts
Step 1: Add settings entry
Edit packages/cli/src/config/settingsSchema.ts. The existing schema groups things by category. Computer Use is a tool capability, not experimental — add a new tools subgroup IF it doesn't exist, or add to the existing one. Use grep:
grep -n "tools:" packages/cli/src/config/settingsSchema.ts | head -5
If a tools: key exists, add a new property under it. If not, add a top-level group. Pattern (add near where the experimental.cron entry lives, line ~2298):
tools: {
type: 'object',
label: 'Tools',
category: 'Tools',
requiresRestart: true,
default: {},
description: 'Tool capability toggles.',
showInDialog: false,
properties: {
computerUse: {
type: 'object',
label: 'Computer Use',
category: 'Tools',
requiresRestart: true,
default: {},
description: 'Cross-platform desktop automation via the upstream open-computer-use MCP server. Tools: list_apps, get_app_state, click, type_text, scroll, drag, press_key, perform_secondary_action, set_value. On first invocation, the upstream binary is fetched via npx and the user is walked through macOS Accessibility / Screen Recording permissions if needed.',
showInDialog: false,
properties: {
enabled: {
type: 'boolean',
label: 'Enable Computer Use',
category: 'Tools',
requiresRestart: true,
default: true,
description: 'When enabled (default), the 9 computer_use__* tools are registered as deferred built-ins.',
showInDialog: true,
},
},
},
},
},
If a tools: group already exists, just add the computerUse: property under its properties.
Edit packages/cli/src/config/config.ts. Find the existing line cronEnabled: settings.experimental?.cron ?? false, (around line 1833). Add directly below:
computerUseEnabled: settings.tools?.computerUse?.enabled ?? true,
Edit packages/core/src/config/config.ts:
(a) In ConfigParameters interface (search for cronEnabled?: boolean;), add directly below:
computerUseEnabled?: boolean;
(b) In the Config class fields (search for private readonly cronEnabled: boolean = false;), add directly below:
private readonly computerUseEnabled: boolean = true;
(c) In the Config constructor (search for this.cronEnabled = params.cronEnabled ?? false;), add directly below:
this.computerUseEnabled = params.computerUseEnabled ?? true;
(d) Near isCronEnabled() (search for isCronEnabled(): boolean {), add a sibling getter:
isComputerUseEnabled(): boolean {
return this.computerUseEnabled;
}
Run: npm run build -- --filter @qwen-code/qwen-code-core --filter @qwen-code/qwen-code
Expected: PASS
git add packages/cli/src/config/settingsSchema.ts packages/cli/src/config/config.ts packages/core/src/config/config.ts
git commit -m "feat(computer-use): add enableComputerUse setting (default true)"
Files:
packages/core/src/tools/computer-use/client.tspackages/core/src/tools/computer-use/client.test.tsNote: The client uses @modelcontextprotocol/sdk (already a dep, see packages/core/src/tools/mcp-client.ts). We use StdioClientTransport to spawn npx -y open-computer-use mcp.
Create packages/core/src/tools/computer-use/client.test.ts:
import { describe, it, expect, vi, beforeEach } from 'vitest';
import { ComputerUseClient } from './client.js';
describe('ComputerUseClient', () => {
let client: ComputerUseClient;
beforeEach(() => {
client = new ComputerUseClient({
packageSpec: 'open-computer-use@latest',
onProgress: vi.fn(),
});
});
it('is constructible', () => {
expect(client).toBeDefined();
});
it('reports not-started before start() is called', () => {
expect(client.isStarted()).toBe(false);
});
it('returns the same instance for repeated callers via singleton', () => {
const a = ComputerUseClient.shared();
const b = ComputerUseClient.shared();
expect(a).toBe(b);
});
});
Run: npm test -- packages/core/src/tools/computer-use/client.test.ts
Expected: FAIL — module not found
Create packages/core/src/tools/computer-use/client.ts:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import type {
CallToolResult,
ListToolsResult,
} from '@modelcontextprotocol/sdk/types.js';
/**
* Singleton stdio MCP client for the upstream open-computer-use binary.
*
* Spawned via `npx -y <packageSpec> mcp`. First spawn pays the npx
* download cost (up to ~60s for a fresh cache); subsequent spawns reuse
* the npx cache and are sub-second.
*
* Lifecycle: lazy spawn on first `callTool` invocation. The process
* stays alive until `stop()` or qwen-code exits. State (element_index
* map per app) lives in the process — if the process restarts, the
* model must call `get_app_state` again before any element-targeted
* action.
*/
export interface ComputerUseClientOptions {
/** npm package spec to npx. Example: "open-computer-use@^0.3.0". */
packageSpec: string;
/** Streaming hook for progress messages during slow operations. */
onProgress?: (message: string) => void;
}
export class ComputerUseClient {
private static singleton: ComputerUseClient | undefined;
private readonly packageSpec: string;
private readonly onProgress: (message: string) => void;
private client: Client | undefined;
private transport: StdioClientTransport | undefined;
private startPromise: Promise<void> | undefined;
constructor(options: ComputerUseClientOptions) {
this.packageSpec = options.packageSpec;
this.onProgress = options.onProgress ?? (() => {});
}
/**
* Shared singleton instance, created with default options on first
* access. Tests can replace it via `setSharedForTest()`.
*/
static shared(): ComputerUseClient {
if (!ComputerUseClient.singleton) {
ComputerUseClient.singleton = new ComputerUseClient({
packageSpec:
process.env['QWEN_COMPUTER_USE_PACKAGE'] ??
'open-computer-use@latest',
});
}
return ComputerUseClient.singleton;
}
/** Test-only: replace the singleton. */
static setSharedForTest(replacement: ComputerUseClient | undefined): void {
ComputerUseClient.singleton = replacement;
}
isStarted(): boolean {
return this.client !== undefined;
}
/**
* Start the upstream MCP server. Idempotent: concurrent callers share
* the same in-flight start promise.
*
* Throws on spawn failure (network down, npx missing, etc.). The
* caller (bootstrap state machine) is responsible for mapping the
* throw into user-facing UX.
*/
async start(): Promise<void> {
if (this.client) return;
if (this.startPromise) return this.startPromise;
this.startPromise = this.doStart().finally(() => {
this.startPromise = undefined;
});
return this.startPromise;
}
private async doStart(): Promise<void> {
this.onProgress('Starting Computer Use...');
// After ~3s, surface a hint that the slow path is download.
const downloadHintTimer = setTimeout(() => {
this.onProgress(
'Downloading Computer Use binary (this can take ~60s on first use)...',
);
}, 3000);
try {
const transport = new StdioClientTransport({
command: 'npx',
args: ['-y', this.packageSpec, 'mcp'],
// Inherit env so HTTPS_PROXY etc. flow through to npx
env: { ...process.env } as Record<string, string>,
});
const client = new Client(
{ name: 'qwen-code-computer-use', version: '1.0.0' },
{ capabilities: {} },
);
await client.connect(transport);
this.transport = transport;
this.client = client;
} finally {
clearTimeout(downloadHintTimer);
}
}
/**
* List the tools exposed by the upstream server. Used by the schema
* sync script and bootstrap diagnostics.
*/
async listTools(): Promise<ListToolsResult> {
if (!this.client) throw new Error('ComputerUseClient not started');
return this.client.listTools();
}
/**
* Call a tool by upstream name (NOT the qwen-code-facing
* `computer_use__` prefixed name). Returns the raw MCP result so the
* caller can inspect `isError` and parse text content.
*/
async callTool(
name: string,
args: Record<string, unknown>,
): Promise<CallToolResult> {
if (!this.client) throw new Error('ComputerUseClient not started');
return this.client.callTool({
name,
arguments: args,
}) as Promise<CallToolResult>;
}
/** Tear down the child process. Safe to call multiple times. */
async stop(): Promise<void> {
const client = this.client;
this.client = undefined;
this.transport = undefined;
if (client) {
try {
await client.close();
} catch {
// best-effort cleanup
}
}
}
}
Run: npm test -- packages/core/src/tools/computer-use/client.test.ts
Expected: PASS, 3 tests
git add packages/core/src/tools/computer-use/client.ts packages/core/src/tools/computer-use/client.test.ts
git commit -m "feat(computer-use): MCP stdio client for upstream binary"
Files:
packages/core/src/tools/computer-use/tool.tspackages/core/src/tools/computer-use/tool.test.tsFor this task, the tool just forwards to ComputerUseClient assuming it's already started. The bootstrap state machine wraps this in Phase 3.
Create packages/core/src/tools/computer-use/tool.test.ts:
import { describe, it, expect, beforeEach, vi } from 'vitest';
import { ComputerUseTool } from './tool.js';
import { ComputerUseClient } from './client.js';
import { COMPUTER_USE_SCHEMAS } from './schemas.js';
function makeFakeClient(
callToolImpl: (name: string, args: unknown) => Promise<unknown>,
) {
const fake = {
isStarted: () => true,
start: vi.fn(async () => {}),
callTool: vi.fn(callToolImpl),
stop: vi.fn(async () => {}),
};
return fake as unknown as ComputerUseClient;
}
describe('ComputerUseTool', () => {
beforeEach(() => {
ComputerUseClient.setSharedForTest(undefined);
});
it('exposes qwen-facing name with computer_use__ prefix', () => {
const tool = new ComputerUseTool('click', COMPUTER_USE_SCHEMAS.click);
expect(tool.name).toBe('computer_use__click');
expect(tool.displayName).toBe('computer_use__click');
});
it('marks itself as deferred', () => {
const tool = new ComputerUseTool(
'list_apps',
COMPUTER_USE_SCHEMAS.list_apps,
);
expect(tool.shouldDefer).toBe(true);
expect(tool.alwaysLoad).toBe(false);
});
it('forwards execute() to the shared client with the upstream name', async () => {
const fake = makeFakeClient(async () => ({
content: [{ type: 'text', text: '[]' }],
isError: false,
}));
ComputerUseClient.setSharedForTest(fake);
const tool = new ComputerUseTool(
'list_apps',
COMPUTER_USE_SCHEMAS.list_apps,
);
const invocation = tool.build({});
const result = await invocation.execute(new AbortController().signal);
expect(result.error).toBeUndefined();
expect(fake.callTool).toHaveBeenCalledWith('list_apps', {});
});
it('returns an error result when client returns isError=true', async () => {
const fake = makeFakeClient(async () => ({
content: [{ type: 'text', text: 'something went wrong' }],
isError: true,
}));
ComputerUseClient.setSharedForTest(fake);
const tool = new ComputerUseTool('click', COMPUTER_USE_SCHEMAS.click);
const invocation = tool.build({ app: 'TextEdit' });
const result = await invocation.execute(new AbortController().signal);
expect(result.error).toBeDefined();
expect(String(result.llmContent)).toContain('something went wrong');
});
});
Run: npm test -- packages/core/src/tools/computer-use/tool.test.ts
Expected: FAIL — module not found
Create packages/core/src/tools/computer-use/tool.ts:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import {
BaseDeclarativeTool,
BaseToolInvocation,
Kind,
type ToolInvocation,
type ToolResult,
} from '../tools.js';
import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js';
import { ComputerUseClient } from './client.js';
import type { ComputerUseToolName, ComputerUseToolSchema } from './schemas.js';
import { safeJsonStringify } from '../../utils/safeJsonStringify.js';
import { runBootstrap } from './bootstrap.js';
type ComputerUseParams = Record<string, unknown>;
class ComputerUseInvocation extends BaseToolInvocation<
ComputerUseParams,
ToolResult
> {
constructor(
private readonly upstreamName: ComputerUseToolName,
params: ComputerUseParams,
) {
super(params);
}
getDescription(): string {
return safeJsonStringify(this.params);
}
async execute(
signal: AbortSignal,
updateOutput?: (output: string) => void,
): Promise<ToolResult> {
const client = ComputerUseClient.shared();
// Phase 3 wires the bootstrap state machine here. Until then, this
// shells out directly which is fine when the binary is already
// installed and permissions granted.
await runBootstrap(client, { signal, updateOutput });
let mcpResult: CallToolResult;
try {
mcpResult = await client.callTool(this.upstreamName, this.params);
} catch (err) {
const message = err instanceof Error ? err.message : String(err);
return {
llmContent: `Computer Use tool '${this.upstreamName}' failed: ${message}`,
returnDisplay: `Error: ${message}`,
error: { message },
};
}
const text = mcpResult.content
.map((part) => (part.type === 'text' ? part.text : ''))
.filter(Boolean)
.join('\n');
if (mcpResult.isError) {
return {
llmContent: text || `Tool '${this.upstreamName}' returned isError=true`,
returnDisplay: text || 'Error',
error: { message: text || 'tool returned error' },
};
}
return {
llmContent: text,
returnDisplay: text,
};
}
}
export class ComputerUseTool extends BaseDeclarativeTool<
ComputerUseParams,
ToolResult
> {
constructor(
private readonly upstreamName: ComputerUseToolName,
schema: ComputerUseToolSchema,
) {
const qwenName = `computer_use__${upstreamName}`;
super(
qwenName,
qwenName, // displayName == name; no MCP branding in UI
schema.description,
Kind.Other,
schema.parameterSchema,
true, // isOutputMarkdown — many results are JSON-ish text or screenshots
true, // canUpdateOutput — bootstrap streams progress
true, // shouldDefer — surface only via ToolSearch
false, // alwaysLoad
`computer use desktop click type screenshot mouse keyboard scroll drag automation gui app native`,
);
}
protected createInvocation(
params: ComputerUseParams,
): ToolInvocation<ComputerUseParams, ToolResult> {
return new ComputerUseInvocation(this.upstreamName, params);
}
}
Note: the test references runBootstrap which is implemented in Phase 3. For now, create a stub bootstrap.ts so the test passes:
Create packages/core/src/tools/computer-use/bootstrap.ts:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import type { ComputerUseClient } from './client.js';
export interface BootstrapContext {
signal: AbortSignal;
updateOutput?: (output: string) => void;
}
/**
* STUB: Phase 3 replaces this with the full state machine
* (install confirm → install → permission probe → guide → poll).
* For now: assumes binary is installed and permissions granted;
* just starts the client if needed.
*/
export async function runBootstrap(
client: ComputerUseClient,
_ctx: BootstrapContext,
): Promise<void> {
if (!client.isStarted()) {
await client.start();
}
}
Run: npm test -- packages/core/src/tools/computer-use/tool.test.ts
Expected: PASS, 4 tests
git add packages/core/src/tools/computer-use/tool.ts packages/core/src/tools/computer-use/tool.test.ts packages/core/src/tools/computer-use/bootstrap.ts
git commit -m "feat(computer-use): ComputerUseTool wrapper + bootstrap stub"
Files:
Create: packages/core/src/tools/computer-use/index.ts
Modify: packages/core/src/config/config.ts
Step 1: Create the registration helper
Create packages/core/src/tools/computer-use/index.ts:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
export { ComputerUseTool } from './tool.js';
export { ComputerUseClient } from './client.js';
export type { ComputerUseToolName, ComputerUseToolSchema } from './schemas.js';
export { COMPUTER_USE_TOOL_NAMES, COMPUTER_USE_SCHEMAS } from './schemas.js';
import { ComputerUseTool } from './tool.js';
import { COMPUTER_USE_SCHEMAS, COMPUTER_USE_TOOL_NAMES } from './schemas.js';
import type { ToolRegistry } from '../tool-registry.js';
/**
* Register all 9 computer-use tools as lazy factories on the registry.
* Each tool is deferred (`shouldDefer=true`), so they surface only via
* ToolSearch keyword match. The first invocation triggers the
* bootstrap state machine (install confirm → install → permission flow)
* before forwarding to the upstream MCP server.
*
* Should only be called when `Config.isComputerUseEnabled()` is true.
*/
export function registerComputerUseTools(registry: ToolRegistry): void {
for (const upstreamName of COMPUTER_USE_TOOL_NAMES) {
const schema = COMPUTER_USE_SCHEMAS[upstreamName];
const qwenName = `computer_use__${upstreamName}`;
registry.registerFactory(
qwenName,
async () => new ComputerUseTool(upstreamName, schema),
);
}
}
Edit packages/core/src/config/config.ts. Find the existing block that registers cron tools conditionally (around line 3952):
if (this.isCronEnabled()) {
await registerLazy(ToolNames.CRON_CREATE, async () => { ... });
...
}
Directly below the cron block (and before the monitor block), add:
// Register computer-use tools unless disabled.
// All 9 are deferred — they surface only via ToolSearch keyword
// match (see packages/core/src/tools/computer-use/).
if (this.isComputerUseEnabled()) {
const { registerComputerUseTools } = await import(
'../tools/computer-use/index.js'
);
registerComputerUseTools(registry);
}
Append to the existing tool-registry tests OR create packages/core/src/tools/computer-use/registration.test.ts:
import { describe, it, expect, vi } from 'vitest';
import { registerComputerUseTools } from './index.js';
import { COMPUTER_USE_TOOL_NAMES } from './schemas.js';
describe('registerComputerUseTools', () => {
it('registers a factory for each of the 9 upstream tools, prefixed with computer_use__', () => {
const registered = new Set<string>();
const fakeRegistry = {
registerFactory: vi.fn((name: string) => {
registered.add(name);
}),
} as never;
registerComputerUseTools(fakeRegistry);
expect(registered.size).toBe(9);
for (const name of COMPUTER_USE_TOOL_NAMES) {
expect(registered.has(`computer_use__${name}`)).toBe(true);
}
});
});
Run:
npm test -- packages/core/src/tools/computer-use/
npm run build -- --filter @qwen-code/qwen-code-core
Expected: All PASS.
git add packages/core/src/tools/computer-use/index.ts packages/core/src/tools/computer-use/registration.test.ts packages/core/src/config/config.ts
git commit -m "feat(computer-use): register 9 deferred tools when enabled"
This is a non-coding gate. Verifies the foundation works before piling on the bootstrap UX.
Run in a terminal:
npx -y open-computer-use@latest --version
On macOS: also run npx -y open-computer-use@latest doctor and grant any prompted permissions. This bypasses our bootstrap so we can verify the transport layer in isolation.
Run: npm run build
Expected: PASS.
Start qwen-code, then ask the model: "Use the ToolSearch tool with query 'click computer use' to find any desktop automation tools available."
Expected: ToolSearch returns 9 computer_use__* schemas.
Ask: "List the desktop apps currently running using the computer_use__list_apps tool."
Expected: First call has a few seconds of "Starting Computer Use..." (or longer if npx cache is cold), then returns a list of running apps. Subsequent calls in the same session are fast.
If anything fails here, STOP and debug before moving to Phase 3.
This phase replaces the runBootstrap stub from Task 5 with the full state machine.
Files:
packages/core/src/tools/computer-use/install-state.tspackages/core/src/tools/computer-use/install-state.test.tsPersisted at ~/.qwen/computer-use/installed.json:
{
"approvedPackageSpec": "open-computer-use@^0.3.0",
"approvedAtIso": "2026-05-28T10:00:00Z"
}
Create packages/core/src/tools/computer-use/install-state.test.ts:
import { describe, it, expect, beforeEach, afterEach } from 'vitest';
import {
loadInstallState,
saveInstallState,
isPackageSpecApproved,
installStatePathFor,
} from './install-state.js';
import { mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
describe('install-state', () => {
let tmpHome: string;
beforeEach(() => {
tmpHome = mkdtempSync(join(tmpdir(), 'qwen-cu-test-'));
});
afterEach(() => {
rmSync(tmpHome, { recursive: true, force: true });
});
it('returns undefined when no state file exists', async () => {
expect(await loadInstallState(tmpHome)).toBeUndefined();
});
it('round-trips state', async () => {
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
const loaded = await loadInstallState(tmpHome);
expect(loaded).toEqual({
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
});
it('isPackageSpecApproved returns false when no state', async () => {
expect(
await isPackageSpecApproved(tmpHome, 'open-computer-use@^0.3.0'),
).toBe(false);
});
it('isPackageSpecApproved returns true on exact match', async () => {
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
expect(
await isPackageSpecApproved(tmpHome, 'open-computer-use@^0.3.0'),
).toBe(true);
});
it('isPackageSpecApproved returns false when version differs', async () => {
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
expect(
await isPackageSpecApproved(tmpHome, 'open-computer-use@^0.4.0'),
).toBe(false);
});
});
Run: npm test -- packages/core/src/tools/computer-use/install-state.test.ts
Expected: FAIL — module not found
Create packages/core/src/tools/computer-use/install-state.ts:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import { readFile, writeFile, mkdir } from 'node:fs/promises';
import { homedir } from 'node:os';
import { join, dirname } from 'node:path';
export interface InstallState {
/** The package spec the user approved (e.g. "open-computer-use@^0.3.0"). */
approvedPackageSpec: string;
/** ISO 8601 UTC timestamp of approval. */
approvedAtIso: string;
}
/**
* Path to the install-state file. Exported for tests so they can
* point at a temp directory.
*/
export function installStatePathFor(home: string = homedir()): string {
return join(home, '.qwen', 'computer-use', 'installed.json');
}
export async function loadInstallState(
home: string = homedir(),
): Promise<InstallState | undefined> {
try {
const text = await readFile(installStatePathFor(home), 'utf8');
const parsed = JSON.parse(text) as InstallState;
// Minimal shape check — older or malformed files act as "not approved".
if (typeof parsed?.approvedPackageSpec !== 'string') return undefined;
if (typeof parsed?.approvedAtIso !== 'string') return undefined;
return parsed;
} catch (err) {
if ((err as NodeJS.ErrnoException)?.code === 'ENOENT') return undefined;
// Treat unreadable / malformed state as "not approved" — re-prompt
// is safe; treating a bad file as approved would silently install.
return undefined;
}
}
export async function saveInstallState(
home: string = homedir(),
state: InstallState,
): Promise<void> {
const path = installStatePathFor(home);
await mkdir(dirname(path), { recursive: true });
await writeFile(path, JSON.stringify(state, null, 2), 'utf8');
}
/**
* True iff the persisted state's package spec exactly matches the one
* we're about to install. Different specs (version pin bumps) require
* re-approval, since the user may have approved an older / smaller /
* different-license version.
*/
export async function isPackageSpecApproved(
home: string = homedir(),
packageSpec: string,
): Promise<boolean> {
const state = await loadInstallState(home);
return state?.approvedPackageSpec === packageSpec;
}
Run: npm test -- packages/core/src/tools/computer-use/install-state.test.ts
Expected: PASS, 5 tests
git add packages/core/src/tools/computer-use/install-state.ts packages/core/src/tools/computer-use/install-state.test.ts
git commit -m "feat(computer-use): persist install approval state under ~/.qwen"
Files:
Create: packages/core/src/tools/computer-use/permission-detector.ts
Create: packages/core/src/tools/computer-use/permission-detector.test.ts
Step 1: Write the failing test
Create packages/core/src/tools/computer-use/permission-detector.test.ts:
import { describe, it, expect } from 'vitest';
import { detectPermissionError } from './permission-detector.js';
import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js';
function textErrorResult(text: string): CallToolResult {
return {
content: [{ type: 'text', text }],
isError: true,
};
}
describe('detectPermissionError', () => {
it('returns "none" when isError is false', () => {
expect(
detectPermissionError({
content: [{ type: 'text', text: 'ok' }],
isError: false,
}),
).toBe('none');
});
it('detects accessibility permission missing (upstream phrasing)', () => {
// From AccessibilitySnapshot.swift:104
const result = textErrorResult(
'Accessibility permission is required. Run `open-computer-use doctor` and grant access to Open Computer Use.',
);
expect(detectPermissionError(result)).toBe('accessibility');
});
it('detects screen recording permission missing', () => {
const result = textErrorResult(
'Screen Recording permission is required to capture this window.',
);
expect(detectPermissionError(result)).toBe('screenRecording');
});
it('detects via the generic doctor marker as fallback', () => {
const result = textErrorResult(
'Some unfamiliar error. Run `open-computer-use doctor` for help.',
);
expect(detectPermissionError(result)).toBe('unknown_permission');
});
it('returns "other" for unrelated errors', () => {
expect(
detectPermissionError(textErrorResult('appNotFound("ImaginaryApp")')),
).toBe('other');
});
});
Run: npm test -- packages/core/src/tools/computer-use/permission-detector.test.ts
Expected: FAIL — module not found
Create packages/core/src/tools/computer-use/permission-detector.ts:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
import type { CallToolResult } from '@modelcontextprotocol/sdk/types.js';
/**
* What kind of permission issue, if any, the upstream MCP result
* indicates. We classify based on message strings because upstream
* doesn't expose typed error codes through MCP (see
* `packages/OpenComputerUseKit/Sources/OpenComputerUseKit/Errors.swift`
* in the open-codex-computer-use repo).
*
* Long-term fix is to PR upstream for a typed errorKind; for now this
* string detection is the contract.
*/
export type PermissionErrorKind =
| 'none' // success, or non-error result
| 'other' // error, but not a permission issue
| 'accessibility' // AX missing
| 'screenRecording' // Screen Recording missing
| 'unknown_permission'; // matches the doctor marker but doesn't pinpoint which
/**
* Upstream-known error patterns. Order matters — more specific
* patterns first.
*/
const PATTERNS: Array<{ kind: PermissionErrorKind; regex: RegExp }> = [
{ kind: 'accessibility', regex: /accessibility permission is required/i },
{ kind: 'screenRecording', regex: /screen recording permission/i },
// Fallback: any error mentioning the doctor command is likely permission-related.
// Listed last so it doesn't preempt the specific patterns.
{ kind: 'unknown_permission', regex: /open-computer-use\s+doctor/i },
];
export function detectPermissionError(
result: CallToolResult,
): PermissionErrorKind {
if (!result.isError) return 'none';
const text = result.content
.map((part) => (part.type === 'text' ? part.text : ''))
.join('\n');
for (const { kind, regex } of PATTERNS) {
if (regex.test(text)) return kind;
}
return 'other';
}
Run: npm test -- packages/core/src/tools/computer-use/permission-detector.test.ts
Expected: PASS, 5 tests
git add packages/core/src/tools/computer-use/permission-detector.ts packages/core/src/tools/computer-use/permission-detector.test.ts
git commit -m "feat(computer-use): detect upstream permission errors"
Files:
packages/core/src/tools/computer-use/bootstrap.ts (replace stub from Task 5)packages/core/src/tools/computer-use/bootstrap.test.tsThe state machine has three sub-flows:
isPackageSpecApproved is false, prompt the user, install, persist approval.open-computer-use doctor, poll for grant up to 10 min, retry.Note: the actual "ask user a question mid-execution" mechanic in qwen-code uses the existing tool-confirmation framework. IMPLEMENTER: before writing this task's implementation, grep for shouldConfirmExecute in packages/core/src/tools/ to see how shell.ts / similar do confirmation. This task assumes that mechanic is available; if it isn't, swap in process.stderr.write + read from process.stdin for the install confirm (acceptable v0 UX).
Run:
grep -rn "shouldConfirmExecute\|ToolConfirmation" packages/core/src/tools --include="*.ts" | grep -v ".test." | head -20
Read at least one tool that uses the confirmation pattern (likely shell.ts). Decide: does ToolInvocation have a shouldConfirmExecute() method or similar?
If YES: use it for the install confirm.
If NO: use the v0 fallback (stderr + ask_user_question tool if exposed, else throw a specific error code the model can re-issue after user grant).
Document your choice in a code comment at the top of bootstrap.ts.
Create packages/core/src/tools/computer-use/bootstrap.test.ts:
import { describe, it, expect, beforeEach, afterEach, vi } from 'vitest';
import { mkdtempSync, rmSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';
import { runBootstrap, type BootstrapDeps } from './bootstrap.js';
function makeFakeClient(opts: { startThrows?: Error } = {}) {
const start = vi.fn(async () => {
if (opts.startThrows) throw opts.startThrows;
});
return {
isStarted: vi.fn(() => start.mock.calls.length > 0),
start,
callTool: vi.fn(),
stop: vi.fn(),
};
}
describe('runBootstrap', () => {
let tmpHome: string;
let deps: BootstrapDeps;
beforeEach(() => {
tmpHome = mkdtempSync(join(tmpdir(), 'qwen-cu-bs-'));
deps = {
homeDir: tmpHome,
packageSpec: 'open-computer-use@^0.3.0',
platform: 'darwin',
promptInstallApproval: vi.fn(async () => true),
spawnDoctor: vi.fn(),
probePermissions: vi.fn(async () => 'ok' as const),
};
});
afterEach(() => {
rmSync(tmpHome, { recursive: true, force: true });
});
it('starts the client when binary is approved + permissions ok', async () => {
// Pre-seed install state to skip the prompt
const { saveInstallState } = await import('./install-state.js');
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
const client = makeFakeClient();
await runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
);
expect(client.start).toHaveBeenCalledOnce();
expect(deps.promptInstallApproval).not.toHaveBeenCalled();
});
it('prompts for install approval on first call', async () => {
const client = makeFakeClient();
await runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
);
expect(deps.promptInstallApproval).toHaveBeenCalledOnce();
expect(client.start).toHaveBeenCalledOnce();
});
it('throws when user declines install', async () => {
deps.promptInstallApproval = vi.fn(async () => false);
const client = makeFakeClient();
await expect(
runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
),
).rejects.toThrow(/declined/i);
expect(client.start).not.toHaveBeenCalled();
});
it('persists approval on success', async () => {
const client = makeFakeClient();
await runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
);
const { loadInstallState } = await import('./install-state.js');
const state = await loadInstallState(tmpHome);
expect(state?.approvedPackageSpec).toBe('open-computer-use@^0.3.0');
});
it('spawns doctor and polls when permissions are missing', async () => {
const { saveInstallState } = await import('./install-state.js');
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
let probeCount = 0;
deps.probePermissions = vi.fn(async () => {
probeCount++;
return probeCount < 3 ? 'accessibility' : 'ok';
});
deps.pollIntervalMs = 1; // speed up test
deps.pollTimeoutMs = 1000;
const client = makeFakeClient();
await runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
);
expect(deps.spawnDoctor).toHaveBeenCalledOnce();
expect(probeCount).toBeGreaterThanOrEqual(3);
});
it('throws after pollTimeoutMs when permissions never grant', async () => {
const { saveInstallState } = await import('./install-state.js');
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
deps.probePermissions = vi.fn(async () => 'accessibility' as const);
deps.pollIntervalMs = 1;
deps.pollTimeoutMs = 50;
const client = makeFakeClient();
await expect(
runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
),
).rejects.toThrow(/timed out/i);
});
it('skips permission flow on non-darwin platforms', async () => {
const { saveInstallState } = await import('./install-state.js');
await saveInstallState(tmpHome, {
approvedPackageSpec: 'open-computer-use@^0.3.0',
approvedAtIso: '2026-05-28T10:00:00Z',
});
deps.platform = 'linux';
const client = makeFakeClient();
await runBootstrap(
client as never,
{ signal: new AbortController().signal },
deps,
);
expect(deps.spawnDoctor).not.toHaveBeenCalled();
});
});
Run: npm test -- packages/core/src/tools/computer-use/bootstrap.test.ts
Expected: FAIL — many errors
Replace packages/core/src/tools/computer-use/bootstrap.ts with:
/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
/**
* Computer Use bootstrap state machine.
*
* On first invocation of any computer_use__* tool:
* 1. If not yet approved: prompt the user to install (one-time).
* 2. Start the client (lazy npx spawn, may take ~60s first time).
* 3. On macOS only: probe permissions by calling get_app_state on
* Finder. If a permission error surfaces, spawn the upstream
* doctor (which opens the system settings + onboarding window),
* then poll until permissions grant or 10 min timeout.
*
* IMPLEMENTER: pre-step 1 (Task 10 step 1) — verify whether
* qwen-code's BaseDeclarativeTool exposes a `shouldConfirmExecute()`
* pathway from inside `execute()`. If not, `promptInstallApproval`
* defaults to a `process.stderr.write` + readline fallback. The
* dependency-injection design here keeps that decision swappable
* without touching the state machine logic.
*/
import { spawn } from 'node:child_process';
import { homedir } from 'node:os';
import type { ComputerUseClient } from './client.js';
import { isPackageSpecApproved, saveInstallState } from './install-state.js';
import {
detectPermissionError,
type PermissionErrorKind,
} from './permission-detector.js';
export interface BootstrapContext {
signal: AbortSignal;
updateOutput?: (output: string) => void;
}
/** Result of a permission probe. */
export type PermissionProbeResult = 'ok' | PermissionErrorKind;
export interface BootstrapDeps {
homeDir: string;
packageSpec: string;
platform: NodeJS.Platform;
/**
* Prompt the user to approve installing the upstream binary. Returns
* true if approved. Implementation may use the qwen-code confirm
* tool path or a stdin fallback.
*/
promptInstallApproval: (packageSpec: string) => Promise<boolean>;
/**
* Spawn `open-computer-use doctor` (detached). The binary handles
* opening the system settings window itself.
*/
spawnDoctor: () => void;
/**
* Probe the upstream MCP server for permission state by issuing a
* lightweight tool call. Returns 'ok' on success or the kind of
* permission error on failure.
*/
probePermissions: (
client: ComputerUseClient,
) => Promise<PermissionProbeResult>;
/** Poll interval for the permission watcher. Default 2000ms. */
pollIntervalMs?: number;
/** Total poll timeout. Default 10 min. */
pollTimeoutMs?: number;
}
/** Production defaults — instantiated lazily so tests can override per call. */
function defaultDeps(): BootstrapDeps {
return {
homeDir: homedir(),
packageSpec:
process.env['QWEN_COMPUTER_USE_PACKAGE'] ?? 'open-computer-use@latest',
platform: process.platform,
promptInstallApproval: async (spec) => {
// v0 fallback: stderr prompt + stdin read. Replace with
// qwen-code's standard confirm pathway when wired in.
process.stderr.write(
`\n[Computer Use] First-time install\n` +
` Package: ${spec}\n` +
` This will fetch ~50MB from the npm registry the first time.\n` +
` Computer Use can click, type, and read your desktop apps.\n` +
` On macOS you'll be guided through Accessibility and Screen Recording permissions next.\n` +
`Proceed? [y/N] `,
);
// IMPLEMENTER: in real interactive sessions, replace with the
// qwen-code confirm system. For headless / SDK contexts the
// default is to refuse — explicit user opt-in required.
return process.env['QWEN_COMPUTER_USE_AUTO_APPROVE'] === '1';
},
spawnDoctor: () => {
const child = spawn('npx', ['-y', defaultDeps().packageSpec, 'doctor'], {
detached: true,
stdio: 'ignore',
});
child.unref();
},
probePermissions: async (client) => {
// Use Finder as a known-running, always-installed macOS app.
// get_app_state hits AccessibilitySnapshot which is the first
// path that throws permissionDenied.
const result = await client.callTool('get_app_state', { app: 'Finder' });
return detectPermissionError(result) === 'none'
? 'ok'
: detectPermissionError(result);
},
};
}
export async function runBootstrap(
client: ComputerUseClient,
ctx: BootstrapContext,
depsOverride?: Partial<BootstrapDeps>,
): Promise<void> {
const deps: BootstrapDeps = { ...defaultDeps(), ...depsOverride };
const pollIntervalMs = deps.pollIntervalMs ?? 2000;
const pollTimeoutMs = deps.pollTimeoutMs ?? 10 * 60_000;
// Step 1: install approval gate.
const approved = await isPackageSpecApproved(deps.homeDir, deps.packageSpec);
if (!approved) {
ctx.updateOutput?.('Computer Use needs to be installed (first use).');
const ok = await deps.promptInstallApproval(deps.packageSpec);
if (!ok) {
throw new Error(
`Computer Use install declined by user. Re-invoke the tool to be prompted again.`,
);
}
await saveInstallState(deps.homeDir, {
approvedPackageSpec: deps.packageSpec,
approvedAtIso: new Date().toISOString(),
});
}
// Step 2: spawn (idempotent).
if (!client.isStarted()) {
ctx.updateOutput?.('Starting Computer Use...');
await client.start();
}
// Step 3: macOS permission probe + guide.
if (deps.platform !== 'darwin') return;
const probe = await deps.probePermissions(client);
if (probe === 'ok' || probe === 'other') {
// 'other' means an error happened that isn't permission-related.
// We don't block bootstrap on that — let the actual tool call surface it.
return;
}
ctx.updateOutput?.(
`Computer Use needs macOS permissions (${probe}). ` +
`An onboarding window will open — please grant Accessibility and Screen Recording, then this will continue automatically.`,
);
deps.spawnDoctor();
const startedAt = Date.now();
for (;;) {
if (ctx.signal.aborted) {
throw new Error('Computer Use bootstrap aborted.');
}
if (Date.now() - startedAt > pollTimeoutMs) {
throw new Error(
`Computer Use permission grant timed out after ${Math.round(pollTimeoutMs / 1000)}s. Re-invoke the tool to retry.`,
);
}
await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
const next = await deps.probePermissions(client);
if (next === 'ok' || next === 'other') return;
const elapsedSec = Math.round((Date.now() - startedAt) / 1000);
ctx.updateOutput?.(`Waiting for permissions... (${elapsedSec}s)`);
}
}
Run: npm test -- packages/core/src/tools/computer-use/bootstrap.test.ts
Expected: PASS, 7 tests
git add packages/core/src/tools/computer-use/bootstrap.ts packages/core/src/tools/computer-use/bootstrap.test.ts
git commit -m "feat(computer-use): bootstrap state machine (install + permissions)"
promptInstallApproval to qwen-code's confirm systemFiles:
packages/core/src/tools/computer-use/bootstrap.tspackages/core/src/tools/computer-use/tool.tsThis is the task with the most variable scope. IMPLEMENTER: read the investigation result from Task 10 step 1 and wire accordingly. Two scenarios:
Scenario A — BaseToolInvocation supports shouldConfirmExecute():
shouldConfirmExecute() in ComputerUseInvocation to return the install-confirm payload when the package isn't yet approved.execute() proceeds.bootstrap.ts then only handles the post-confirm path (write state, start, permission probe).Scenario B — no in-execute confirm pathway:
Keep the stderr+stdin v0 from Task 10. Document loudly in the README and SKILL.md.
File a follow-up task to add a proper confirm pathway (separate PR).
Step 1: Implement chosen scenario
(Concrete code depends on the investigation; defer detail to implementer.)
Wipe install state:
rm -rf ~/.qwen/computer-use
Launch qwen-code and ask a computer-use question. Confirm the install prompt appears in the chosen UX (confirm dialog or stderr) and that approving it persists state correctly.
git add -A
git commit -m "feat(computer-use): wire install approval to qwen-code confirm UX"
This is a non-coding gate.
rm -rf ~/.qwen/computer-use
rm -rf ~/.npm/_npx
# macOS: revoke permissions
# System Settings → Privacy & Security → Accessibility / Screen Recording
# remove "Open Computer Use.app"
npm run build
# launch qwen-code, ask a computer-use question
Expected sequence:
updateOutput.If any step fails, capture the error and stop. Iterate.
Files:
scripts/sync-computer-use-schemas.tsRuns as part of qwen-code release prep. Spawns npx -y open-computer-use@<pin> mcp, sends tools/list, regenerates schemas.ts.
Create scripts/sync-computer-use-schemas.ts:
#!/usr/bin/env tsx
/**
* Regenerate packages/core/src/tools/computer-use/schemas.ts from a
* live upstream open-computer-use MCP server.
*
* Usage:
* npx tsx scripts/sync-computer-use-schemas.ts [packageSpec]
*
* Defaults packageSpec to `open-computer-use@latest`. The pin written
* into the generated file is whatever spec was used — pass an explicit
* pin (e.g. `[email protected]`) for release builds.
*/
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';
import { writeFile } from 'node:fs/promises';
import { resolve } from 'node:path';
async function main(): Promise<void> {
const packageSpec = process.argv[2] ?? 'open-computer-use@latest';
const transport = new StdioClientTransport({
command: 'npx',
args: ['-y', packageSpec, 'mcp'],
});
const client = new Client(
{ name: 'qwen-code-schema-sync', version: '1.0.0' },
{ capabilities: {} },
);
await client.connect(transport);
const result = await client.listTools();
await client.close();
if (result.tools.length !== 9) {
process.stderr.write(
`WARNING: upstream returned ${result.tools.length} tools, expected 9. Continuing anyway.\n`,
);
}
const schemas: Record<
string,
{ description: string; parameterSchema: unknown }
> = {};
for (const tool of result.tools) {
schemas[tool.name] = {
description: tool.description ?? '',
parameterSchema: tool.inputSchema ?? { type: 'object', properties: {} },
};
}
const out = `/**
* @license
* Copyright 2025 Qwen Team
* SPDX-License-Identifier: Apache-2.0
*/
/**
* Hardcoded schemas for the upstream open-computer-use tools.
*
* Pinned to upstream: ${packageSpec}
* Regenerated by scripts/sync-computer-use-schemas.ts — do not hand-edit.
*/
export interface ComputerUseToolSchema {
description: string;
parameterSchema: Record<string, unknown>;
}
export const COMPUTER_USE_TOOL_NAMES = ${JSON.stringify(
result.tools.map((t) => t.name),
null,
2,
)} as const;
export type ComputerUseToolName = (typeof COMPUTER_USE_TOOL_NAMES)[number];
export const COMPUTER_USE_SCHEMAS: Record<ComputerUseToolName, ComputerUseToolSchema> = ${JSON.stringify(
schemas,
null,
2,
)};
`;
const target = resolve('packages/core/src/tools/computer-use/schemas.ts');
await writeFile(target, out, 'utf8');
process.stdout.write(`Wrote ${result.tools.length} schemas to ${target}\n`);
}
main().catch((err) => {
process.stderr.write(`Schema sync failed: ${err}\n`);
process.exit(1);
});
npx tsx scripts/sync-computer-use-schemas.ts open-computer-use@latest
Expected: schemas.ts is rewritten; npm test -- packages/core/src/tools/computer-use/schemas.test.ts still passes (or fails only on tests that asserted specific hand-written content — adjust those tests if upstream descriptions changed).
git add scripts/sync-computer-use-schemas.ts packages/core/src/tools/computer-use/schemas.ts
git commit -m "chore(computer-use): script to sync schemas from upstream"
computer_use__ prefix consistently across schemas, tool wrapper, and registration.enableComputerUse defaults to true per the user's decision.list_apps / get_app_state without confirming every call).Plan saved to docs/superpowers/plans/2026-05-28-computer-use-built-in.md.
Two execution options:
Which approach?