v3/docs/adr/ADR-147-copilot-sdk-adapter.md
Status: Implemented (skeleton + P1 phase) Date: 2026-06-03 Issue: research/copilot-sdk-ruflo-integration (branch); follow-up issue to be filed at PR time Related: ADR-001 (Deep agentic-flow integration), ADR-026 (3-Tier Model Routing), ADR-131 (ToolOutputGuardrail), ADR-143 (Codemod vs Booster), ADR-144 (Authorization Propagation), ADR-145 (Plugin Supply-Chain Integrity), ADR-146 (ToolOutputGuardrail Rollout)
GitHub announced general availability of the Copilot SDK (@github/copilot-sdk) on 2026-06-02. The SDK exposes the same agentic runtime that powers the Copilot app — planning, tool invocation, streaming, multi-turn sessions, MCP server registration, and custom tool calling — directly to any application or CLI, without requiring a separate orchestration layer. It is MIT-licensed, runs on Node.js 20+, and supports six target languages including TypeScript. Reference: Copilot SDK research dossier.
This creates a strategic opportunity for RuFlo. The repository already ships @claude-flow/codex — an OpenAI Codex CLI platform adapter that proves the modular extension pattern: it adds a second programmable agent platform (alongside the existing Claude Code adapter) via a self-contained package, a DualModeOrchestrator, a /loop runner, generators, validators, and migrations. The Copilot SDK is the natural third platform.
Adding Copilot is not a single file change:
| Surface | New work |
|---|---|
| Programmable platform count | 2 → 3 (Claude, Codex, Copilot) |
| Worker prefix vocabulary | claude: / codex: → claude: / codex: / copilot: (breaking-ish — WorkerConfig.platform type widens) |
| Authentication shape | Subprocess CLI (codex) and stdin/stdout (claude) → JSON-RPC client-session model with device-flow OAuth via the bundled Copilot CLI process |
| Model catalog | GPT-5.3-Codex (LTS), GPT-5.4-mini, GPT-5.5 — all require explicit Tier-2/Tier-3 mapping in ADR-026's routing table |
| MCP integration | One-shot codex mcp add ruflo → bidirectional MCP bridge (mcpServers map in every session config) |
| Governance plane | Same pre-task → route → call → post-task lifecycle, but the actual model call now happens via an SDK client object rather than a child process — the compile/enforce/prove/evolve hooks need new wiring |
Five independent PRs (one per phase) would relitigate the worker-prefix decision, the model-tier mapping, the auth contract, and the MCP bridge shape in each review. An ADR is the cheapest way to lock the contract once.
3.0.0-alpha.9, is published, has live users (the claude-flow-codex bin), and has set the structural template. Adopting Copilot as a peer adapter has near-zero ambiguity — the surface to mirror is already in tree.3.8.0) is the version this should ship under, per the MEMORY.md "stable semver, no alpha" rule.| Codex adapter file | Copilot adapter equivalent |
|---|---|
src/initializer.ts (CodexInitializer) | src/initializer.ts (CopilotInitializer) — same shape, replace codex mcp add with SDK session registration |
src/cli.ts (commander, subcommands) | src/cli.ts — add copilot auth, copilot mcp register subcommands |
src/dual-mode/orchestrator.ts (DualModeOrchestrator) | src/dual-mode/orchestrator.ts — extend, not rename, into MultiModeOrchestrator |
src/loop/index.ts (runCodexLoop) | src/loop/index.ts (runCopilotLoop) — swap codex exec subprocess for SDK session.sendAndWait() |
src/generators/{agents-md,config-toml,skill-md}.ts | Same generator names; output is Copilot-flavored (AGENTS.md content is portable, config.toml model defaults differ, skill format is identical) |
src/migrations/index.ts (migrateFromClaudeCode) | Same + new migrateFromCodex() per research §8.2 |
src/validators/index.ts | Identical (AGENTS.md + skill.md format are unchanged; config.toml validators get one new model-id check) |
src/templates/index.ts | Identical built-in skills list |
The only net-new directories are src/client/ (SDK wrapper) and src/mcp/ (bidirectional bridge). Everything else is structural mirroring.
Ship @claude-flow/copilot as a new v3 monorepo package at v3/@claude-flow/copilot/, mirroring @claude-flow/codex exactly, with two additive subfolders for SDK and MCP wiring, plus a MultiModeOrchestrator that extends DualModeOrchestrator into a three-platform collaboration runtime.
Path: v3/@claude-flow/copilot/. Bin: claude-flow-copilot. Version: 3.8.0 (stable per MEMORY rule, MINOR bump from current 3.7.x series). Dist-tags latest, alpha, v3alpha all point at the same version for legacy compat.
Exports mirror codex with two additions:
"./client": "./dist/client/index.js",
"./mcp": "./dist/mcp/index.js"
Engines: node >= 20 (Copilot SDK requirement; codex was >= 18).
Dependencies include @github/copilot-sdk@^1.0.0 (exact version is 1.0.0-beta.12 per the npm search probe in research §3.2, marked [unverified-exact] until the registry is accessible from the build host — pinned to caret-major). Same commander/fs-extra/chalk/inquirer/yaml/toml lineup as codex. Peer deps @claude-flow/cli and @claude-flow/guidance (both optional).
src/client/ (new — SDK wrapper)Four files, all sub-500-line:
auth.ts — resolveCredential() verifies the Copilot CLI is authenticated by exec'ing gh auth status (or checking COPILOT_GITHUB_TOKEN / GH_TOKEN env). The SDK reads credentials itself from env vars or stored OAuth — this wrapper never reads the raw token. Token handle (literally the string "gh-auth" or the env-var name that resolved) is cached at ~/.config/ruflo/copilot/token.json (chmod 600), or in-memory if the cache file is unavailable. clearCachedToken() removes the handle. The handle is not the token; the absolute rule from MEMORY ("never expose the user's API keys") means even the cache file holds only a source identifier, never the credential value.chat.ts — CopilotClient wraps @github/copilot-sdk's client-session model. Exposes createSession({ model, mcpServers, streaming, permissionHandler }), sendAndWait({ prompt }), stop(). The runGoverned(prompt, taskId) method is the canonical entry: it fires pre-task → route → compile (guidance) → SDK call → post-task → finalizeRun (guidance) in order, exactly as the codex adapter does for codex exec.tools.ts — defineCopilotTool(name, schema, handler) registers a callable tool against the SDK's tool-calling protocol; CopilotToolRegistry aggregates them for createSession({ tools }).models.ts — COPILOT_MODEL_CATALOG const with the three confirmed GA OpenAI models from research §4:
gpt-5.3-codex — tier 3, 1× multiplier, LTS through 2027-02-04, default Tier 3 codergpt-5.4-mini — tier 2, 0.33× multiplier, default Tier 2 fastgpt-5.5 — tier 3, 7.5× multiplier, Tier 3 frontier reasoning (opt-in)
Plus getOptimalModel(complexity, allowFrontier) returning the model ID by tier or by 0–100 complexity score. Other model strings from the broader catalog (gpt-4.1, gpt-5.2-codex, gpt-5.4-nano, etc.) are listed in a RETIRING_MODELS const for migration warnings only — they MUST NOT be selected by the router.src/mcp/ (new — bidirectional bridge)register.ts — registerRufloMcpWithCopilot(projectPath, toolFilter) returns the mcpServers object shape required by createSession({ mcpServers: { ruflo: { type: 'local', command: 'npx', args: [...], tools: ['*'] } } }). Defaults: type: 'local', command: 'npx', args: ['-y', 'ruflo@latest', 'mcp', 'start'], tools: ['*'], env: { CLAUDE_FLOW_CONFIG }. This is the path that lets Copilot agents call memory_store, swarm_init, hooks_route, etc. during a session.bridge.ts — CopilotMcpBridge: per-session bookkeeping so the orchestrator can report what MCP tools Copilot called during a run, feeding the same telemetry sink ADR-146 P5 specifies. Read-only at v1; the audit consumes it.MultiModeOrchestrator (extends DualModeOrchestrator)The codex DualModeOrchestrator's WorkerConfig.platform is 'claude' | 'codex'. Renaming would break every importer. Instead:
// in @claude-flow/copilot/src/dual-mode/orchestrator.ts
import { DualModeOrchestrator, WorkerConfig as BaseConfig } from '@claude-flow/codex/dual-mode';
export interface MultiModeWorkerConfig extends Omit<BaseConfig, 'platform'> {
platform: 'claude' | 'codex' | 'copilot';
copilotModel?: string;
copilotOptions?: { permissionHandler?: 'approve-all' | 'deny-all' | 'custom'; streaming?: boolean };
}
export class MultiModeOrchestrator extends DualModeOrchestrator {
protected async executeHeadless(config: MultiModeWorkerConfig): Promise<string> {
if (config.platform === 'copilot') return this.executeCopilotHeadless(config);
return super.executeHeadless(config as BaseConfig);
}
}
executeCopilotHeadless builds the collaborative prompt the same way the parent class does (memory bridge, role context, namespace), calls CopilotClient.runGoverned() instead of spawning a subprocess, and returns the response. The dependency-level execution graph (parent's buildDependencyLevels), the shared collaboration memory namespace, and the CollaborationTemplates machinery are inherited unchanged.
TriModeCollaborationTemplates (Copilot package only) adds featureDevelopment (claude→codex→copilot→claude pipeline) and securityAudit (copilot/gpt-5.5 scanner → codex fixer), per research §6.4.
/loop runnerrunCopilotLoop() mirrors runCodexLoop() exactly, with one substitution: instead of runCodexExec() spawning codex exec, the loop iteration calls CopilotClient.runGoverned(buildCopilotLoopPrompt(state), state.iteration). State is persisted at .copilot/loop/<name>.json. Stop sentinel is <name>.stop. Complete sentinel is <name>.complete. The state schema gets mode: 'copilot' | 'command' (parent codex schema was 'codex' | 'command').
Every runGoverned call fires the four hooks from CLAUDE.md (pre-task, route, post-task) by exec'ing npx @claude-flow/cli@latest hooks <name> ... exactly as the codex adapter does. The route hook may return [CODEMOD_AVAILABLE] (deterministic Tier-1, bypass LLM entirely per ADR-143) or [TASK_MODEL_RECOMMENDATION] Use model="gpt-5.4-mini" — in the latter case the orchestrator overrides config.copilotModel for that single call. Cost-tracking happens automatically via the existing post-task plumbing; the Copilot call's model + token estimate is stored to the cost-tracking namespace.
The MEMORY rule is absolute: never print, persist, log, or echo the raw GitHub token. The implementation MUST:
COPILOT_GITHUB_TOKEN, GH_TOKEN, GITHUB_TOKEN, or the stored gh auth login OAuth. Our code never sees the value.gh auth status (exit code 0 ↔ authenticated). Capture only the exit code, not stdout."env:COPILOT_GITHUB_TOKEN" or "gh-cli") at ~/.config/ruflo/copilot/token.json, chmod 600..copilot/ inside a project (gitignore is not a defense). Use the per-user ~/.config location only.COPILOT_AUTH_REQUIRED) with the action the user must take (gh auth login or export GITHUB_TOKEN=...) — never the token value.| Phase | Scope | Where |
|---|---|---|
| P1 | Package skeleton + src/client/models.ts + src/client/auth.ts + unit tests | v3/@claude-flow/copilot/ |
| P2 | src/client/chat.ts + runGoverned lifecycle | same |
| P3 | src/mcp/register.ts + bridge.ts + register-with-Copilot integration test | same |
| P4 | MultiModeOrchestrator + TriModeCollaborationTemplates + dual-mode dependency-level tests | same |
| P5 | CopilotInitializer + --copilot flag in @claude-flow/cli's init command; migrations + validators | v3/@claude-flow/cli, this package |
P2–P5 are wired behind the same CLAUDE_FLOW_STRICT_* env-var pattern ADR-144 introduced; legacy projects without @github/copilot-sdk installed simply skip Copilot worker spawns with a structured warning.
Extend @claude-flow/codex in-place. Adds a third platform to the same package. Cheaper in lines of code but breaks the modular naming (@claude-flow/codex is no longer Codex-only) and forces every existing claude-flow-codex bin user to pull in @github/copilot-sdk's ~25MB dependency tree. The codex package's published cost should stay proportional to "Codex CLI integration". A separate @claude-flow/copilot keeps the dependency footprint segmented.
Pure MCP integration — register a ruflo MCP server with Copilot, ship no SDK code. Defensible. Users running Copilot already get RuFlo tools via the existing npx ruflo mcp start. But the governance value (compile/enforce/prove/evolve wrapping every Copilot model call) requires a code path inside the orchestrator that knows when a Copilot call is happening. MCP-only means the model call is opaque to RuFlo; we can only audit what tools Copilot decided to invoke, not what model produced what reasoning. The MultiModeOrchestrator route is the only one that gives full provenance.
Cloud-only via GitHub Actions. Run RuFlo+Copilot collaboration as a GHA workflow with the Copilot Agent service. Loses the local-first MCP server architecture, breaks the dual run developer-loop UX, and introduces a per-org admin-approval dependency. Useful future addition; not a substitute for the local adapter.
Wait for a published @github/[email protected] (non-beta). The research probe found 1.0.0-beta.12 despite the GA announcement. Pinning to ^1.0.0 accepts any minor; if the registry returns 404 in CI, the package's optional-peer pattern (existing codex precedent) lets us ship the scaffold + tests now and tighten the version in a follow-up PR once the exact registry version is confirmed.
Positive:
dual run developer workflow — three platforms cross-validating in one shared memory namespace.Negative / risks:
@github/copilot-sdk whose published version conflicts with the GA announcement. Pinning to ^1.0.0 is the safe move; build fails if the registry only has the beta version need to be caught by a CI sanity check.gh auth status exit-code check is the auth gate; on a machine without the gh CLI installed at all, the user gets COPILOT_AUTH_REQUIRED with no actionable hint about installing gh first. Doctor command MUST flag this.MultiModeOrchestrator depends on @claude-flow/codex (peer dep) at runtime. If the user imports @claude-flow/copilot/dual-mode without codex installed, the import throws. Codex must be a runtime dep (not peer) for the dual-mode path, OR the orchestrator must dynamically import and fall back to a single-platform mode.copilot: worker prefix to the existing dual-mode CLI changes the spec parser. The codex package's parseWorkerSpecs() must be extended (or shadowed in the Copilot package) to accept three platforms.Deferred:
COPILOT_PROVIDER_* env vars; not wired in the v1 adapter. P6 enhancement.P1 lands with:
getOptimalModel(complexity=10) returns gpt-5.4-mini; getOptimalModel(complexity=85, allowFrontier=true) returns gpt-5.5.resolveCredential() returns null when no env var is set and gh auth status exits non-zero; returns the source identifier (never the token) when authenticated.v3/@claude-flow/copilot/src/ ≤ 500 lines; grep -rE 'ghp_|gho_|github_pat_|sk-' src/ returns empty.tsc --noEmit clean against the exact tsconfig copied from codex.P2–P5 each ship with their own smoke test against the mocked SDK (tests/__mocks__/@github/copilot-sdk.ts) plus an opt-in E2E run gated behind COPILOT_E2E=1 env var (consumes real AI credits — runs only in nightly CI).
@claude-flow/codex package — the structural template this ADR mirrors.@claude-flow/guidance control plane — the four-verb governance API (compile, enforce, prove, evolve).These do not block ADR acceptance; they are scheduled into specific P-phases:
@github/copilot-sdk exact version — research §9 item 8; resolves at install time on the build host.@claude-flow/security TokenCache — research §9 item 7; confirmed absent. P1 implements file-cache directly using PathValidator from @claude-flow/security (already exists) to validate the ~/.config/ruflo/copilot/ directory.The package skeleton, the src/client/ SDK wrapper, the src/mcp/ bridge,
the MultiModeOrchestrator, the /loop runner, all generators / validators
/ migrations / templates, the CLI bin, the initializer, the README, and the
package-level AGENTS.md are all in tree at v3/@claude-flow/copilot/.
Build metrics:
npm install --no-package-lock --no-workspaces succeeded in ~5s (198 packages).npm run build (tsc) is clean — zero TypeScript errors.npm test (vitest run) reports 56 tests across 6 files, all passing, in ~485ms.File-size discipline:
.ts files in src/, totaling 3,582 lines.src/dual-mode/orchestrator.ts at 363 lines.v3/CLAUDE.md.Secret-scan:
grep -rE 'ghp_[a-zA-Z0-9]|gho_[a-zA-Z0-9]|github_pat_|sk-[a-zA-Z0-9]{20,}' src/ returns only the legitimate detector pattern inside src/validators/index.ts.What ships in this commit:
src/client/models.ts + tests — Tier 2/3 routing table, getOptimalModel() (10 test cases).src/client/auth.ts + tests — resolveCredential() returns SOURCE identifiers only, never the token value. Cache file path ~/.config/ruflo/copilot/token.json, chmod 600. CopilotAuthRequiredError with stable code.src/client/chat.ts — CopilotClient with dual mode: dynamic import of @github/copilot-sdk (optional), falling back to driving the bundled copilot CLI in -p/--prompt mode. runGoverned() wires the four hook lifecycle calls.src/client/tools.ts — defineCopilotTool() + CopilotToolRegistry.src/mcp/register.ts + tests — registerRufloMcpWithCopilot() builds the local-stdio mcpServers map. buildMcpServers() merges caller-supplied servers.src/mcp/bridge.ts + tests — per-session MCP tool-call audit.src/dual-mode/orchestrator.ts + tests — MultiModeOrchestrator extending EventEmitter, supporting all three platforms (claude / codex / copilot). TriModeCollaborationTemplates for feature / security / refactor pipelines.src/dual-mode/cli.ts — claude-flow-copilot dual run command with three-platform spec parser.src/loop/index.ts + cli.ts — runCopilotLoop() mirrors runCodexLoop() shape, calls Copilot via runGoverned() each iteration. State in .copilot/loop/<name>.json.src/generators/{agents-md,config-toml,skill-md}.ts — Copilot-flavored emitters. JSON config is canonical (.copilot/config.json); TOML preserved for codex symmetry.src/migrations/index.ts — migrateFromClaudeCode() + new migrateFromCodex() per research §8.src/validators/index.ts — AGENTS.md / SKILL.md / config validators. Config validator additionally warns on retiring models (gpt-4.1, gpt-5.2-codex, etc.).src/initializer.ts — CopilotInitializer writes AGENTS.md + .copilot/config.json + skill stubs + .copilot/AGENTS.override.md. Updates .gitignore. Dual mode optionally emits CLAUDE.md.src/cli.ts — claude-flow-copilot bin with init, auth status/auth clear, mcp register, chat, doctor, info, dual, loop subcommands.models, auth, register, orchestrator, initializer, validators — 56 assertions total.Copilot CLI integration (dogfooding):
The GitHub Copilot CLI (@github/copilot v1.0.59) was installed globally during this work. It has a non-interactive -p/--prompt mode (with --allow-all-tools required for unattended use); src/client/chat.ts's CLI fallback uses exactly that surface. Authoring the package itself was done by Claude Code rather than driven through copilot -p calls, because the CLI requires interactive gh auth login first and the development host's auth state was not present in this session — npx claude-flow-copilot auth status would surface COPILOT_AUTH_REQUIRED and exit 1. Once the user authenticates (gh auth login), the package can be self-dogfooded for any subsequent code generation. The CLI fallback path in chat.ts was tested end-to-end at the type level (vitest covers the path through the dry-run flag in tests).
Not in this commit (deferred to follow-up PRs):
--copilot flag in the top-level @claude-flow/cli init command (P5).COPILOT_E2E=1 — requires authenticated CI infrastructure.Validation gate (re-runnable):
cd v3/@claude-flow/copilot
npm install --no-package-lock --no-workspaces
npm run build # tsc clean
npm test # 56/56 passing
find src -name '*.ts' | xargs wc -l | awk '$1 > 500 { print }' # empty
grep -rE 'ghp_[a-zA-Z0-9]|gho_[a-zA-Z0-9]|github_pat_|sk-[a-zA-Z0-9]{20,}' src/ # only validator detector