docs/codex-first-control-plane-roadmap.md
This document proposes how Flow should evolve from "helpful CLI + local skills" into a Codex-first control plane where the user stays inside Codex and Flow handles routing, memory, execution, and learning behind the scenes.
Target state:
Example desired behavior:
- `document it` resolves to the docs write flow
- `continue the last deploy investigation` finds the right session/worktree
- `forge doc`, `forge linear inspect`, or repo-specific wrappers

Current Flow has strong building blocks but they are still separate:
But the user still pays too much cognitive cost:
- `L` and repo-specific launchers carry logic outside Flow

The result is "good pieces, weak control plane".
These are enough to start. The missing work is unification.
Flow should target current upstream Codex directly.
That means:
- `skills/list`, and `thread/*`
- `skills/list` `perCwdExtraUserRoots` and in-process app-server clients as accelerators, not prerequisites

Add a Flow-managed warm control layer, either as an extension of `ai-taskd`, a focused `codexd`, or a lighter in-process broker where that is enough for the current upstream Codex client surface.
Responsibilities:
This should absorb behavior that currently lives in wrappers like L.
Promote Forge-style phrase aliasing into Flow as a generic feature.
Each intent has:
Examples:
- `doc-it`
- `linear-reference`
- `session-recover`
- `review-intent-comment`

Intent matching must stay deterministic and cheap.
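
One way to keep matching deterministic and cheap is an exact lookup on normalized phrases. The sketch below is illustrative only: the phrase table is copied from the proposed `[[codex.intent]]` entries in this roadmap, while `normalize`, `build_index`, and `match_intent` are hypothetical names and the normalization rules are an assumption, not Flow's actual behavior.

```python
import re
from typing import Dict, List, Optional

# Phrase table mirroring the proposed [[codex.intent]] entries (illustrative subset).
INTENTS: Dict[str, List[str]] = {
    "doc-it": ["doc it", "document it", "write this down", "save this in docs"],
    "session-recover": ["what was i doing", "recover recent context", "continue the work"],
}


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(cleaned.split())


def build_index(intents: Dict[str, List[str]]) -> Dict[str, str]:
    """Map each normalized phrase to one intent name; reject ambiguous phrases."""
    index: Dict[str, str] = {}
    for name, phrases in intents.items():
        for phrase in phrases:
            key = normalize(phrase)
            if index.get(key, name) != name:
                raise ValueError(f"phrase {phrase!r} maps to two intents")
            index[key] = name
    return index


def match_intent(text: str, index: Dict[str, str]) -> Optional[str]:
    """Exact lookup on the normalized input: a single dict probe, no model call."""
    return index.get(normalize(text))


INDEX = build_index(INTENTS)
```

Because the index is built once and ambiguous phrases fail at build time, the same input always yields the same intent, and anything that does not match falls through to Codex unchanged.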
Flow should ship a generic resolver layer for pasted references:
Resolvers return structured payloads, not prose. Repo-local executors like Forge can register resolver commands for domain-specific expansion.
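
As a rough illustration of "structured payloads, not prose", the sketch below matches a pasted reference against glob patterns and returns a plan as a dictionary. The resolver entry is copied from the proposed `[[codex.reference_resolver]]` config; the function name `plan_resolution`, the payload keys, and the per-token `{{ref}}` substitution are assumptions. The sketch only builds the command line and does not run it.

```python
import fnmatch
import shlex
from typing import List, Optional

# Resolver entry copied from the proposed flow.toml additions.
RESOLVERS = [
    {
        "name": "linear",
        "match": ["https://linear.app/*/issue/*", "https://linear.app/*/project/*"],
        "command": "forge linear inspect {{ref}} --json",
        "inject_as": "linear",
    },
]


def plan_resolution(ref: str, resolvers: List[dict]) -> Optional[dict]:
    """Return a structured plan for the first resolver whose glob matches ref."""
    for resolver in resolvers:
        if any(fnmatch.fnmatchcase(ref, pattern) for pattern in resolver["match"]):
            # Substitute per token so the pasted reference is never re-parsed by a shell.
            argv = [
                ref if token == "{{ref}}" else token
                for token in shlex.split(resolver["command"])
            ]
            return {
                "resolver": resolver["name"],
                "inject_as": resolver["inject_as"],
                "argv": argv,
            }
    return None
```

A caller could hand `argv` to a subprocess runner and attach the JSON output under the `inject_as` key; references that match no resolver return `None` and pass through untouched.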
Split Codex knowledge into two layers:
Examples:
- `document it`
Runtime skills should expire automatically and be bounded by a strict budget.
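
A minimal sketch of automatic expiry plus a strict budget follows. The 1200-character default mirrors `runtime_skill_budget_chars` in the proposed config; the `RuntimeSkill` fields, the priority ordering, and the function name are hypothetical, since this roadmap does not fix a data model or a ranking policy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class RuntimeSkill:
    name: str
    text: str
    expires_at: float  # epoch seconds (hypothetical field)
    priority: int = 0  # higher is admitted first (hypothetical field)


def select_runtime_skills(
    skills: List[RuntimeSkill], now: float, budget_chars: int = 1200
) -> Tuple[List[RuntimeSkill], int]:
    """Drop expired skills, then admit by priority until the budget is spent."""
    live = [s for s in skills if s.expires_at > now]
    # Sort by priority, using the name as a deterministic tie-breaker.
    live.sort(key=lambda s: (-s.priority, s.name))
    selected: List[RuntimeSkill] = []
    used = 0
    for skill in live:
        if used + len(skill.text) <= budget_chars:
            selected.append(skill)
            used += len(skill.text)
    return selected, used
```

Skills that would overflow the budget are skipped rather than truncated, so the injected context never exceeds the limit and never contains partial skill text.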
Use router telemetry plus transcript mining to propose:
Important:
Add a small command family around the new control plane:
f codex open [query]
f codex resolve "<text-or-url>" [--json]
f codex runtime
f codex runtime show
f codex runtime clear
f codex teach suggest
f codex teach accept <intent-or-suggestion-id>
f codex teach reject <intent-or-suggestion-id>
f codex doctor
f codexd start|stop|status
Intended behavior:
- `f codex open` replaces personal wrappers like `L`
- `f codex resolve` shows what Flow would unroll or route before Codex sees it
- `f codex runtime show` explains which runtime skills/context are active
- `f codex teach suggest` presents evidence-backed alias/intent suggestions
- `f codex doctor` exposes repo path, active app-server connection, runtime budget, skill count, and recent resolver hits

Proposed `flow.toml` additions:
[codex]
control_plane = "daemon"
warm_app_server = true
runtime_skill_budget_chars = 1200
auto_resolve_references = true
auto_learn = "suggest-only"
[codex.session]
open_command = "codex"
prefer_last_active = true
repo_scoped_lookup = true
[[codex.intent]]
name = "doc-it"
phrases = ["doc it", "document it", "write this down", "save this in docs"]
resolver = "docs.route_write"
scope = ["repo", "personal"]
[[codex.intent]]
name = "session-recover"
phrases = ["what was i doing", "recover recent context", "continue the work"]
resolver = "session.recover"
[[codex.reference_resolver]]
name = "linear"
match = ["https://linear.app/*/issue/*", "https://linear.app/*/project/*"]
command = "forge linear inspect {{ref}} --json"
inject_as = "linear"
[[codex.reference_resolver]]
name = "docs"
match = ["doc it", "document it"]
command = "forge doc route --title {{title}} --json"
inject_as = "docs"
Also add a personal/global config file for user-specific phrase preferences:
`~/.config/flow/codex-intents.toml`

Use this for personal language variants that should not live in repo config.
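
How the personal file combines with repo config is not specified here. One plausible policy, sketched below on already-parsed intent tables, is to union phrases per intent name and let repo values win for every other key. The function name `merge_intents` and the example phrase `note this` are hypothetical.

```python
from typing import Dict, List


def merge_intents(repo_intents: List[dict], personal_intents: List[dict]) -> List[dict]:
    """Union phrases per intent name; repo phrases come first, duplicates dropped."""
    merged: Dict[str, dict] = {}
    for source in (repo_intents, personal_intents):
        for intent in source:
            entry = merged.setdefault(
                intent["name"], {"name": intent["name"], "phrases": []}
            )
            for key, value in intent.items():
                if key not in ("name", "phrases"):
                    # First writer wins, so repo config takes precedence.
                    entry.setdefault(key, value)
            for phrase in intent.get("phrases", []):
                if phrase not in entry["phrases"]:
                    entry["phrases"].append(phrase)
    return list(merged.values())


# Example: a personal variant extends the repo-defined doc-it intent.
repo = [{"name": "doc-it", "phrases": ["doc it", "document it"], "resolver": "docs.route_write"}]
personal = [{"name": "doc-it", "phrases": ["note this", "doc it"]}]
merged = merge_intents(repo, personal)
```

Under this policy a personal file can add wording but cannot redirect a repo intent to a different resolver, which keeps repo behavior predictable across users.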
codexd should own:
- `f skills reload` and `f ai codex ...` flows

It should not:
The runtime layer needs hard limits:
Budget policy should prefer:
Inputs:
Outputs:
Approval model:
Forge should remain the Prom executor for Prom-specific workflows.
Flow should absorb the generic pieces Forge proved useful:
That means:
- `forge linear inspect`, `forge doc`, and similar domain commands
- `L`-style session open/recover behavior into `f codex open`
- `doctor` view for current skill/runtime state
- `codexd` with persistent app-server connection per repo
- `f codex runtime show`

The highest-value first slice is:

- `f codex open`
- `codexd` with warm repo-scoped app-server
- `f codex resolve`
- `f codex runtime show`

Why this first:

- `f codex open` latency

The target system is not "more AGENTS text" and not "more commands for the user to remember".
It is:
That is how Flow becomes truly Codex-first while keeping context cost low.