packages/omo-codex/plugin/skills/start-work/SKILL.md
This skill ports the OpenCode /start-work flow onto Codex. Any OpenCode-only tool name in an inherited example must be translated to its Codex equivalent:
| OpenCode example | Codex tool to use |
|---|---|
| task(subagent_type="explore", ...) | spawn_agent({"task_name":"...","message":"TASK: act as an explorer. ...","fork_turns":"none"}) |
| task(subagent_type="librarian", ...) | spawn_agent({"task_name":"...","message":"TASK: act as a librarian. ...","fork_turns":"none"}) |
| task(subagent_type="plan", ...) | spawn_agent({"task_name":"...","message":"TASK: act as a planning agent. ...","fork_turns":"none"}) |
| task(subagent_type="oracle", ...) for final verification | spawn_agent({"task_name":"...","message":"TASK: act as a rigorous reviewer. ...","fork_turns":"none"}) |
| task(category="...", ...) for implementation or QA | spawn_agent({"task_name":"...","message":"TASK: act as an implementation or QA worker. ...","fork_turns":"none"}) |
| background_output(task_id="...") | wait_agent(...) for mailbox signals; after a timeout, run one list_agents check for the named child if reassurance is needed |
| dispatchInternalPrompt(...) | the Stop hook emits {"decision":"block","reason":"<prompt>"} automatically; see Continuation |
| team_*(...) | spawn_agent + send_message + followup_task + wait_agent + close_agent |
When translating load_skills=[...], name the skills inside the spawned agent's message. If a code block below conflicts with this section, this section wins.
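A minimal sketch of the translation above, assuming a hypothetical helper named translate_task. It only builds the three payload fields the table shows; the wording used to name skills in the message is an assumption, not part of the Codex schema.

```python
# Hypothetical helper: maps an OpenCode task(...) call onto the
# task_name / message / fork_turns fields shown in the table above.
ROLE_PHRASES = {
    "explore": "an explorer",
    "librarian": "a librarian",
    "plan": "a planning agent",
    "oracle": "a rigorous reviewer",
}


def translate_task(task_name, subagent_type, prompt, load_skills=()):
    """Build a spawn_agent payload from an OpenCode-style task call."""
    message = f"TASK: act as {ROLE_PHRASES[subagent_type]}. {prompt}"
    if load_skills:
        # load_skills has no Codex field, so the skills are named in the message.
        message += " Use these skills: " + ", ".join(load_skills) + "."
    return {"task_name": task_name, "message": message, "fork_turns": "none"}


payload = translate_task(
    "map-auth", "explore", "Find every auth entry point.", ["debugging"]
)
```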
Every spawn_agent message must be self-contained. Start with
TASK: <imperative assignment>, then name DELIVERABLE, SCOPE, and
VERIFY. State that it is an executable assignment, not a context
handoff. Role or specialty instructions belong inside message; the
Codex tool schema only accepts task_name, message, and fork_turns.
Prefer fork_turns: "none" unless full history is truly
required; paste only the context the child needs.
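The self-containment rules above can be checked before a spawn. This is a sketch under stated assumptions: check_spawn_payload is a hypothetical name, and the check is a plain substring test rather than anything Codex enforces.

```python
ALLOWED_KEYS = {"task_name", "message", "fork_turns"}
REQUIRED_MARKERS = ("DELIVERABLE", "SCOPE", "VERIFY")


def check_spawn_payload(payload):
    """Return a list of problems; an empty list means the payload looks self-contained."""
    problems = []
    extra = set(payload) - ALLOWED_KEYS
    if extra:
        problems.append(f"unsupported keys: {sorted(extra)}")
    message = payload.get("message", "")
    if not message.startswith("TASK:"):
        problems.append("message must start with TASK:")
    for marker in REQUIRED_MARKERS:
        if marker not in message:
            problems.append(f"message does not name {marker}")
    return problems


good = {
    "task_name": "fix-login",
    "message": (
        "TASK: fix the login redirect. This is an executable assignment, "
        "not a context handoff. DELIVERABLE: patch and test output. "
        "SCOPE: src/auth only. VERIFY: run the auth test suite."
    ),
    "fork_turns": "none",
}
```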
Plan and reviewer agents may run for a long time; spawn them in the background, keep doing independent root work, and poll with short wait_agent cycles sized to the work. Never use a single long blocking wait for them, and never spin on tiny timeouts as a failure budget.
Treat child status as a progress signal, not a timeout counter. For
work likely to exceed one wait cycle, require the child to send
WORKING: <task> - <current phase> before long reading, testing, or
review passes, and BLOCKED: <reason> only when it cannot progress.
While any child is active, keep the parent visibly alive with active
subagent count, agent names, latest WORKING: phase, and whether the
parent is waiting for mailbox updates. Track spawned agent names
locally. Use wait_agent for mailbox signals, not proof of completion.
A timeout only means no new mailbox update arrived; after a timeout,
run a single list_agents check for the named child when you need
reassurance. If it is running or its latest message is WORKING:,
treat it as alive. Do not use list_agents as a polling loop or status
feed; it can replay large payloads. Fallback only when the child is
completed without the deliverable, ack-only after followup, explicitly
BLOCKED:, or no longer running. Then record the result as
inconclusive, do not count it as pass/review approval, close if safe,
and respawn a smaller fork_turns: "none" task with the missing
deliverable.
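One way to read the liveness and fallback rules above, as a sketch. The function name and its arguments are hypothetical summaries of a single list_agents check, not real Codex fields, and the order of the checks is an assumption.

```python
def classify_child(running, latest_message, has_deliverable, ack_only=False):
    """Classify a child after one list_agents check: alive, done, or fallback."""
    if latest_message.startswith("BLOCKED:") or ack_only:
        return "fallback"  # explicitly blocked, or ack-only after followup
    if has_deliverable:
        return "done"
    if running or latest_message.startswith("WORKING:"):
        return "alive"  # a wait timeout alone is not a failure
    return "fallback"  # completed without the deliverable, or no longer running
```

On "fallback", the result is recorded as inconclusive and a smaller task is respawned, as described above.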
Execute a Prometheus work plan until every top-level checkbox is complete. This skill pairs with the Codex Stop / SubagentStop continuation hook in components/start-work-continuation, which re-injects the next turn while .omo/boulder.json says the current codex:<session_id> still has unchecked plan work.
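A rough sketch of the continuation decision described above. It is not the hook in components/start-work-continuation: stop_decision, the reason wording, and the column-zero checkbox pattern are all assumptions made for illustration.

```python
import re

# Assumption: a top-level checkbox is an unchecked item starting at column 0.
UNCHECKED = re.compile(r"^- \[ \] ", re.MULTILINE)


def stop_decision(boulder, plan_text, session_id):
    """Return the hook output as a dict, or None to let the turn end."""
    work = boulder.get("works", {}).get(boulder.get("active_work_id"))
    if not work or work.get("status") != "active":
        return None
    if f"codex:{session_id}" not in work.get("session_ids", []):
        return None  # another session owns this work
    remaining = len(UNCHECKED.findall(plan_text))
    if remaining == 0:
        return None
    reason = f"Continue {work['active_plan']}: {remaining} unchecked top-level checkbox(es) remain."
    return {"decision": "block", "reason": reason}
```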
$start-work [plan-name] [--worktree <absolute-path>]
- plan-name is optional. It may be a full or partial file stem under .omo/plans/.
- --worktree is optional. Use it only when the user explicitly asks to work in a separate git worktree.

1. Read .omo/boulder.json if it exists.
2. List the plans under .omo/plans/.
3. If plan-name was provided, select the matching plan.

When the user explicitly said start work / $start-work and no selectable plan exists, treat that phrase as approval to create the plan before execution. Do not stall on a missing plan and do not ask for generic approval again.
If no selectable plan exists, bootstrap ulw-plan before execution.
Execution requires an approved plan before implementation; bootstrap mode creates that approved plan from the user's start work request instead of skipping planning.
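The plan-name matching and the no-plan bootstrap case described above can be sketched as follows. match_plans is a hypothetical helper, and preferring an exact stem over partial matches is an assumption.

```python
from pathlib import PurePosixPath


def match_plans(plan_paths, plan_name):
    """Return plans whose file stem equals or contains plan_name."""
    stems = {PurePosixPath(p).stem: p for p in plan_paths}
    if plan_name in stems:
        return [stems[plan_name]]  # a full stem match wins
    return sorted(p for stem, p in stems.items() if plan_name in stem)


plans = [".omo/plans/auth-refactor.md", ".omo/plans/auth.md"]
```

An empty result means no selectable plan exists, which is the case where ulw-plan is bootstrapped.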
1. Run the ulw-plan skill from the current request and require its dynamic adversarial workflow: collect, verify, design, adversarial plan-review, synthesize.
2. Write the plan to .omo/plans/<slug>.md before implementation or Boulder state writes that point at plan work.
3. The user's start work request is the bootstrap approval to create the plan and begin execution.

Write .omo/boulder.json before implementation starts. Session ids must be prefixed with codex: so the continuation hook can identify its own session.
{
  "schema_version": 2,
  "active_work_id": "<work-id>",
  "works": {
    "<work-id>": {
      "work_id": "<work-id>",
      "active_plan": ".omo/plans/<plan-name>.md",
      "plan_name": "<plan-name>",
      "session_ids": ["codex:<session_id>"],
      "status": "active",
      "worktree_path": null
    }
  }
}
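A sketch of building that state in code, assuming a hypothetical build_boulder_state helper. The field names come from the example above; stripping an existing prefix before adding it back is an assumption meant to avoid codex:codex: ids.

```python
import json


def build_boulder_state(work_id, plan_name, session_id, worktree_path=None):
    """Return a schema_version 2 Boulder state with a codex:-prefixed session id."""
    if session_id.startswith("codex:"):
        session_id = session_id[len("codex:"):]
    return {
        "schema_version": 2,
        "active_work_id": work_id,
        "works": {
            work_id: {
                "work_id": work_id,
                "active_plan": f".omo/plans/{plan_name}.md",
                "plan_name": plan_name,
                "session_ids": [f"codex:{session_id}"],
                "status": "active",
                "worktree_path": worktree_path,
            }
        },
    }


state = build_boulder_state("w1", "auth-refactor", "abc123")
serialized = json.dumps(state, indent=2)
```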
If --worktree is set, verify the path with git worktree list --porcelain or create it with git worktree add <path> <branch-or-HEAD>, then store the absolute path as worktree_path. All edits, commands, tests, and evidence capture must run inside that worktree.
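The verification step can be sketched as a pure parser over the output of git worktree list --porcelain, where each worktree record begins with a line of the form "worktree <path>". The function names and the sample paths are hypothetical.

```python
def worktree_paths(porcelain_output):
    """Extract worktree paths from `git worktree list --porcelain` output."""
    prefix = "worktree "
    return [
        line[len(prefix):]
        for line in porcelain_output.splitlines()
        if line.startswith(prefix)
    ]


def is_registered_worktree(porcelain_output, path):
    """True when the absolute path is already a known worktree."""
    return path in worktree_paths(porcelain_output)


sample = (
    "worktree /repo\nHEAD abc123\nbranch refs/heads/main\n\n"
    "worktree /repo-wt/feature\nHEAD def456\nbranch refs/heads/feature\n"
)
```

When the path is not registered, the skill creates it with git worktree add before storing worktree_path.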
1. Read the unchecked top-level checkboxes under ## TODOs or ## Final Verification Wave.
2. Run independent sub-tasks in parallel with spawn_agent; serialize only when one sub-task has a named dependency on another.

Each sub-task message must include:

- A concrete QA scenario (the exact tool and inputs, such as curl, send-keys, page.click, payload, selectors, and the binary observable that decides PASS/FAIL), not "verify it works":
  - curl -i against the live endpoint.
  - A tmux session driven with send-keys, dumped via capture-pane.

Apply ultraqa's 9 adversarial classes where relevant to each checkbox: malformed input, prompt injection, cancel/resume, stale state, dirty worktree, hung or long commands, flaky tests, misleading success output, repeated interruptions. A checkbox whose behavior is user-visible MUST probe every class that plausibly applies; record which classes were exercised and which were ruled not-applicable with a one-line reason.
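The bookkeeping for the nine classes can be sketched as below. The snake_case identifiers are assumed spellings of the class names listed above, and adversarial_record with its record shape is hypothetical.

```python
# Assumed identifiers for the nine classes named above.
ADVERSARIAL_CLASSES = (
    "malformed_input", "prompt_injection", "cancel_resume", "stale_state",
    "dirty_worktree", "hung_or_long_commands", "flaky_tests",
    "misleading_success_output", "repeated_interruptions",
)


def adversarial_record(exercised, ruled_out):
    """exercised maps class to observable result; ruled_out maps class to a one-line reason."""
    known = set(ADVERSARIAL_CLASSES)
    unknown = (set(exercised) | set(ruled_out)) - known
    if unknown:
        raise ValueError(f"unknown classes: {sorted(unknown)}")
    missing = known - set(exercised) - set(ruled_out)
    if missing:
        raise ValueError(f"unclassified classes: {sorted(missing)}")
    record = []
    for name in ADVERSARIAL_CLASSES:
        if name in exercised:
            record.append({"class": name, "status": "exercised", "result": exercised[name]})
        else:
            record.append({"class": name, "status": "not-applicable", "reason": ruled_out[name]})
    return record
```

Requiring every class to appear in one of the two maps keeps a class from being skipped silently.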
For each checkbox, complete all five gates before marking it done:
Append evidence to .omo/start-work/ledger.jsonl using one JSON object per line. Include at least event, plan, task, session_id, commands, artifact, adversarial_classes, and cleanup fields. adversarial_classes lists each probed class with its observable result and each ruled-out class with a one-line reason.
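A minimal sketch of the ledger append, assuming hypothetical helpers ledger_line and append_ledger. The required field names come from the paragraph above; the example values are placeholders.

```python
import json

REQUIRED_FIELDS = (
    "event", "plan", "task", "session_id",
    "commands", "artifact", "adversarial_classes", "cleanup",
)


def ledger_line(entry):
    """Serialize one ledger entry as a single JSON line."""
    missing = [field for field in REQUIRED_FIELDS if field not in entry]
    if missing:
        raise ValueError(f"ledger entry missing fields: {missing}")
    return json.dumps(entry, separators=(",", ":")) + "\n"


def append_ledger(path, entry):
    """Append the entry to the JSONL ledger without rewriting earlier lines."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(ledger_line(entry))


entry = {
    "event": "task-completed",
    "plan": ".omo/plans/<plan-name>.md",
    "task": "<task id/title>",
    "session_id": "codex:<session_id>",
    "commands": ["<exact command + result>"],
    "artifact": "<artifact path>",
    "adversarial_classes": [],
    "cleanup": ["<receipt>"],
}
```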
A worker done claim is never final. Each implementation sub-task returns a DoneClaim, then a different context runs AdversarialVerify, then the verifier probes or reproduces the claim, then failures loop back to the executor, and only a confirmed verifier verdict becomes FullyDone.
{
  "DoneClaim": {
    "task": "<task id/title>",
    "changed_files": ["path"],
    "tests": ["exact command + result"],
    "manual_qa": ["artifact path"],
    "cleanup": ["receipt"],
    "risks": ["known risk or none"]
  },
  "AdversarialVerify": {
    "verdict": "confirmed | false-positive | needs-fix | needs-human-review",
    "evidence": ["file path, command, log, artifact, or explicit not inspected"],
    "repro": "exact command or manual steps when available",
    "confidence": 0.0
  }
}
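The claim-then-verify loop can be sketched as follows. run_checkbox, the callables it takes, and the max_rounds cap are assumptions for illustration; the skill itself sets no round limit.

```python
def run_checkbox(execute, verify, max_rounds=3):
    """Loop DoneClaim -> AdversarialVerify until the verdict is confirmed.

    execute(feedback) returns a DoneClaim dict; verify(claim) returns an
    AdversarialVerify dict produced in a different context.
    """
    feedback = None
    for _ in range(max_rounds):
        claim = execute(feedback)
        result = verify(claim)
        if result["verdict"] == "confirmed":
            return True, result  # FullyDone
        feedback = result  # failures loop back to the executor
    return False, feedback  # still blocked; the checkbox stays unchecked
```

Any verdict other than confirmed keeps the checkbox open, so a worker's own done claim never closes it.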
Rules:
- confirmed is the only pass verdict. false-positive, needs-fix, and needs-human-review all block checkbox completion.
- The verifier must be codex-ultrawork-reviewer, a scoped worker reviewer, or root only when root did not implement or materially rewrite that task.
- The verifier must probe the applicable adversarial classes, including stale_state, dirty_worktree, and misleading_success_output, before allowing FullyDone.

Only after verification passes:
1. Change the checkbox from - [ ] to - [x].
2. Append a task-completed ledger entry.

When all top-level checkboxes in ## TODOs and ## Final Verification Wave are complete:
1. Run the review-work skill with the final diff, changed files, user goal, constraints, run command, and verification evidence. All five review lanes must return PASS. A timeout, missing deliverable, ack-only child, BLOCKED:, or inconclusive lane is a gate failure, not approval.
2. Record the gate verdict and evidence in .omo/start-work/ledger.jsonl.
3. If any lane fails, use the debugging skill, confirm root cause with runtime evidence, add the minimal failing test or reproduction, fix it, rerun the affected verification, then rerun the Global Review and Debugging Gate.
4. Redact evidence before writing it to .omo/start-work/ledger.jsonl, a PR body, or a handoff. Never include raw tokens, credentials, auth headers, cookies, API keys, env dumps, private logs, or PII; use concise summaries, lengths, hashes, or short non-sensitive prefixes instead.
5. Check git status and the PR/branch state after the gate, and include only redacted review/debugging evidence in the PR body or handoff.
6. If a worktree was used, sync .omo/ state back to the main repo, merge or hand off exactly as requested, and remove the worktree only after successful merge or explicit handoff.
7. Emit an ORCHESTRATION COMPLETE block with the plan path, verification commands, Global Review and Debugging Gate verdict, artifacts, and cleanup receipts.

- Never use --dry-run as completion evidence.
- Never emit ORCHESTRATION COMPLETE, a final response, PR creation, or a PR handoff before the Global Review and Debugging Gate passes with recorded evidence.
- Session ids in .omo/boulder.json always use the form codex:<session_id>.
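The five-lane gate and the redaction rule above can be sketched as below. gate_passes, redact, the lane names in the example, and the 8-character hash prefix are assumptions; the skill only requires that every lane return PASS and that raw secrets never reach the ledger or PR body.

```python
import hashlib


def gate_passes(lane_verdicts, expected_lanes=5):
    """True only when all five review lanes report exactly PASS."""
    if len(lane_verdicts) != expected_lanes:
        return False  # a missing lane is a gate failure, not approval
    return all(verdict == "PASS" for verdict in lane_verdicts.values())


def redact(secret):
    """Replace a sensitive value with its length and a short SHA-256 prefix."""
    digest = hashlib.sha256(secret.encode("utf-8")).hexdigest()[:8]
    return f"<redacted len={len(secret)} sha256={digest}>"


# Hypothetical lane names; a timeout or inconclusive lane is not PASS.
lanes = {"lane-1": "PASS", "lane-2": "PASS", "lane-3": "PASS",
         "lane-4": "PASS", "lane-5": "TIMEOUT"}
```

With the lanes shown, gate_passes(lanes) is False, so the debugging step runs before the gate is retried.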