Back to Oh My Openagent

Codex QA

.agents/skills/codex-qa/SKILL.md

4.11.05.3 KB
Original Source

Codex QA

QA the omo Codex Light edition (packages/omo-codex/, shipped as lazycodex). We exercise OUR plugin in a REAL Codex while touching nothing of the user's setup: an isolated CODEX_HOME + a local mock model means no real API call and the real ~/.codex is never read or written. Each helper script ships a --self-test that asserts its scenario against the live machine, so the scripts are both the QA tools and their own regression checks.

Verified against codex-cli 0.140.0 (node, jq, tmux, bun on macOS). Confirm with codex --version; check a flag with codex <cmd> --help.

Golden rules (read before running anything)

  • QA ONLY our plugin. Everything that spawns codex uses an isolated CODEX_HOME (created by cqa_mk_isolated_home) and a LOCAL mock model provider (cqa_start_mock). Never QA against the real ~/.codex, never hit a real model API. The bundled scripts enforce this; if you run codex by hand, export CODEX_HOME="$(mktemp -d)/codex"; mkdir -p "$CODEX_HOME" FIRST (a set CODEX_HOME must already exist or codex hard-errors).
  • Prove the real home stayed clean. Every script shasums ~/.codex/config.toml before and after and asserts it is unchanged. If you script by hand, do the same.
  • The interactive codex is a shell function that injects --profile quotio. Bash scripts bypass it and get the real binary; never rely on the interactive alias. See references/isolation.md.
  • The first-party way to prove a hook fired is the app-server notification stream (hook/started / hook/completed), not log scraping. See references/app-server.md.
  • The captured JSON / pane IS the evidence — write it under .omo/evidence/<YYYYMMDD>-<slug>/ (no evidence file == the QA did not happen).

Setup

bash
cd <this-skill-dir>                        # .agents/skills/codex-qa
bash scripts/lib/common.sh --self-check    # confirm deps + isolation harness

Docker is the default QA surface. Run this QA inside a disposable container that has the latest codex and a copy of your config, with the host ~/.codex untouched: script/agent/qa-docker.sh (see references/docker-qa.md). The local scripts below are the fallback for when Docker is unavailable or on Windows.

Router: pick your case

You need to…RunDeep dive
Prove a plugin hook fires in a LIVE Codex turn (first-party)scripts/app-server-drive.sh --pluginapp-server.md
Prove the app-server driver itself works (no plugin, fast)scripts/app-server-drive.sh --self-testapp-server.md
Install the LOCAL build into an isolated home + assert it landedscripts/install-verify.sh --self-testinstall-verify.md
Pin ONE component's hook logic deterministically (no codex)scripts/hook-unit-probe.sh --self-testcomponents-hooks.md
Smoke the real TUI under tmux (boots, renders, survives)scripts/tui-smoke.sh --self-testlogging-debug.md
Watch runtime logs while QAing(see reference; RUST_LOG / logs DB / /debug-config)logging-debug.md

Scripts index (each is its own regression test)

Script--self-test asserts
scripts/lib/common.sh --self-checkdeps present; isolated CODEX_HOME is created inside a sandbox and auto-removed on exit; mock model serves the Responses SSE; real ~/.codex unchanged
scripts/app-server-drive.sh--self-test: a bare turn completes and the mock assistant text comes back. --plugin: installs local omo, drives a turn, and asserts hook/completed for sessionStart,userPromptSubmit
scripts/install-verify.shlocal omo installs into the isolated home; config.toml enables omo@sisyphuslabs; component bins + agent TOMLs linked in the sandbox; real ~/.codex unchanged
scripts/hook-unit-probe.shthe ultrawork component injects <ultrawork-mode> on an ulw UserPromptSubmit (also a manual --component/--event mode)
scripts/tui-smoke.shthe real codex TUI boots in the isolated home, renders, and survives (no early exit); captures the pane

Match QA to your change scope

  • Component / hook logic (packages/omo-codex/plugin/components/*): hook-unit-probe.sh for the exact stdout, THEN app-server-drive.sh --plugin to prove the live wiring. See components-hooks.md.
  • Installer / config.toml (packages/omo-codex/src/install/*): install-verify.sh.
  • Anything that affects a live session (hooks, agents, MCP wiring): app-server-drive.sh --plugin, and tui-smoke.sh --plugin if the TUI path matters.

Capturing evidence

bash
ev=".omo/evidence/$(date +%Y%m%d)-codex-qa-<slug>"; mkdir -p "$ev"
bash scripts/app-server-drive.sh --plugin > "$ev/app-server-drive.json" 2>&1
bash scripts/install-verify.sh --self-test > "$ev/install-verify.txt" 2>&1

On /debugging

There is no /debugging command in Codex. To observe a run: the app-server notification stream (above), RUST_LOG=debug on the app-server's stderr, the logs SQLite under $CODEX_HOME, the TUI's /debug-config, and the codex debug … subcommands. See logging-debug.md.