commands/santa-loop.md
Adversarial dual-review convergence loop using the santa-method skill. Two independent reviewers — different models, no shared context — must both return NICE before code ships.
Run two independent reviewers (Claude Opus + an external model) against the current task output. Both must return NICE before the code is pushed. If either returns NAUGHTY, fix all flagged issues, commit, and re-run fresh reviewers — up to 3 rounds.
/santa-loop [file-or-glob | description]
Determine the scope from $ARGUMENTS or fall back to uncommitted changes:
```shell
git diff --name-only HEAD
```
Read all changed files to build the full review context. If $ARGUMENTS specifies a path, file, or description, use that as the scope instead.
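The fallback order above can be sketched as a small helper; `resolve_scope` is a hypothetical name used here for illustration, not part of the skill:

```shell
# resolve_scope: hypothetical helper showing the fallback order —
# an explicit argument wins, otherwise review uncommitted changes.
resolve_scope() {
  if [ -n "$1" ]; then
    echo "$1"                               # user-supplied file, glob, or description
  else
    git diff --name-only HEAD 2>/dev/null   # fall back to uncommitted changes
  fi
}

resolve_scope "src/auth.ts"
```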
Construct a rubric appropriate to the file types under review. Every criterion must have an objective PASS/FAIL condition. Include at minimum:
| Criterion | Pass Condition |
|---|---|
| Correctness | Logic is sound, no bugs, handles edge cases |
| Security | No secrets, injection, XSS, or OWASP Top 10 issues |
| Error handling | Errors handled explicitly, no silent swallowing |
| Completeness | All requirements addressed, no missing cases |
| Internal consistency | No contradictions between files or sections |
| No regressions | Changes don't break existing behavior |
Add domain-specific criteria based on file types (e.g., type safety for TS, memory safety for Rust, migration safety for SQL).
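One way to sketch the domain-specific selection; the extension mapping and the pass-condition wording here are illustrative, not prescribed by the skill:

```shell
# extra_criteria: hypothetical mapping from file extension to an added rubric row
extra_criteria() {
  case "$1" in
    *.ts|*.tsx) echo "Type safety | No any-escapes; strict null checks pass" ;;
    *.rs)       echo "Memory safety | No unjustified unsafe blocks" ;;
    *.sql)      echo "Migration safety | Reversible; no unguarded destructive DDL" ;;
    *)          : ;;  # no extra criterion for other file types
  esac
}

extra_criteria "schema.sql"
```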
Launch two reviewers in parallel using the Agent tool (both in a single message for concurrent execution). Both must complete before proceeding to the verdict gate.
Each reviewer evaluates every rubric criterion as PASS or FAIL, then returns structured JSON:
```json
{
  "verdict": "PASS" | "FAIL",
  "checks": [
    {"criterion": "...", "result": "PASS|FAIL", "detail": "..."}
  ],
  "critical_issues": ["..."],
  "suggestions": ["..."]
}
```
The verdict gate (Step 4) maps these to NICE/NAUGHTY: both PASS → NICE, either FAIL → NAUGHTY.
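The gate itself reduces to a two-input AND; a minimal sketch (the function name is hypothetical):

```shell
# verdict_gate: NICE only if both reviewers PASS, NAUGHTY otherwise
verdict_gate() {
  if [ "$1" = "PASS" ] && [ "$2" = "PASS" ]; then
    echo "NICE"
  else
    echo "NAUGHTY"
  fi
}

verdict_gate PASS PASS   # → NICE
verdict_gate PASS FAIL   # → NAUGHTY
```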
Launch an Agent (subagent_type: code-reviewer, model: opus). The prompt must include the full rubric and all files under review.
First, detect which CLIs are available:
```shell
command -v codex >/dev/null 2>&1 && echo "codex" || true
command -v gemini >/dev/null 2>&1 && echo "gemini" || true
```
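The two checks can also be folded into one first-match helper; `first_available` is an illustrative name, not part of the skill:

```shell
# first_available: echo the first candidate found on $PATH, else fail
first_available() {
  for c in "$@"; do
    if command -v "$c" >/dev/null 2>&1; then
      echo "$c"
      return 0
    fi
  done
  return 1
}

# Fall back to a second Claude agent when neither CLI is present.
REVIEWER_CLI=$(first_available codex gemini) || REVIEWER_CLI="claude-agent"
```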
Build the reviewer prompt (identical rubric + instructions as Reviewer A) and write it to a unique temp file:
```shell
PROMPT_FILE=$(mktemp /tmp/santa-reviewer-b-XXXXXX.txt)
cat > "$PROMPT_FILE" << 'EOF'
... full rubric + file contents + reviewer instructions ...
EOF
```
Use the first available CLI:
**Codex CLI (if installed):**

```shell
codex exec --sandbox read-only -m gpt-5.4 -C "$(pwd)" - < "$PROMPT_FILE"
rm -f "$PROMPT_FILE"
```
**Gemini CLI (if installed and codex is not):**

```shell
gemini -p "$(cat "$PROMPT_FILE")" -m gemini-2.5-pro
rm -f "$PROMPT_FILE"
```
**Claude Agent fallback (only if neither codex nor gemini is installed):**

Launch a second Claude Agent (subagent_type: code-reviewer, model: opus). Log a warning that both reviewers share the same model family: true model diversity was not achieved, but context isolation is still enforced.
In all cases, the reviewer must return the same structured JSON verdict as Reviewer A.
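Extracting the verdict from that JSON can be sketched as below. The `sed` extraction is a crude illustration only and the sample JSON is invented; a real implementation should use a proper JSON parser such as `jq -r '.verdict'`:

```shell
# Hypothetical sample reviewer output (for illustration)
REVIEW_JSON='{"verdict":"FAIL","checks":[],"critical_issues":["hardcoded secret"],"suggestions":[]}'

# Crude field pull with sed — fragile on real JSON; prefer jq if available.
VERDICT=$(printf '%s' "$REVIEW_JSON" | sed -n 's/.*"verdict":"\([A-Z]*\)".*/\1/p')
echo "$VERDICT"   # → FAIL
```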
If either reviewer returns NAUGHTY, fix every flagged issue and commit with a message of the form:

```
fix: address santa-loop review findings (round N)
```
Maximum 3 iterations. If still NAUGHTY after 3 rounds, stop and present remaining issues:
```
SANTA LOOP ESCALATION (exceeded 3 iterations)

Remaining issues after 3 rounds:
- [list all unresolved critical issues from both reviewers]

Manual review required before proceeding.
Do NOT push.
```
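The round loop as a whole can be sketched as follows; `run_reviewers` and `apply_fixes` are hypothetical stand-ins for the review and fix steps, stubbed here so the sketch is self-contained (the stub passes on round 2):

```shell
MAX_ROUNDS=3
round=1
verdict="NAUGHTY"

# Stubs for illustration only — real runs launch the two reviewers and commit fixes.
run_reviewers() { if [ "$round" -ge 2 ]; then echo "NICE"; else echo "NAUGHTY"; fi; }
apply_fixes()   { :; }   # stand-in for "fix all flagged issues, commit"

while [ "$round" -le "$MAX_ROUNDS" ]; do
  verdict=$(run_reviewers)          # fresh reviewers each round
  [ "$verdict" = "NICE" ] && break
  apply_fixes
  round=$((round + 1))
done

if [ "$verdict" = "NICE" ]; then
  echo "converged in round $round"
else
  echo "SANTA LOOP ESCALATION (exceeded $MAX_ROUNDS iterations)"
fi
```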
When both reviewers return PASS:
```shell
git push -u origin HEAD
```
Print the output report (see Output section below).
```
SANTA VERDICT: [NICE / NAUGHTY (escalated)]

Reviewer A (Claude Opus): [PASS/FAIL]
Reviewer B ([model used]): [PASS/FAIL]

Agreement:
  Both flagged: [issues caught by both]
  Reviewer A only: [issues only A caught]
  Reviewer B only: [issues only B caught]

Iterations: [N]/3
Result: [PUSHED / ESCALATED TO USER]
```
Run the external reviewer with `--sandbox read-only` (Codex) to prevent repo mutation during review.