docs/guide/agent-model-matching.md
For agents and users: Why each agent needs a specific model — and how to customize without breaking things.
Think of AI models as developers on a team. Each has a different brain, different personality, different strengths. A model isn't just "smarter" or "dumber." It thinks differently. Give the same instruction to Claude and GPT, and they'll interpret it in fundamentally different ways.
This isn't a bug. It's the foundation of the entire system.
Oh My OpenAgent assigns each agent a model that matches its working style — like building a team where each person is in the role that fits their personality.
Sisyphus is the developer who knows everyone, goes everywhere, and gets things done through communication and coordination. Talks to other agents, understands context across the whole codebase, delegates work intelligently, and codes well too. But deep, purely technical problems? He'll struggle a bit.
This is why Sisyphus uses Claude / Kimi / GLM. These models excel at:
Using Sisyphus with older GPT models would be like taking your best project manager — the one who coordinates everyone, runs standups, and keeps the whole team aligned — and sticking them in a room alone to debug a race condition. Wrong fit. GPT-5.4 and GPT-5.5 now have dedicated Sisyphus prompt paths, but GPT is still not the default recommendation for the orchestrator.
Hephaestus is the developer who stays in their room coding all day. Doesn't talk much. Might seem socially awkward. But give them a hard technical problem and they'll emerge three hours later with a solution nobody else could have found.
This is why Hephaestus uses GPT-5.5. GPT-5.5 is built for exactly this:
Using Hephaestus with GLM or Kimi would be like assigning your most communicative, sociable developer to sit alone and do nothing but deep technical work. They'd get it done eventually, but they wouldn't shine — you'd be wasting exactly the skills that make them valuable.
Every agent's prompt is tuned to match its model's personality. When you change the model, you change the brain — and the same instructions get understood completely differently. Model matching isn't about "better" or "worse." It's about fit.
This matters for understanding why some agents support both model families while others don't.
Claude responds to mechanics-driven prompts — detailed checklists, templates, step-by-step procedures. More rules = more compliance. You can write a 1,100-line prompt with nested workflows and Claude will follow every step.
GPT (especially 5.2+) responds to principle-driven prompts — concise principles, XML structure, explicit decision criteria. More rules = more contradiction surface = more drift. GPT works best when you state the goal and let it figure out the mechanics.
Real example: Prometheus's Claude prompt is ~1,100 lines across 7 files. The GPT prompt achieves the same behavior with 3 principles in ~121 lines. Same outcome, completely different approach.
Agents that support both families (Prometheus, Atlas) auto-detect your model at runtime and switch prompts via isGptModel(). You don't have to think about it.
Before configuring anything, see what your current system can run.
opencode models
This prints every provider/model combination you can address right now. Providers are derived from your connected auth + the models.dev catalogue.
Opencode sorts the output so opencode* providers appear first — that's intentional, not cosmetic.
opencode auth list
Shows which providers you've already logged into.
You need to log in to that provider:
opencode auth login
The interactive picker prioritizes providers in this order:
| Priority | Provider | Opencode's own hint |
|---|---|---|
| 0 | opencode | (Recommended) |
| 1 | opencode-go | Low cost subscription for everyone |
| 2 | openai | ChatGPT Plus/Pro or API key |
| 3 | github-copilot | — |
| 4 | anthropic | API key |
| 5 | google | — |
You can also skip the picker: opencode auth login --provider opencode-go.
bunx oh-my-opencode doctor
This shows the effective model resolution for every agent and category based on your current auth state. If an agent says "system-default" instead of a real fallback, that's a signal you're missing providers from its chain.
You don't need every provider. You need the right two.
~$30/month total. Beats direct Anthropic + OpenAI + Google subscriptions (~$60+/month) on both cost and coverage.
| Subscription | Cost | What You Get | Covers |
|---|---|---|---|
| OpenCode Go | $10/mo | kimi-k2.5, kimi-k2.6, glm-5, glm-5.1, minimax-m2.5, minimax-m2.7, mimo-v2-pro, qwen3.5-plus, qwen3.6-plus | Claude-family alternatives (Kimi, GLM), Gemini-family alternatives (Qwen), utility/retrieval (MiniMax) |
| OpenAI Plus/Pro | $20+/mo | gpt-5.4, gpt-5.4-pro, gpt-5.5, gpt-5.3-codex | GPT-native agents (Hephaestus, Oracle, Momus), dual-prompt agents' GPT path |
Add --claude=max20 (or yes) on install. Claude Opus 4.7 becomes the default for Sisyphus/Prometheus/Atlas and you still get the OpenCode Go fallbacks for free. Best-in-class orchestration + budget safety net.
OpenCode Go alone gets Sisyphus/Atlas/Oracle/Librarian/Explore working. Hephaestus won't activate without GPT access, so you lose autonomous deep work. Consider adding ChatGPT Plus as soon as you can.
When the "native" model isn't available, oh-my-openagent walks each agent's fallback chain until something connects. The chains are hardcoded in src/shared/model-requirements.ts. There is no single global priority list. Every agent and category has its own chain.
There are two separate systems:
chat.params using hardcoded AGENT_MODEL_REQUIREMENTS and CATEGORY_MODEL_REQUIREMENTSsession.error, configurable per category/agent in runtime-fallback hooksUsed by: Sisyphus, Atlas, Sisyphus-Junior, Metis (Claude path), Prometheus (Claude path), unspecified-low, unspecified-high.
| Priority | Model | Provider | Why |
|---|---|---|---|
| 1 | claude-opus-4-7 (max) | anthropic, github-copilot, opencode, vercel | Best overall compliance with ~1,100-line Sisyphus prompt. |
| 2 | claude-sonnet-4-6 | same | Faster, cheaper, still Claude. |
| 3 | kimi-k2.5 or kimi-k2.6 — RECOMMENDED ALTERNATIVE | opencode-go, kimi-for-coding, moonshotai, opencode, vercel | Instruction-following mirrors Claude closely. Default orchestrator when Anthropic isn't connected. |
| 4 | glm-5 or glm-5.1 — ACCEPTABLE ALTERNATIVE | opencode-go, zai-coding-plan, opencode, vercel | Claude-like, slightly looser on long nested workflows. Solid fallback. |
| 5 | big-pickle (GLM 4.6) | opencode | Free-tier safety net. |
Kimi ≻ GLM. Kimi K2.5/2.6 hold up under Sisyphus's nested todo+delegation prompts better than GLM. Use Kimi whenever both are available.
Used by: Hephaestus, Oracle, Momus, deep, ultrabrain, quick, Prometheus (GPT path), Atlas (GPT path).
| Priority | Model | Provider | Why |
|---|---|---|---|
| 1 | gpt-5.5 / gpt-5.4 (pro / xhigh / high / medium) | openai, github-copilot, opencode, vercel | Native OpenAI is the gold standard for principle-driven prompts. Hephaestus requires this family. |
| 2 | gpt-5.3-codex | same | Still the deep-coding powerhouse. Kept as an explicit override option. |
| 3 | DeepSeek — LIMITED ALTERNATIVE (deepseek-v3.2, deepseek-chat-v3.1) | openrouter/deepseek | Closest OSS equivalent for autonomous coding behavior. Not wired into default chains — add via fallback_models. |
| 4 | MiniMax — STRONGLY DISCOURAGED (minimax-m2.7, minimax-m2.5) | opencode-go, opencode, openrouter/minimax | Used only in utility fallback chains (Explore, Librarian, quick). Consistency and long-context management issues make it a poor substitute for Hephaestus/Oracle. Do NOT override deep agents to MiniMax. |
DeepSeek ≻≻ MiniMax. DeepSeek retains GPT's autonomous exploration character. MiniMax loses coherence on multi-step deep work. MiniMax is fine for grep-style utility agents, nothing more.
Used by: visual-engineering, artistry, Oracle (visual fallback), Multimodal-Looker.
| Priority | Model | Provider | Why |
|---|---|---|---|
| 1 | gemini-3.1-pro (high) | google, github-copilot, opencode, vercel | Best for UI/UX, CSS, design tokens, layout decisions. artistry category requires this family. |
| 2 | gemini-3-flash | same | Fast variant, writing/doc tasks. |
| 3 | Qwen — ALTERNATIVE (qwen3.6-plus, qwen3.5-plus) | opencode-go, openrouter/qwen | Closest vision-capable substitute when Google isn't connected. Uses different reasoning style but handles visual tasks competently. |
No GLM/Kimi here. They're not Gemini substitutes for visual work. Use Qwen.
| If you lose... | Swap to (in order) | Avoid |
|---|---|---|
| Claude Opus/Sonnet | Kimi K2.5/K2.6 → GLM 5 → Big Pickle | Older GPT models |
| GPT-5.4/5.5 | GPT-5.3 Codex → DeepSeek v3.2 | MiniMax (except for utility work) |
| Gemini 3.1 Pro | Qwen 3.6-plus / 3.5-plus | Claude/Kimi (wrong reasoning style for visual) |
| Grok Code Fast 1 (Explore) | GPT-5.4 Mini Fast → MiniMax M2.7 Highspeed → Claude Haiku | Opus (massive cost waste) |
Exact runtime chains from src/shared/model-requirements.ts.
These agents have Claude-optimized prompts — long, detailed, mechanics-driven. They need models that reliably follow complex, multi-layered instructions.
| Agent | Role | Fallback Chain |
|---|---|---|
| Sisyphus | Main orchestrator | anthropic|github-copilot|opencode|vercel/claude-opus-4-7 (max) → opencode-go|vercel/kimi-k2.6 → kimi-for-coding/k2p5 → opencode|moonshotai|moonshotai-cn|firmware|ollama-cloud|aihubmix|vercel/kimi-k2.5 → openai|github-copilot|opencode|vercel/gpt-5.5 (medium) → zai-coding-plan|opencode|vercel/glm-5 → opencode/big-pickle |
| Metis | Plan gap analyzer | anthropic|github-copilot|opencode|vercel/claude-sonnet-4-6 → anthropic|github-copilot|opencode|vercel/claude-opus-4-7 (max) → openai|github-copilot|opencode|vercel/gpt-5.5 (high) → opencode-go|vercel/glm-5.1 → kimi-for-coding/k2p5 |
These agents ship separate prompts for Claude and GPT families. They auto-detect your model and switch at runtime.
| Agent | Role | Fallback Chain |
|---|---|---|
| Prometheus | Strategic planner | anthropic|github-copilot|opencode|vercel/claude-opus-4-7 (max) → openai|github-copilot|opencode|vercel/gpt-5.5 (high) → opencode-go|vercel/glm-5.1 → google|github-copilot|opencode|vercel/gemini-3.1-pro |
| Atlas | Todo orchestrator | anthropic|github-copilot|opencode|vercel/claude-sonnet-4-6 → opencode-go|vercel/kimi-k2.6 → openai|github-copilot|opencode|vercel/gpt-5.5 (medium) → opencode-go|vercel/minimax-m2.7 |
These agents are built for GPT's principle-driven style. Their prompts assume autonomous, goal-oriented execution. Don't override to Claude.
| Agent | Role | Fallback Chain |
|---|---|---|
| Hephaestus | Autonomous deep worker | openai|github-copilot|venice|opencode|vercel/gpt-5.5 (medium) — single-entry chain, requires one of those providers. The craftsman. |
| Oracle | Architecture consultant | openai|github-copilot|opencode|vercel/gpt-5.5 (high) → google|github-copilot|opencode|vercel/gemini-3.1-pro (high) → anthropic|github-copilot|opencode|vercel/claude-opus-4-7 (max) → opencode-go|vercel/glm-5.1 |
| Momus | Ruthless reviewer | openai|github-copilot|opencode|vercel/gpt-5.5 (xhigh) → anthropic|github-copilot|opencode|vercel/claude-opus-4-7 (max) → google|github-copilot|opencode|vercel/gemini-3.1-pro (high) → opencode-go|vercel/glm-5.1 |
These agents do grep, search, and retrieval. They intentionally use the fastest, cheapest models available. Don't "upgrade" them to Opus — that's hiring a senior engineer to file paperwork.
| Agent | Role | Fallback Chain |
|---|---|---|
| Explore | Fast codebase grep | openai/gpt-5.4-mini-fast → opencode-go/qwen3.5-plus → vercel/minimax-m2.7-highspeed → opencode-go|vercel/minimax-m2.7 → anthropic|opencode|vercel/claude-haiku-4-5 → openai|opencode|vercel/gpt-5.4-nano |
| Librarian | Docs/code search | same as Explore |
| Multimodal Looker | Vision/screenshots | openai|opencode|vercel/gpt-5.5 (medium) → opencode-go|vercel/kimi-k2.6 → zai-coding-plan|vercel/glm-4.6v → openai|github-copilot|opencode|vercel/gpt-5-nano |
| Sisyphus-Junior | Category executor | anthropic|github-copilot|opencode|vercel/claude-sonnet-4-6 → opencode-go|vercel/kimi-k2.6 → openai|github-copilot|opencode|vercel/gpt-5.5 (medium) → opencode-go|vercel/minimax-m2.7 → opencode/big-pickle |
Communicative, instruction-following, structured output. Best for agents that need to follow complex multi-step prompts.
| Model | Strengths |
|---|---|
| Claude Opus 4.7 | Best overall. Highest compliance with complex prompts. Default for Sisyphus. |
| Claude Sonnet 4.6 | Faster, cheaper. Good balance for everyday tasks. |
| Claude Haiku 4.5 | Fast and cheap. Good for quick tasks and utility work. |
| Kimi K2.5 | Behaves very similarly to Claude. Great all-rounder at lower cost. |
| GLM 5 | Claude-like behavior. Solid for orchestration tasks. |
Principle-driven, explicit reasoning, deep technical capability. Best for agents that work autonomously on complex problems.
| Model | Strengths |
|---|---|
| GPT-5.3 Codex | Deep coding powerhouse. Autonomous exploration. Still available for deep category and explicit overrides. |
| GPT-5.5 | High intelligence, strategic reasoning. Default for Oracle, Momus, and a key fallback for Prometheus / Atlas. Uses xhigh variant for Momus. |
| GPT-5.4 Mini | Fast + strong reasoning. Good for lightweight autonomous tasks. Default for quick category. |
| GPT-5-Nano | Ultra-cheap, fast. Good for simple utility tasks. |
| Model | Strengths |
|---|---|
| Gemini 3.1 Pro | Excels at visual/frontend tasks. Different reasoning style. Default for visual-engineering and artistry. |
| Gemini 3 Flash | Fast. Good for doc search and light tasks. |
| GPT-5.4 Mini Fast | Default for Explore and Librarian agents. Blazing-fast reasoning-capable mini model. |
| MiniMax M2.7 | Fast and smart. Used in OpenCode Go and OpenCode Zen utility fallback chains. |
| MiniMax M2.7 Highspeed | High-speed OpenCode catalog entry used in utility fallback chains that prefer the fastest available MiniMax path. |
A premium subscription tier ($10/month) that provides reliable access to Chinese frontier models through OpenCode's infrastructure.
Available Models:
| Model | Use Case |
|---|---|
| opencode-go/kimi-k2.6 | Vision-capable, Claude-like reasoning. Used by Sisyphus, Atlas, Sisyphus-Junior, Multimodal Looker. |
| opencode-go/glm-5.1 | Text-only orchestration model. Used by Oracle, Prometheus, Metis, Momus. |
| opencode-go/minimax-m2.7 | Ultra-cheap, fast responses. Used by Atlas, Sisyphus-Junior, Explore and Librarian fallbacks for utility work. |
| opencode-go/qwen3.5-plus | Qwen coding model used as the first OpenCode Go utility fallback for Explore and Librarian when GPT-5.4 Mini Fast is unavailable. |
When It Gets Used:
OpenCode Go models appear throughout the fallback chains as intermediate options. Depending on the agent, they can sit before GPT, after GPT, or act as the last structured-model fallback before cheaper utility paths.
Go-Only Scenarios:
Some model identifiers in fallback chains are provider-specific aliases. For example, k2p5 resolves through kimi-for-coding, while glm-5 can resolve through zai-coding-plan, opencode, or vercel depending on availability.
You may see model names like kimi-k2.5-free, minimax-m2.7, minimax-m2.7-highspeed, or big-pickle (GLM 4.6) in the source code or logs. These are provider-specific or speed-optimized entries in fallback chains.
You don't need to configure them. The system includes them so it degrades gracefully when you don't have every paid subscription. If you have the paid version, the paid version is always preferred.
When agents delegate work, they don't pick a model name — they pick a category. The category maps to the right model automatically.
| Category | Used For | Default Model | Fallback Chain |
|---|---|---|---|
visual-engineering | Frontend, UI, CSS, design | google/gemini-3.1-pro (high) | Gemini → zai-coding-plan/glm-5 → claude-opus-4-7 (max) → opencode-go/glm-5.1 → kimi-for-coding/k2p5 |
artistry | Creative, novel approaches | google/gemini-3.1-pro (high) | Gemini → claude-opus-4-7 (max) → gpt-5.5 |
ultrabrain | Maximum reasoning needed | openai/gpt-5.5 (xhigh) | GPT-5.5 xhigh → gemini-3.1-pro (high) → claude-opus-4-7 (max) → opencode-go/glm-5.1 |
deep | Deep coding, complex logic | openai/gpt-5.5 (medium) | GPT-5.5 → claude-opus-4-7 (max) → gemini-3.1-pro (high) |
quick | Simple, fast tasks | openai/gpt-5.4-mini | GPT-5.4-mini → claude-haiku-4-5 → gemini-3-flash → opencode-go/minimax-m2.7 → opencode/gpt-5-nano |
unspecified-high | General complex work | anthropic/claude-opus-4-7 (max) | Opus → gpt-5.5 (high) → zai-coding-plan/glm-5 → kimi-for-coding/k2p5 → opencode-go/glm-5.1 → opencode/kimi-k2.5 → moonshotai/kimi-k2.5 |
unspecified-low | General standard work | anthropic/claude-sonnet-4-6 | Sonnet → gpt-5.3-codex (medium) → opencode-go/kimi-k2.6 → google/gemini-3-flash → opencode-go/minimax-m2.7 |
writing | Text, docs, prose | kimi-for-coding/k2p5 | gemini-3-flash → opencode-go/kimi-k2.6 → claude-sonnet-4-6 → opencode-go/minimax-m2.7 |
See the Orchestration System Guide for how agents dispatch tasks to categories.
src/shared/model-requirements.ts includes vercel on nearly every gateway-compatible fallback entry across both agent and category chains. Treat it as a universal extra provider path for the listed model IDs, not as a different model family.
{
"$schema": "https://raw.githubusercontent.com/code-yeongyu/oh-my-openagent/dev/assets/oh-my-opencode.schema.json",
"agents": {
// Sisyphus: Kimi K2.6 is the top alternative to Claude for orchestration
"sisyphus": {
"model": "opencode-go/kimi-k2.6",
"ultrawork": { "model": "opencode-go/kimi-k2.6" },
},
// Hephaestus: needs GPT. ChatGPT Plus gets you here.
"hephaestus": { "model": "openai/gpt-5.5", "variant": "medium" },
// Architecture consultation: GPT or Claude Opus
"oracle": { "model": "openai/gpt-5.5", "variant": "high" },
// Prometheus inherits Sisyphus behavior
"prometheus": { "model": "opencode-go/kimi-k2.6" },
// Atlas also communicative — Kimi works great
"atlas": { "model": "opencode-go/kimi-k2.6" },
// Utility agents stay cheap
"explore": { "model": "opencode-go/qwen3.5-plus" },
"librarian": { "model": "opencode-go/qwen3.5-plus" },
},
"categories": {
"visual-engineering": { "model": "opencode-go/qwen3.6-plus" }, // Qwen as Gemini alt
"deep": { "model": "openai/gpt-5.5", "variant": "medium" },
"ultrabrain": { "model": "openai/gpt-5.5", "variant": "xhigh" },
"quick": { "model": "openai/gpt-5.4-mini" },
"unspecified-low": { "model": "opencode-go/kimi-k2.6" },
"unspecified-high": { "model": "opencode-go/kimi-k2.6" },
"writing": { "model": "opencode-go/kimi-k2.6" },
},
"background_task": {
"providerConcurrency": {
"openai": 3,
"opencode-go": 10,
},
},
}
Highest quality, highest cost. No surprises.
{
"agents": {
"sisyphus": {
"model": "anthropic/claude-opus-4-7",
"variant": "max",
},
"hephaestus": { "model": "openai/gpt-5.5", "variant": "medium" },
"oracle": { "model": "openai/gpt-5.5", "variant": "high" },
},
"categories": {
"visual-engineering": { "model": "google/gemini-3.1-pro", "variant": "high" },
"deep": { "model": "openai/gpt-5.5", "variant": "medium" },
"unspecified-high": { "model": "anthropic/claude-opus-4-7", "variant": "max" },
},
}
Cheapest full-stack path. Hephaestus won't activate — accept that trade-off.
{
"agents": {
"sisyphus": { "model": "opencode-go/kimi-k2.6" },
"atlas": { "model": "opencode-go/kimi-k2.6" },
// Omit hephaestus entirely; it needs GPT.
"oracle": { "model": "opencode-go/glm-5.1" }, // Degraded but functional
"explore": { "model": "opencode-go/qwen3.5-plus" },
"librarian": { "model": "opencode-go/qwen3.5-plus" },
},
"categories": {
"visual-engineering": { "model": "opencode-go/qwen3.6-plus" },
"deep": { "model": "opencode-go/kimi-k2.6" }, // Not ideal — Kimi isn't GPT, but best available
"unspecified-high": { "model": "opencode-go/kimi-k2.6" },
"unspecified-low": { "model": "opencode-go/kimi-k2.6" },
"quick": { "model": "opencode-go/minimax-m2.7" },
"writing": { "model": "opencode-go/kimi-k2.6" },
},
}
If you have OpenRouter and want DeepSeek in the chain when GPT is unavailable:
{
"agents": {
"oracle": {
"model": "openai/gpt-5.5",
"variant": "high",
"fallback_models": [
"anthropic/claude-opus-4-7",
{ "model": "openrouter/deepseek/deepseek-v3.2", "temperature": 0.7 },
"opencode-go/glm-5.1",
],
},
},
}
fallback_models accepts a mix of plain model strings and per-fallback objects with variant, reasoningEffort, temperature, top_p, maxTokens, thinking.
Safe — same personality type:
Dangerous — personality mismatch:
visual-engineering → Kimi/GLM: Wrong reasoning style. Use Qwen if Gemini is unavailable, not Claude-likes.Each agent has a fallback chain. The system tries models in priority order until it finds one available through your connected providers. You don't need to configure providers per model. Just authenticate (opencode auth login) and the system figures out which models are available and where.
Resolution pipeline (from src/shared/model-resolution-pipeline.ts):
1. Override → User's explicit config or UI-selected model (primary agents only)
2. Category default → From category config (when agent has category set)
3. User fallback_models → Configured strings/objects tried before hardcoded chain
4. Provider fallback → AGENT_MODEL_REQUIREMENTS / CATEGORY_MODEL_REQUIREMENTS
5. System default → Ultimate safety net
Core-agent tab cycling is deterministic via injected runtime order field. The fixed priority order is Sisyphus (order: 0), Hephaestus (order: 1), Prometheus (order: 2), and Atlas (order: 3), then the remaining agents follow.
Your explicit configuration always wins. If you set a specific model for an agent, that choice takes precedence even when resolution data is cold.
Variant and reasoningEffort overrides are normalized to model-supported values, so cross-provider overrides degrade gracefully instead of failing hard.
Model capabilities are models.dev-backed, with a refreshable cache and capability diagnostics. Use bunx oh-my-opencode refresh-model-capabilities to update the cache, or configure model_capabilities.auto_refresh_on_start to refresh at startup.
To see which models your agents will actually use, run bunx oh-my-opencode doctor. This shows effective model resolution based on your current authentication and config.
Agent Request → User Override (if configured) → Fallback Chain → System Default
You can load agent system prompts from external files using file:// URLs in the prompt field, or append additional content with prompt_append. The prompt_append field also works on categories.
{
"agents": {
"sisyphus": {
"prompt": "file:///path/to/custom-prompt.md",
},
"oracle": {
"prompt_append": "file:///path/to/additional-context.md",
},
},
"categories": {
"deep": {
"prompt_append": "file:///path/to/deep-category-append.md",
},
},
}
The file content is loaded at runtime and injected into the agent's system prompt. Supports ~ expansion for home directory and relative file:// paths.
src/shared/model-requirements.ts — Source of truth for fallback chains