.forge/skills/test-reasoning/SKILL.md
Validates that ReasoningConfig fields are correctly serialized into provider-specific JSON
for OpenRouter, Anthropic, GitHub Copilot, and Codex.
Run all tests with the bundled script:
./scripts/test-reasoning.sh
The script builds forge in debug mode, runs each provider/model combination, captures the
outgoing HTTP request body via FORGE_DEBUG_REQUESTS, and asserts the correct JSON fields.
FORGE_DEBUG_REQUESTS="forge.request.json" \
FORGE_SESSION__PROVIDER_ID=<provider_id> \
FORGE_SESSION__MODEL_ID=<model_id> \
FORGE_REASONING__EFFORT=<effort> \
target/debug/forge -p "Hello!"
Then inspect .forge/forge.request.json for the expected fields.
| Provider | Model | Config fields | Expected JSON field |
|---|---|---|---|
open_router | openai/o4-mini | effort: none|minimal|low|medium|high|xhigh | reasoning.effort |
open_router | openai/o4-mini | max_tokens: 4000 | reasoning.max_tokens |
open_router | openai/o4-mini | effort: high + exclude: true | reasoning.effort + .exclude |
open_router | openai/o4-mini | enabled: true | reasoning.enabled |
open_router | anthropic/claude-opus-4-5 | max_tokens: 4000 | reasoning.max_tokens |
open_router | moonshotai/kimi-k2 | max_tokens: 4000 | reasoning.max_tokens |
open_router | moonshotai/kimi-k2 | effort: high | reasoning.effort |
open_router | minimax/minimax-m2 | max_tokens: 4000 | reasoning.max_tokens |
open_router | minimax/minimax-m2 | effort: high | reasoning.effort |
anthropic | claude-opus-4-6 | effort: low|medium|high|max | output_config.effort |
anthropic | claude-3-7-sonnet-20250219 | enabled: true + max_tokens: 8000 | thinking.type + budget_tokens |
github_copilot | o4-mini | effort: none|minimal|low|medium|high|xhigh | reasoning_effort (top-level) |
codex | gpt-5.1-codex | effort: none|minimal|low|medium|high|xhigh | reasoning.effort + .summary |
codex | gpt-5.1-codex | effort: medium + exclude: true | reasoning.summary = "concise" |
| all providers | one model each | effort: invalid | non-zero exit, no request written |
Tests for unconfigured providers are skipped automatically. Invalid-effort tests run regardless of credentials — the rejection happens at config parse time before any provider interaction.