docs/book/src/providers/fallback-and-routing.md
Two meta-providers let you stack concrete providers into reliability pipelines and task-aware routing.
reliableA reliable provider wraps a list of other providers. On timeout, network error, or authentication failure on the first, it transparently falls through to the next.
[providers.models.primary]
kind = "reliable"
fallback_providers = ["claude", "haiku", "local"]
# fallback_providers references other [providers.models.*] by name
[providers.models.claude]
kind = "anthropic"
model = "claude-sonnet-4-6"
api_key = "sk-ant-..."
[providers.models.haiku]
kind = "anthropic"
model = "claude-haiku-4-5-20251001"
api_key = "sk-ant-..."
[providers.models.local]
kind = "ollama"
base_url = "http://localhost:11434"
model = "qwen3.6:35b-a3b"
Then:
default_model = "primary"
Behaviour:
claude first; on transient error retries twice with exponential backoff (2 s, 4 s).haiku and restarts the retry cycle.haiku also exhausts, falls through to local.Best practice — order by reliability: put the most reliable provider first, not the best. A Claude fallback behind a flaky local Ollama won't save you during an API outage. [fastest/best_quality, fallback_quality, always_on_local] is a common pattern.
routerA router provider picks a backend per request based on hints supplied by the caller.
[providers.models.brain]
kind = "router"
default = "haiku" # used if no hint matches
routes = [
{ hint = "reasoning", provider = "deepseek-r1" },
{ hint = "cheap", provider = "haiku" },
{ hint = "vision", provider = "gemini" },
]
Channels, tools, and SOPs can emit hints via request metadata. For example, an SOP step might request hint:reasoning for a planning phase and hint:cheap for a summarisation phase. Everything else goes to default.
reliable and router compose. Routes can point at reliable providers, and reliable can wrap a router.
[providers.models.production]
kind = "reliable"
fallback_providers = ["brain", "local"] # if routing fails, fall to local Ollama
[providers.models.brain]
kind = "router"
default = "haiku"
routes = [
{ hint = "reasoning", provider = "sonnet" },
{ hint = "cheap", provider = "haiku" },
]
Three common patterns users pick:
fallback_providers = ["haiku", "gpt-4o-mini", "local"]
Cheapest hosted first, local as the final safety net.
fallback_providers = ["opus", "sonnet", "haiku"]
Best model first; fall to cheaper Claude models on failure.
routes = [
{ hint = "code", provider = "claude-code" },
{ hint = "reasoning", provider = "deepseek-r1" },
{ hint = "multimodal", provider = "gemini" },
]
default = "haiku"
Match the tool to the job.
Fallback events and routing decisions are logged via the infra crate:
INFO provider=claude attempt=1 → timeout
INFO provider=claude attempt=2 → timeout
WARN provider=claude exhausted, falling back → provider=haiku
INFO router hint=reasoning → provider=deepseek-r1
For production deployments, wire the log output to Loki / Grafana. See Operations → Logs & observability.