docs/smart-routing-spec.md
Status: Implemented Author: Microwave Date: 2026-02-19
Automatic model selection based on request complexity. The router analyzes each user message and selects an appropriate model tier (flash/standard/pro/frontier), then maps that tier to a configured model.
User Message
│
▼
┌──────────────────┐
│ Pattern Overrides │ ← Fast-path for obvious cases (greetings, security audits)
└────────┬─────────┘
│ no match
▼
┌──────────────────┐
│ Complexity Scorer │ ← 13-dimension analysis
└────────┬─────────┘
│ score 0-100
▼
┌──────────────────┐
│ Tier Mapping │ ← 0-15: flash, 16-40: standard, 41-65: pro, 66+: frontier
└────────┬─────────┘
│ tier
▼
┌──────────────────┐
│ Model Selection │ ← Currently: cheap provider (Flash/Standard/Pro) vs primary (Frontier)
└────────┬─────────┘ Target: per-tier model mapping via config
│
▼
LLM Provider
Each dimension produces a 0-100 score. Weighted sum determines total.
| Dimension | Weight | Signals |
|---|---|---|
| Reasoning Words | 14% | "why", "explain", "compare", "trade-offs" |
| Token Estimate | 12% | Prompt length |
| Code Indicators | 10% | Backticks, syntax, "implement", "PR" |
| Multi-Step | 10% | "first", "then", "after", "steps" |
| Domain Specific | 10% | Technical terms (configurable) |
| Creativity | 7% | "write", "summarize", "tweet", "blog" |
| Question Complexity | 7% | Multiple questions, open-ended starters |
| Precision | 6% | Numbers, "exactly", "calculate" |
| Ambiguity | 5% | Vague references |
| Context Dependency | 5% | "previous", "you said" |
| Sentence Complexity | 5% | Commas, conjunctions, clause depth |
| Tool Likelihood | 5% | "read", "deploy", "install" |
| Safety Sensitivity | 4% | "password", "auth", "vulnerability" |
Multi-dimensional boost: +30% when 3+ dimensions score above threshold.
| Score | Tier | Typical Use Case |
|---|---|---|
| 0-15 | flash | Greetings, acknowledgments, quick lookups |
| 16-40 | standard | Writing, comparisons, defined tasks |
| 41-65 | pro | Multi-step analysis, code review |
| 66+ | frontier | Critical decisions, security audits |
Fast-path rules that bypass scoring for obvious cases:
# Force flash tier
- "^(hi|hello|hey|thanks|ok|sure|yes|no)$"
- "^what.*(time|date|day)"
# Force frontier tier
- "security.*(audit|review|scan)"
- "vulnerabilit(y|ies).*(review|scan|check|audit)"
# Force pro tier
- "deploy.*(mainnet|production)"
Note: The current implementation supports smart routing via
NEARAI_CHEAP_MODELandSMART_ROUTING_CASCADEenv vars, plusdomain_keywordsonSmartRoutingConfig. The fullllm.routingYAML schema below is the target design — not all knobs are wired yet.
Default (zero-config):
llm:
routing:
enabled: true # default
Power user overrides (target schema):
llm:
routing:
enabled: true
tiers:
flash: "claude-3-5-haiku-latest"
standard: "claude-sonnet-4-5-latest"
pro: "claude-sonnet-4-5-latest"
frontier: "claude-opus-4-5-latest"
thinking:
pro: "low"
frontier: "medium"
overrides:
- pattern: "my-custom-pattern"
tier: "pro"
domain_keywords: # Custom keywords for your domain
- "mycompany"
- "myproduct"
- "internal-tool"
If domain_keywords is not set, uses DEFAULT_DOMAIN_KEYWORDS which covers common web3/infra terms.
Disable routing (pin model):
llm:
routing:
enabled: false
model: "claude-opus-4-5"
Bring your own keys:
llm:
backend: anthropic
api_key: "sk-..."
routing:
enabled: true # still works with external providers
LlmProvider trait (like FailoverProvider)LlmConfig with routing sectionCritical: No hardcoded model names in the router logic itself.
-latest patterns where supported| Layer | User Type | Config |
|---|---|---|
| 1. Zero-config | Everyone | routing.enabled: true (default) |
| 2. Tier tuning | Power users | Custom routing.tiers mapping |
| 3. Pattern overrides | Power users | Custom routing.overrides |
| 4. Model pinning | Power users | routing.enabled: false + model: X |
| 5. Own API keys | Power users | backend: anthropic + api_key |
src/llm/smart_routing.rs)src/llm/smart_routing.rs)src/config.rs)src/llm/mod.rs)