docs/public/usage/openrouter-provider.mdx
Claude-mem supports OpenRouter as an alternative provider for observation extraction. OpenRouter provides a unified API to access 100+ models from different providers including Google, Meta, Mistral, DeepSeek, and many others—often with generous free tiers.
<Tip> **Free Models Available**: OpenRouter offers several completely free models, making it an excellent choice for reducing observation extraction costs to zero while maintaining quality. </Tip>

OpenRouter actively supports democratizing AI access by offering free models. These are production-ready models suitable for observation extraction.
| Model | ID | Parameters | Context | Best For |
|---|---|---|---|---|
| Xiaomi MiMo-V2-Flash | xiaomi/mimo-v2-flash:free | 309B (15B active, MoE) | 256K | Reasoning, coding, agents |
| Gemini 2.0 Flash | google/gemini-2.0-flash-exp:free | — | 1M | General purpose |
| Gemini 2.5 Flash | google/gemini-2.5-flash-preview:free | — | 1M | Latest capabilities |
| DeepSeek R1 | deepseek/deepseek-r1:free | 671B | 64K | Reasoning, analysis |
| Llama 3.1 70B | meta-llama/llama-3.1-70b-instruct:free | 70B | 128K | General purpose |
| Llama 3.1 8B | meta-llama/llama-3.1-8b-instruct:free | 8B | 128K | Fast, lightweight |
| Mistral Nemo | mistralai/mistral-nemo:free | 12B | 128K | Efficient performance |
All free models support:
| Setting | Values | Default | Description |
|---|---|---|---|
| CLAUDE_MEM_PROVIDER | claude, gemini, openrouter | claude | AI provider for observation extraction |
| CLAUDE_MEM_OPENROUTER_API_KEY | string | — | Your OpenRouter API key |
| CLAUDE_MEM_OPENROUTER_MODEL | string | xiaomi/mimo-v2-flash:free | Model identifier (see list above) |
| CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES | number | 20 | Max messages in conversation history |
| CLAUDE_MEM_OPENROUTER_MAX_TOKENS | number | 100000 | Token budget safety limit |
| CLAUDE_MEM_OPENROUTER_SITE_URL | string | — | Optional: URL for analytics attribution |
| CLAUDE_MEM_OPENROUTER_APP_NAME | string | claude-mem | Optional: app name for analytics |
Settings are applied immediately—no restart required.
Edit ~/.claude-mem/settings.json:
```json
{
  "CLAUDE_MEM_PROVIDER": "openrouter",
  "CLAUDE_MEM_OPENROUTER_API_KEY": "sk-or-v1-your-key-here",
  "CLAUDE_MEM_OPENROUTER_MODEL": "xiaomi/mimo-v2-flash:free"
}
```
Alternatively, set the API key via environment variable:
```bash
export OPENROUTER_API_KEY="sk-or-v1-your-key-here"
```
The settings file takes precedence over the environment variable.
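The precedence rule can be sketched as a simple nullish-coalescing lookup. The `Settings` shape and `resolveApiKey` helper here are illustrative names for this doc, not claude-mem's actual internals:

```typescript
// Resolve the OpenRouter API key: the settings file wins over the
// environment variable. Hypothetical helper for illustration only.
interface Settings {
  CLAUDE_MEM_OPENROUTER_API_KEY?: string;
}

function resolveApiKey(
  settings: Settings,
  env: Record<string, string | undefined>,
): string | undefined {
  // Settings-file value takes precedence; env var is the fallback.
  return settings.CLAUDE_MEM_OPENROUTER_API_KEY ?? env.OPENROUTER_API_KEY;
}
```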
Recommended: xiaomi/mimo-v2-flash:free
Alternatives:

- google/gemini-2.0-flash-exp:free - 1M context, Google's flagship
- deepseek/deepseek-r1:free - Excellent reasoning capabilities
- meta-llama/llama-3.1-70b-instruct:free - Strong general purpose

| Model | Price (per 1M tokens) | Best For |
|---|---|---|
| anthropic/claude-3.5-sonnet | $3 in / $15 out | Highest quality observations |
| google/gemini-2.0-flash | $0.075 in / $0.30 out | Fast, cost-effective |
| openai/gpt-4o | $2.50 in / $10 out | GPT-4 quality |
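As a sanity check on the prices above, cost per request is just tokens times the per-1M-token price divided by 1,000,000. This helper is illustrative and not part of claude-mem:

```typescript
// Estimate request cost in USD from per-1M-token prices.
// Illustrative helper; not claude-mem's actual cost logic.
function estimateCostUSD(
  inputTokens: number,
  outputTokens: number,
  inPricePerM: number,
  outPricePerM: number,
): number {
  return (inputTokens * inPricePerM + outputTokens * outPricePerM) / 1_000_000;
}

// Example: 2,500 input + 1,200 output tokens on google/gemini-2.0-flash
// ($0.075 in / $0.30 out) costs well under a tenth of a cent.
const cost = estimateCostUSD(2500, 1200, 0.075, 0.3);
```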
The OpenRouter agent implements context management to prevent runaway costs.
The agent uses a sliding window strategy:
- MAX_CONTEXT_MESSAGES (default: 20) - caps the number of messages kept in conversation history
- MAX_TOKENS (default: 100,000) - the token budget safety limit

Logs include detailed usage information:
```
OpenRouter API usage: {
  model: "xiaomi/mimo-v2-flash:free",
  inputTokens: 2500,
  outputTokens: 1200,
  totalTokens: 3700,
  estimatedCostUSD: "0.00",
  messagesInContext: 8
}
```
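The sliding-window behavior can be sketched roughly as follows. The trimming order and the characters-per-token estimate are simplified assumptions for illustration, not claude-mem's exact implementation:

```typescript
interface Message {
  role: string;
  content: string;
}

const MAX_CONTEXT_MESSAGES = 20;
const MAX_TOKENS = 100_000;

// Rough token estimate: ~4 characters per token (assumption).
const estimateTokens = (m: Message): number => Math.ceil(m.content.length / 4);

// Keep only the most recent messages, bounded by both the message
// count and the total token budget (simplified sketch).
function slideWindow(history: Message[]): Message[] {
  let window = history.slice(-MAX_CONTEXT_MESSAGES);
  while (
    window.length > 1 &&
    window.reduce((sum, m) => sum + estimateTokens(m), 0) > MAX_TOKENS
  ) {
    window = window.slice(1); // drop the oldest message first
  }
  return window;
}
```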
You can switch between providers at any time:
```json
{
  "CLAUDE_MEM_PROVIDER": "openrouter"
}
```
If an OpenRouter request fails, claude-mem logs the error and re-throws it so the message stays pending for a later retry. There is no Claude SDK fallback: earlier docs claimed automatic Claude fallback, but that wiring was never actually engaged in production (#2087). To switch providers, change CLAUDE_MEM_PROVIDER in settings.
Throwing conditions:
The OpenRouter agent maintains the full conversation history across API calls:
```
Session Created
    ↓
Load Pending Messages (observations from queue)
    ↓
For each message:
    → Add to conversation history
    → Call OpenRouter API with FULL history
    → Parse XML response
    → Store observations in database
    → Sync to Chroma vector DB
    ↓
Session complete
```
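The flow above can be sketched as a loop. The step names (`callOpenRouter`, `parseObservations`, `storeObservation`, `syncToChroma`) are placeholders for what claude-mem does at each stage, not its real function names:

```typescript
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical dependency bundle; the real claude-mem internals differ.
interface Steps {
  callOpenRouter: (history: Message[]) => Promise<string>;
  parseObservations: (xml: string) => string[];
  storeObservation: (obs: string) => void;
  syncToChroma: (obs: string) => void;
}

// Process all pending messages for one session, carrying the full
// conversation history across API calls. Returns how many
// observations were stored.
async function processSession(pending: string[], steps: Steps): Promise<number> {
  const history: Message[] = [{ role: "system", content: "Extract observations." }];
  let stored = 0;
  for (const msg of pending) {
    history.push({ role: "user", content: msg });    // add to conversation history
    const xml = await steps.callOpenRouter(history); // full history each call
    history.push({ role: "assistant", content: xml });
    for (const obs of steps.parseObservations(xml)) {
      steps.storeObservation(obs); // persist to database
      steps.syncToChroma(obs);     // sync to vector DB
      stored++;
    }
  }
  return stored;
}
```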
This enables:
Set either:

- CLAUDE_MEM_OPENROUTER_API_KEY in ~/.claude-mem/settings.json, or
- OPENROUTER_API_KEY as an environment variable

Free models may have rate limits during peak usage. If you hit rate limits:
Verify the model ID is correct:
- Include the :free suffix for free model variants

If you see warnings about high token usage (>50,000 per request), lower:

- CLAUDE_MEM_OPENROUTER_MAX_CONTEXT_MESSAGES
- CLAUDE_MEM_OPENROUTER_MAX_TOKENS

If you see connection errors:
OpenRouter uses an OpenAI-compatible REST API:
```
Endpoint: https://openrouter.ai/api/v1/chat/completions

Headers:
Authorization: Bearer {apiKey}
HTTP-Referer: https://github.com/thedotmack/claude-mem
X-Title: claude-mem
Content-Type: application/json
```
Request Format:
```json
{
  "model": "xiaomi/mimo-v2-flash:free",
  "messages": [
    {"role": "system", "content": "..."},
    {"role": "user", "content": "..."}
  ],
  "temperature": 0.3,
  "max_tokens": 4096
}
```
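A request in this format can be assembled with a small helper and sent with plain fetch. The endpoint and headers come from the section above; the helper itself (and its system prompt) is illustrative, not claude-mem's code:

```typescript
// Build an OpenRouter chat-completion request. Assembles the payload
// only; sending is left to fetch. Illustrative sketch, not claude-mem's
// actual request code.
function buildRequest(apiKey: string, model: string, userContent: string) {
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    headers: {
      "Authorization": `Bearer ${apiKey}`,
      "HTTP-Referer": "https://github.com/thedotmack/claude-mem",
      "X-Title": "claude-mem",
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model,
      messages: [
        { role: "system", content: "Extract observations as XML." }, // placeholder prompt
        { role: "user", content: userContent },
      ],
      temperature: 0.3,
      max_tokens: 4096,
    }),
  };
}

// Usage:
//   const { url, headers, body } = buildRequest(key, "xiaomi/mimo-v2-flash:free", text);
//   const res = await fetch(url, { method: "POST", headers, body });
```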
| Feature | Claude (SDK) | Gemini | OpenRouter |
|---|---|---|---|
| Cost | Pay per token | Free tier + paid | Free models + paid |
| Models | Claude only | Gemini only | 100+ models |
| Quality | Highest | High | Varies by model |
| Rate limits | Based on tier | 5-4000 RPM | Varies by model |
| On error | Throws | Throws | Throws |
| Setup | Automatic | API key required | API key required |