docs/public/usage/gemini-provider.mdx
Claude-mem supports Google's Gemini API as an alternative to the Claude Agent SDK for extracting observations from your sessions. This can significantly reduce costs since Gemini offers a generous free tier.
<Warning> **Free Tier Rate Limits**: Without billing enabled, Gemini has strict rate limits (5-10 RPM). Enable billing on your Google Cloud project to unlock 1000-4000 RPM while still using the free quota. </Warning>

| Setting | Values | Default | Description |
|---|---|---|---|
| CLAUDE_MEM_PROVIDER | claude, gemini | claude | AI provider for observation extraction |
| CLAUDE_MEM_GEMINI_API_KEY | string | (none) | Your Gemini API key |
| CLAUDE_MEM_GEMINI_MODEL | gemini-2.5-flash-lite, gemini-2.5-flash, gemini-3-flash-preview | gemini-2.5-flash-lite | Gemini model to use |
| CLAUDE_MEM_GEMINI_BILLING_ENABLED | true, false | false | Skip rate limiting if billing is enabled on Google Cloud |
Settings are applied immediately—no restart required.
Edit ~/.claude-mem/settings.json:
{
  "CLAUDE_MEM_PROVIDER": "gemini",
  "CLAUDE_MEM_GEMINI_API_KEY": "your-api-key-here",
  "CLAUDE_MEM_GEMINI_MODEL": "gemini-2.5-flash-lite",
  "CLAUDE_MEM_GEMINI_BILLING_ENABLED": "true"
}
Alternatively, set the API key via environment variable:
export GEMINI_API_KEY="your-api-key-here"
The settings file takes precedence over the environment variable.
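
The precedence rule above can be sketched as a small lookup. This is an illustration only, not claude-mem's actual code: the function name `resolve_gemini_api_key` is hypothetical, and treating an empty string as "not set" is an assumption.

```python
import os
from typing import Mapping, Optional


def resolve_gemini_api_key(
    settings: Mapping[str, str],
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Hypothetical helper: pick the Gemini API key by documented precedence.

    The settings file value wins; the GEMINI_API_KEY environment variable is
    used only when the settings file has no key. Empty strings are treated as
    unset (an assumption made for this sketch).
    """
    key = settings.get("CLAUDE_MEM_GEMINI_API_KEY") or env.get("GEMINI_API_KEY")
    return key or None


if __name__ == "__main__":
    file_settings = {"CLAUDE_MEM_GEMINI_API_KEY": "key-from-settings"}
    fake_env = {"GEMINI_API_KEY": "key-from-env"}
    # The settings file value is chosen even though the env var is also set.
    print(resolve_gemini_api_key(file_settings, fake_env))
```

If both sources are empty the sketch returns `None`, which a caller would need to treat as a configuration error.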
| Model | Free Tier RPM | Notes |
|---|---|---|
| gemini-2.5-flash-lite | 10 | Default, recommended for free tier (highest RPM) |
| gemini-2.5-flash | 5 | Higher capability, lower rate limit |
| gemini-3-flash-preview | 5 | Latest model, lower rate limit |
You can switch between Claude and Gemini at any time:
{
  "CLAUDE_MEM_PROVIDER": "gemini"
}
If Gemini is selected and the API errors, claude-mem logs the failure and re-throws so the message stays pending for later retry. There is no Claude SDK fallback — earlier docs claimed automatic Claude fallback, but the wiring was never actually engaged in production (#2087). To switch providers, change CLAUDE_MEM_PROVIDER in settings.
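
The log-and-re-throw behavior described above might look roughly like the following. This is a minimal sketch under stated assumptions: the in-memory `deque` standing in for the pending-message store, the `process_next` function, and the `extract` callback are all hypothetical names, not claude-mem internals.

```python
import logging
from collections import deque
from typing import Callable, Deque

logger = logging.getLogger("claude_mem_sketch")  # hypothetical logger name


def process_next(pending: Deque[str], extract: Callable[[str], None]) -> None:
    """Process the oldest pending message.

    On failure the error is logged and re-raised, and the message is left in
    the queue so a later attempt can retry it. There is no fallback to another
    provider.
    """
    message = pending[0]  # peek only; do not remove before success
    try:
        extract(message)
    except Exception:
        logger.exception("Gemini extraction failed; message stays pending")
        raise
    pending.popleft()  # remove only after the extraction succeeded


if __name__ == "__main__":
    queue: Deque[str] = deque(["observation-1"])

    def failing_extract(_: str) -> None:
        raise RuntimeError("simulated Gemini API error")

    try:
        process_next(queue, failing_extract)
    except RuntimeError:
        pass
    print(len(queue))  # the message is still queued for retry
```

Removing the message only after a successful call is what keeps it available for retry when the provider errors.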
Throwing conditions:
Either:

- CLAUDE_MEM_GEMINI_API_KEY in ~/.claude-mem/settings.json, or
- GEMINI_API_KEY environment variable

Google has two rate limit tiers for free usage:
Without billing (API key only):
| Model | RPM | TPM |
|---|---|---|
| gemini-2.5-flash-lite | 10 | 250K |
| gemini-2.5-flash | 5 | 250K |
| gemini-3-flash-preview | 5 | 250K |
Claude-mem enforces these limits automatically with built-in delays between requests. Processing may be slower but stays within limits.
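
To give a feel for what the built-in delays imply, the sketch below derives a minimum spacing between requests from the RPM values in the table above. It assumes evenly spaced requests, which may not match how claude-mem actually paces calls; `FREE_TIER_RPM` and `min_delay_seconds` are hypothetical names.

```python
# Free tier RPM limits without billing, copied from the table above.
FREE_TIER_RPM = {
    "gemini-2.5-flash-lite": 10,
    "gemini-2.5-flash": 5,
    "gemini-3-flash-preview": 5,
}


def min_delay_seconds(model: str, billing_enabled: bool = False) -> float:
    """Hypothetical helper: minimum gap between requests to stay within RPM.

    When billing is enabled, rate limiting is skipped, mirroring the documented
    meaning of CLAUDE_MEM_GEMINI_BILLING_ENABLED.
    """
    if billing_enabled:
        return 0.0
    return 60.0 / FREE_TIER_RPM[model]


if __name__ == "__main__":
    for name in FREE_TIER_RPM:
        print(f"{name}: wait at least {min_delay_seconds(name):.0f}s between requests")
```

Under this assumption, the default model allows one request every 6 seconds, while the two 5 RPM models allow one every 12 seconds.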
With billing enabled (still free tier):
| Model | RPM | TPM |
|---|---|---|
| gemini-2.5-flash-lite | 4,000 | 4M |
| gemini-2.5-flash | 1,000 | 1M |
| gemini-3-flash-preview | 1,000 | 1M |
If you hit rate limits:
If observations seem lower quality with Gemini: