docs/5-CONFIGURATION/ai-providers.md
Complete setup instructions for each AI provider via the Settings UI.
New in v1.2: All AI provider credentials are now managed through the Settings UI. Environment variables for API keys are deprecated.
Open Notebook uses a credential-based system for managing AI providers:
Prerequisite: You must set OPEN_NOTEBOOK_ENCRYPTION_KEY in your docker-compose.yml before storing credentials. See API Configuration for details.
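One way to produce a value for this variable is to generate a random secret. The sketch below assumes Open Notebook accepts any sufficiently long random string as the key; the required format is not stated here, so check API Configuration before relying on it. The helper name `generate_encryption_key` is hypothetical.

```python
import secrets


def generate_encryption_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of randomness.

    Assumption: any long random string is an acceptable key format.
    """
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    # Copy the printed line into the environment section of docker-compose.yml.
    print(f"OPEN_NOTEBOOK_ENCRYPTION_KEY={generate_encryption_key()}")
```

Generate the key once and keep it stable: credentials encrypted under one key would presumably not be readable under a different one.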
Cost: ~$0.03-0.15 per 1K tokens (varies by model)
Get Your API Key:
Configure in Open Notebook:
Available Models (in Open Notebook):
gpt-4o: Best quality, fast (latest version)
gpt-4o-mini: Fast, cheap, good for testing
o1: Advanced reasoning model (slower, more expensive)
o1-mini: Faster reasoning model
Recommended:
gpt-4o (best balance)
gpt-4o-mini (90% cheaper)
o1 (best for hard problems)
Cost Estimate:
Light use: $1-5/month
Medium use: $10-30/month
Heavy use: $50-100+/month
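To see how a monthly figure like the ones above follows from token volume, a rough calculation can help. This is a minimal sketch: the function name and the example workload are hypothetical, and the price is taken from the low end of the per-1K-token range quoted for this provider.

```python
def estimate_monthly_cost(
    tokens_per_day: int,
    price_per_1k_tokens: float,
    days: int = 30,
) -> float:
    """Estimate monthly spend in dollars from daily token volume."""
    return tokens_per_day / 1000 * price_per_1k_tokens * days


if __name__ == "__main__":
    # Hypothetical workload: 5,000 tokens per day at $0.03 per 1K tokens.
    cost = estimate_monthly_cost(5_000, 0.03)
    print(f"${cost:.2f} per month")
```

With these inputs the estimate comes to about $4.50 per month, which sits in the light-use band. Real bills also depend on the split between input and output tokens, which this sketch ignores.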
Troubleshooting:
Cost: ~$0.80-3.00 per 1M tokens (cheaper than OpenAI for long context)
Get Your API Key:
Configure in Open Notebook:
Available Models:
claude-sonnet-4-5-20250929: Latest, best quality (recommended)
claude-3-5-sonnet-20241022: Previous generation, still excellent
claude-3-5-haiku-20241022: Fast, cheap
claude-opus-4-5-20251101: Most powerful, expensive
Recommended:
claude-sonnet-4-5 (best overall, latest)
claude-3-5-haiku (80% cheaper)
claude-opus-4-5 (most capable)
Cost Estimate:
Sonnet: $3-20/month (typical use)
Haiku: $0.50-3/month
Opus: $10-50+/month
Advantages:
Troubleshooting:
Cost: ~$0.075-0.30 per 1K tokens (competitive with OpenAI)
Get Your API Key:
Configure in Open Notebook:
Available Models:
gemini-2.0-flash-exp: Latest experimental, fastest (recommended)
gemini-2.0-flash: Stable version, fast, cheap
Recommended:
gemini-2.0-flash-exp (best value, latest)
gemini-1.5-flash (very cheap)
gemini-1.5-pro-latest (2M token context)
Advantages:
Troubleshooting:
Cost: ~$0.05 per 1M tokens (cheapest, but limited models)
Get Your API Key:
Configure in Open Notebook:
Available Models:
llama-3.3-70b-versatile: Best on Groq (recommended)
llama-3.1-70b-versatile: Fast, capable
mixtral-8x7b-32768: Good alternative
gemma2-9b-it: Small, very fast
Recommended:
llama-3.3-70b-versatile (best overall)
gemma2-9b-it (ultra-fast)
llama-3.1-70b-versatile
Advantages:
Disadvantages:
Troubleshooting:
Cost: Varies by model ($0.05-15 per 1M tokens)
Get Your API Key:
Configure in Open Notebook:
Available Models (100+ options):
openai/gpt-4o, openai/o1
anthropic/claude-sonnet-4.5, anthropic/claude-3.5-haiku
google/gemini-2.0-flash-exp, google/gemini-1.5-pro
meta-llama/llama-3.3-70b-instruct, meta-llama/llama-3.1-405b-instruct
mistralai/mistral-large-2411
deepseek/deepseek-chat
Recommended:
anthropic/claude-sonnet-4.5 (best overall)
google/gemini-2.0-flash-exp (very fast, cheap)
meta-llama/llama-3.3-70b-instruct
openai/o1
Advantages:
Cost Estimate:
Light use: $1-5/month
Medium use: $10-30/month
Heavy use: Depends on models chosen
Troubleshooting:
Cost: ~$0.01-0.06 per 1K tokens (varies by model)
Get Your API Key:
Configure in Open Notebook:
Available Models:
qwen-max: Most capable Qwen model
qwen-plus: Good balance of quality and speed
qwen-turbo: Fastest, cheapest
Recommended:
qwen-max (best overall)
qwen-plus (good balance)
qwen-turbo (cheapest)
Troubleshooting:
Cost: Varies by model
Get Your API Key:
Configure in Open Notebook:
Available Models:
MiniMax-M2.5: Most capable, 204K context
MiniMax-M2.5-highspeed: Faster variant, 204K context
Recommended:
MiniMax-M2.5 (best overall)
MiniMax-M2.5-highspeed (faster responses)
Advantages:
Troubleshooting:
Cost: Free (electricity only)
Setup Ollama:
ollama serve
ollama pull mistral
Configure in Open Notebook:
http://localhost:11434
http://host.docker.internal:11434
http://ollama:11434
See Ollama Setup Guide for detailed network configuration.
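Which of the base URLs above is correct depends on where Open Notebook and Ollama each run. A quick way to check one is to query Ollama's model-listing endpoint (`/api/tags`) from the same environment that will make the requests. The sketch below uses only the Python standard library; the helper names are hypothetical.

```python
import json
import urllib.request


def tags_url(base_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint for a base URL."""
    return base_url.rstrip("/") + "/api/tags"


def list_ollama_models(base_url: str, timeout: float = 5.0) -> list[str]:
    """Return the names of the models the Ollama server has pulled.

    Raises OSError (including URLError and timeouts) if the server
    cannot be reached at base_url.
    """
    with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
        data = json.load(resp)
    return [model["name"] for model in data.get("models", [])]
```

Calling `list_ollama_models("http://localhost:11434")` on the host, or the same function with the Docker-oriented URLs from inside the container, shows whether the address is reachable and which models it serves.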
Context Window (num_ctx):
Ollama models default to an 8,192-token context window. This default is intentionally conservative so models run reliably on consumer GPUs (≈8GB VRAM) without running out of memory. If your hardware can handle more, set an optional Context Window (num_ctx) value on the Ollama credential (Settings → API Keys → edit the Ollama credential). It applies to all models that use that credential. Leave it empty to keep the default.
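How Open Notebook forwards this setting is not described here. For orientation, the sketch below shows the equivalent option in Ollama's own HTTP API, where `num_ctx` is passed under `options` in a `/api/generate` request. The helper names and the example model are illustrative, and nothing is sent over the network.

```python
import json
import urllib.request
from typing import Optional


def build_generate_payload(
    model: str, prompt: str, num_ctx: Optional[int] = None
) -> dict:
    """Build a request body for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    if num_ctx is not None:
        # Omitting the option leaves the default context window in effect.
        payload["options"] = {"num_ctx": num_ctx}
    return payload


def build_generate_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Wrap the payload in a POST request without sending it."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the resulting request to `urllib.request.urlopen` would run the generation with the larger window, which is a simple way to confirm that the hardware can hold it before saving the value on the credential.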
32768) when ingesting large documents or using long chat histories.
Available Models:
llama3.3:70b: Best quality (requires 40GB+ RAM)
llama3.1:8b: Recommended, balanced (8GB RAM)
qwen2.5:7b: Excellent for code and reasoning
mistral:7b: Good general purpose
phi3:3.8b: Small, fast (4GB RAM)
gemma2:9b: Google's model, balanced
Run ollama list to see available models.
Recommended:
llama3.3:70b (best)
llama3.1:8b (best balance)
phi3:3.8b (very fast)
qwen2.5:7b (excellent at code)
Hardware Requirements:
GPU (NVIDIA/AMD):
8GB VRAM: Runs most models fine
6GB VRAM: Works, slower
4GB VRAM: Small models only
CPU-only:
16GB+ RAM: Slow but works
8GB RAM: Very slow
4GB RAM: Not recommended
Advantages:
Disadvantages:
Troubleshooting:
ollama pull modelname
Cost: Free
Setup LM Studio:
Configure in Open Notebook:
http://host.docker.internal:1234/v1 (Docker) or http://localhost:1234/v1 (local)
lm-studio (placeholder, LM Studio doesn't require one)
Advantages:
Disadvantages:
For Text Generation UI, vLLM, or other OpenAI-compatible endpoints:
http://localhost:8000/v1)
See OpenAI-Compatible Setup for detailed instructions.
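Before saving a base URL for LM Studio, vLLM, or another OpenAI-compatible server, it can be worth confirming that the endpoint answers. OpenAI-compatible servers generally expose a `/models` route under the `/v1` base; the sketch below builds and sends that request with the standard library. The helper names are hypothetical, and the `lm-studio` default is only the placeholder key mentioned above.

```python
import json
import urllib.request


def build_models_request(
    base_url: str, api_key: str = "lm-studio"
) -> urllib.request.Request:
    """Build a GET request for the /models route of an OpenAI-compatible server."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def list_model_ids(
    base_url: str, api_key: str = "lm-studio", timeout: float = 5.0
) -> list[str]:
    """Return the model IDs reported by the server at base_url.

    Raises OSError (including URLError and timeouts) if it is unreachable.
    """
    request = build_models_request(base_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        data = json.load(resp)
    return [model["id"] for model in data.get("data", [])]
```

For example, `list_model_ids("http://localhost:1234/v1")` should return the models loaded in a local LM Studio server, provided its server mode is running.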
Cost: Same as OpenAI (usage-based)
Configure in Open Notebook:
Advantages:
Disadvantages:
By default, Open Notebook uses the LLM provider's embeddings. Embedding models are discovered and registered through the same credential system — when you discover models from a credential, embedding models are included alongside language models.
For the simplest setup (nothing to run locally, a single provider): OpenAI
For budget-conscious users: Groq, OpenRouter, or Ollama
For privacy-first setups: Ollama or LM Studio, plus Speaches for TTS and STT
For enterprise: Azure OpenAI
OPEN_NOTEBOOK_ENCRYPTION_KEY in your docker-compose.yml (required for storing credentials)
Multiple providers: You can add credentials for as many providers as you want. Create separate credentials for different projects or team members.
Done!
Deprecated: Configuring AI provider API keys via environment variables is deprecated. Use the Settings UI instead. Environment variables may still work as a fallback but are no longer the recommended approach.
If you are migrating from an older version that used environment variables, go to Settings → API Keys and click the Migrate to Database button to import your existing keys into the credential system.