# AI Providers
Complete setup instructions for each AI provider via the Settings UI.
New in v1.2: All AI provider credentials are now managed through the Settings UI. Environment variables for API keys are deprecated.
Open Notebook uses a credential-based system for managing AI providers: you add each provider's API key as a credential in the Settings UI, then discover the models that credential exposes.

Prerequisite: You must set `OPEN_NOTEBOOK_ENCRYPTION_KEY` in your docker-compose.yml before storing credentials. See API Configuration for details.
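If you need a value for the key, a random secret works; here is a minimal sketch (the assumption being that any strong random string is accepted — see API Configuration for the exact requirements):

```bash
# Generate a random secret to use as the encryption key
# (assumption: any strong random string is accepted).
openssl rand -base64 32

# Then add it to the environment section of docker-compose.yml:
#   environment:
#     - OPEN_NOTEBOOK_ENCRYPTION_KEY=<paste the generated value>
```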
## OpenAI

Cost: ~$0.03-0.15 per 1K tokens (varies by model)
Get Your API Key: create one at https://platform.openai.com/api-keys.
Configure in Open Notebook:
Available Models (in Open Notebook):
- `gpt-4o` — Best quality, fast (latest version)
- `gpt-4o-mini` — Fast, cheap, good for testing
- `o1` — Advanced reasoning model (slower, more expensive)
- `o1-mini` — Faster reasoning model

Recommended:

- `gpt-4o` (best balance)
- `gpt-4o-mini` (90% cheaper)
- `o1` (best for hard problems)

Cost Estimate:

- Light use: $1-5/month
- Medium use: $10-30/month
- Heavy use: $50-100+/month
Troubleshooting:
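If a key is rejected, a quick way to check it outside Open Notebook is to list the models it can see (a minimal sketch, assuming `curl` and your key exported as `OPENAI_API_KEY`):

```bash
# A JSON model list means the key is valid; a 401 means it is not.
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer $OPENAI_API_KEY"
```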
## Anthropic (Claude)

Cost: ~$0.80-3.00 per 1M tokens (cheaper than OpenAI for long context)
Get Your API Key: create one in the Anthropic Console at https://console.anthropic.com.
Configure in Open Notebook:
Available Models:
- `claude-sonnet-4-5-20250929` — Latest, best quality (recommended)
- `claude-3-5-sonnet-20241022` — Previous generation, still excellent
- `claude-3-5-haiku-20241022` — Fast, cheap
- `claude-opus-4-5-20251101` — Most powerful, expensive

Recommended:

- `claude-sonnet-4-5` (best overall, latest)
- `claude-3-5-haiku` (80% cheaper)
- `claude-opus-4-5` (most capable)

Cost Estimate:

- Sonnet: $3-20/month (typical use)
- Haiku: $0.50-3/month
- Opus: $10-50+/month
Advantages:
Troubleshooting:
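To verify a key outside Open Notebook, Anthropic's model-list endpoint is a quick check (a sketch, assuming your key is exported as `ANTHROPIC_API_KEY`):

```bash
# Anthropic authenticates with x-api-key plus an API version header.
curl https://api.anthropic.com/v1/models \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01"
```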
## Google Gemini

Cost: ~$0.075-0.30 per 1M tokens (competitive with OpenAI)
Get Your API Key: create one in Google AI Studio at https://aistudio.google.com/app/apikey.
Configure in Open Notebook:
Available Models:
- `gemini-2.0-flash-exp` — Latest experimental, fastest (recommended)
- `gemini-2.0-flash` — Stable version, fast, cheap

Recommended:

- `gemini-2.0-flash-exp` (best value, latest)
- `gemini-1.5-flash` (very cheap)
- `gemini-1.5-pro-latest` (2M token context)

Advantages:
Troubleshooting:
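A quick key check outside Open Notebook (a sketch, assuming your key is exported as `GEMINI_API_KEY`):

```bash
# The Gemini API passes the key as a query parameter; a JSON model
# list confirms the key works.
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GEMINI_API_KEY"
```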
## Groq

Cost: ~$0.05 per 1M tokens (cheapest, but limited models)
Get Your API Key: create one at https://console.groq.com/keys.
Configure in Open Notebook:
Available Models:
- `llama-3.3-70b-versatile` — Best on Groq (recommended)
- `llama-3.1-70b-versatile` — Fast, capable
- `mixtral-8x7b-32768` — Good alternative
- `gemma2-9b-it` — Small, very fast

Recommended:

- `llama-3.3-70b-versatile` (best overall)
- `gemma2-9b-it` (ultra-fast)
- `llama-3.1-70b-versatile`

Advantages:
Disadvantages:
Troubleshooting:
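A quick key check outside Open Notebook (a sketch, assuming your key is exported as `GROQ_API_KEY`):

```bash
# Groq serves an OpenAI-compatible API under /openai/v1.
curl https://api.groq.com/openai/v1/models \
  -H "Authorization: Bearer $GROQ_API_KEY"
```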
## OpenRouter

Cost: Varies by model ($0.05-15 per 1M tokens)
Get Your API Key: create one at https://openrouter.ai/keys.
Configure in Open Notebook:
Available Models (100+ options):
- `openai/gpt-4o`, `openai/o1`
- `anthropic/claude-sonnet-4.5`, `anthropic/claude-3.5-haiku`
- `google/gemini-2.0-flash-exp`, `google/gemini-1.5-pro`
- `meta-llama/llama-3.3-70b-instruct`, `meta-llama/llama-3.1-405b-instruct`
- `mistralai/mistral-large-2411`
- `deepseek/deepseek-chat`

Recommended:

- `anthropic/claude-sonnet-4.5` (best overall)
- `google/gemini-2.0-flash-exp` (very fast, cheap)
- `meta-llama/llama-3.3-70b-instruct`
- `openai/o1`

Advantages:
Cost Estimate:

- Light use: $1-5/month
- Medium use: $10-30/month
- Heavy use: depends on the models chosen
Troubleshooting:
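The OpenRouter model catalog can be browsed without authentication, which makes it a handy connectivity check (a sketch):

```bash
# Lists every model ID OpenRouter currently routes; no API key needed.
curl https://openrouter.ai/api/v1/models
```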
## Qwen (Alibaba Cloud)

Cost: ~$0.01-0.06 per 1K tokens (varies by model)
Get Your API Key: create one in the Alibaba Cloud Model Studio (DashScope) console.
Configure in Open Notebook:
Available Models:
- `qwen-max` — Most capable Qwen model
- `qwen-plus` — Good balance of quality and speed
- `qwen-turbo` — Fastest, cheapest

Recommended:

- `qwen-max` (best overall)
- `qwen-plus` (good balance)
- `qwen-turbo` (cheapest)

Troubleshooting:
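To test a key outside Open Notebook, DashScope's OpenAI-compatible endpoint is the simplest target (a sketch, assuming your key is exported as `DASHSCOPE_API_KEY`; international accounts may need the `dashscope-intl.aliyuncs.com` host instead):

```bash
# A one-message completion against the OpenAI-compatible route.
curl https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions \
  -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "qwen-turbo", "messages": [{"role": "user", "content": "ping"}]}'
```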
## MiniMax

Cost: Varies by model
Get Your API Key:
Configure in Open Notebook:
Available Models:
- `MiniMax-M2.5` — Most capable, 204K context
- `MiniMax-M2.5-highspeed` — Faster variant, 204K context

Recommended:

- `MiniMax-M2.5` (best overall)
- `MiniMax-M2.5-highspeed` (faster responses)

Advantages:
Troubleshooting:
## Ollama (Local)

Cost: Free (electricity only)
Setup Ollama:
1. Start the server: `ollama serve`
2. Pull a model: `ollama pull mistral`

Configure in Open Notebook:
- `http://localhost:11434` (running Open Notebook directly on the host)
- `http://host.docker.internal:11434` (Open Notebook in Docker, Ollama on the host)
- `http://ollama:11434` (Ollama as a service on the same Docker network)

See the Ollama Setup Guide for detailed network configuration, and the connectivity check below.
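To confirm the URL you picked is reachable, hit Ollama's HTTP API directly (swap in whichever base URL applies to your setup):

```bash
# Lists the models this Ollama server has pulled; any JSON response
# means the endpoint is reachable.
curl http://localhost:11434/api/tags
```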
Available Models:
- `llama3.3:70b` — Best quality (requires 40GB+ RAM)
- `llama3.1:8b` — Recommended, balanced (8GB RAM)
- `qwen2.5:7b` — Excellent for code and reasoning
- `mistral:7b` — Good general purpose
- `phi3:3.8b` — Small, fast (4GB RAM)
- `gemma2:9b` — Google's model, balanced

Run `ollama list` to see what is available locally.

Recommended:

- `llama3.3:70b` (best)
- `llama3.1:8b` (best balance)
- `phi3:3.8b` (very fast)
- `qwen2.5:7b` (excellent at code)

Hardware Requirements:
GPU (NVIDIA/AMD):

- 8GB VRAM: runs most models fine
- 6GB VRAM: works, slower
- 4GB VRAM: small models only

CPU-only:

- 16GB+ RAM: slow but works
- 8GB RAM: very slow
- 4GB RAM: not recommended
Advantages:
Disadvantages:
Troubleshooting:
- Model not found: pull it first with `ollama pull modelname`

## LM Studio

Cost: Free
Setup LM Studio:
Configure in Open Notebook:
- Base URL: `http://host.docker.internal:1234/v1` (Docker) or `http://localhost:1234/v1` (local)
- API key: `lm-studio` (a placeholder; LM Studio doesn't require one)

Advantages:
Disadvantages:
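Before wiring LM Studio into Open Notebook, you can confirm its local server is responding (a sketch, assuming the default port and a model already loaded in LM Studio):

```bash
# LM Studio's local server speaks the OpenAI API; this lists loaded models.
curl http://localhost:1234/v1/models
```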
## Other OpenAI-Compatible Endpoints

For Text Generation UI, vLLM, or other OpenAI-compatible endpoints, point Open Notebook at the server's base URL (e.g. `http://localhost:8000/v1`). See OpenAI-Compatible Setup for detailed instructions; a quick smoke test is sketched below.
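Any server that implements the OpenAI API should answer the standard chat completions route (a sketch, assuming a server on port 8000; `your-model-name` is a placeholder for whatever model the server is hosting):

```bash
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "your-model-name",
        "messages": [{"role": "user", "content": "Say hello"}]
      }'
```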
## Azure OpenAI

Cost: Same as OpenAI (usage-based)
Configure in Open Notebook:
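Azure routes requests by deployment name rather than model name, so the endpoint you configure looks different from standard OpenAI. A sketch of a direct request (`<resource>` and `<deployment>` are placeholders for your own Azure resource and deployment; the key is assumed to be exported as `AZURE_OPENAI_API_KEY`):

```bash
curl "https://<resource>.openai.azure.com/openai/deployments/<deployment>/chat/completions?api-version=2024-02-01" \
  -H "api-key: $AZURE_OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "ping"}]}'
```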
Advantages:
Disadvantages:
## Embedding Models

By default, Open Notebook uses the LLM provider's embeddings. Embedding models are discovered and registered through the same credential system — when you discover models from a credential, embedding models are included alongside language models.
## Recommendations

- Don't want to run locally or juggle multiple providers: use OpenAI
- Budget-conscious: Groq, OpenRouter, or Ollama
- Privacy-first: Ollama or LM Studio, with Speaches for TTS/STT
- Enterprise: Azure OpenAI
Notes:

- `OPEN_NOTEBOOK_ENCRYPTION_KEY` must be set in your docker-compose.yml (required for storing credentials)
- Multiple providers: you can add credentials for as many providers as you want, and create separate credentials for different projects or team members
## Migrating from Environment Variables

Deprecated: Configuring AI provider API keys via environment variables is deprecated. Use the Settings UI instead. Environment variables may still work as a fallback but are no longer the recommended approach.
If you are migrating from an older version that used environment variables, go to Settings → API Keys and click the Migrate to Database button to import your existing keys into the credential system.