docs/6-TROUBLESHOOTING/ai-chat-issues.md
Problems with AI models, chat, and response quality.
Note: Open Notebook now shows descriptive error messages for AI provider failures. Instead of a generic "An unexpected error occurred", you'll see specific messages like "Authentication failed. Please check your API key" or "Rate limit exceeded. Please wait a moment and try again." These messages help you diagnose and fix issues faster.
Symptom: Chat shows a "Failed to send message" toast. Logs show:

```bash
Error executing chat: Model is not a LanguageModel: None
```
Cause: No valid language model configured for chat
Solutions:

Solution 1: Set a default chat model
1. Go to Settings → Models
2. Scroll to the "Default Models" section
3. Verify "Default Chat Model" has a model selected
4. If it's empty, select an available language model
5. Click Save
Solution 2: Use the provider's exact model names

```bash
# Get exact model names
ollama list

# Example output:
# NAME          SIZE      MODIFIED
# gemma3:12b    8.1 GB    2 months ago

# The model name in Open Notebook must be EXACTLY "gemma3:12b",
# NOT "gemma3" or "gemma3-12b"
```

1. Note the exact model names from your provider
2. Go to Settings → Models
3. Delete any misconfigured models
4. Re-add the models with their exact names
5. Set new defaults
Solution 3: Verify the model is actually available

```bash
# For Ollama: verify the model is installed
ollama list

# For cloud providers: verify your API key is valid
# and that your account has access to the model
```
Tip: This error often occurs when you delete a model from Ollama but forget to update the default models in Open Notebook. Always re-configure defaults after removing models.
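The exact-match requirement can be sketched in plain shell. Here `installed` stands in for the first column of `ollama list` output and `configured` for the name entered in Open Notebook (both values are illustrative):

```bash
# Names of installed models (first column of `ollama list`), one per line:
installed="gemma3:12b
mistral:latest"

configured="gemma3"   # name as entered in Open Notebook

# grep -x requires a whole-line match, mirroring the exact-name requirement
if printf '%s\n' "$installed" | grep -qxF "$configured"; then
  result="match"
else
  result="mismatch"
fi
echo "$configured: $result"
```

Here the check reports a mismatch: "gemma3" is a prefix of "gemma3:12b", not an exact name, which is precisely the error this section describes.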
Symptom: Settings → Models shows an empty list, or "No models configured"
Cause: No credential configured, or credential has invalid API key
Solutions:

Solution 1: Add and register a credential
1. Go to Settings → API Keys
2. Click "Add Credential"
3. Select your provider (e.g., OpenAI, Anthropic, Google)
4. Enter your API key
5. Click Save, then Test Connection
6. Click Discover Models → Register Models
7. Go to Settings → Models to verify
Solution 2: Test the existing credential
1. Go to Settings → API Keys
2. Click "Test Connection" on your credential
3. If it shows "Invalid API key":
   - Get a fresh key from the provider's website
   - Delete the credential and create a new one
Solution 3: Try a different provider
1. Go to Settings → API Keys
2. Add a credential for a different provider
3. Test Connection → Discover Models → Register Models
4. Go to Settings → Models and select the new provider's models
Symptom: Error when trying to chat: "Invalid API key"
Cause: Credential has wrong, expired, or revoked API key
Solutions:
Step 1: Test the credential
1. Go to Settings → API Keys
2. Click "Test Connection" on your credential
3. If it fails, proceed to Step 2
Step 2: Get a new key from your provider's dashboard:
- OpenAI: https://platform.openai.com/api-keys (keys start with sk-proj-)
- Anthropic: https://console.anthropic.com/ (keys start with sk-ant-)
- Google: https://aistudio.google.com/app/apikey (keys start with AIzaSy)
Generate a new key and copy it exactly (no extra spaces)
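A quick local sanity check of the key's shape before pasting it in. The `key` value below is a placeholder; the prefixes are the ones listed above:

```bash
key="  sk-proj-EXAMPLEKEY  "   # placeholder, with accidental whitespace

# Strip whitespace picked up while copy-pasting (a common cause of
# "Invalid API key" with an otherwise valid key)
key="$(printf '%s' "$key" | tr -d '[:space:]')"

# Match against the documented key prefixes
case "$key" in
  sk-ant-*)  provider="Anthropic" ;;
  sk-proj-*) provider="OpenAI" ;;
  AIzaSy*)   provider="Google" ;;
  *)         provider="unknown" ;;
esac
echo "Key prefix looks like: $provider"
```

If this prints "unknown", the key was probably truncated or copied from the wrong page.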
Step 3: Replace the credential
1. Go to Settings → API Keys
2. Delete the old credential
3. Click "Add Credential" → select your provider
4. Paste the new key
5. Click Save, then Test Connection
6. Re-discover and register models if needed
Step 4: Verify
1. Go to Settings → Models
2. Verify models are available
3. Try a test chat
Symptom: AI responses are shallow, generic, or wrong
Cause: Bad context, vague question, or wrong model
Solutions:

Solution 1: Check the source context
1. In Chat, click "Select Sources"
2. Verify the sources you want are CHECKED
3. Set them to "Full Content" (not "Summary Only")
4. Click "Save"
5. Try the chat again
Solution 2: Ask specific questions
Bad: "What do you think?"
Good: "Based on the paper's methodology, what are 3 limitations?"
Bad: "Tell me about X"
Good: "Summarize X in 3 bullet points with page citations"
Solution 3: Switch to a stronger model
OpenAI: gpt-4o-mini → switch to gpt-4o
Anthropic: claude-3-5-haiku → switch to claude-3-5-sonnet

To change:
1. Go to Settings → Models
2. Select the stronger model as your default chat model
3. Try the chat again
If: "Response seems incomplete"
Try: Add more relevant sources to provide context
Symptom: Chat responses take minutes
Cause: Large context, slow model, or overloaded API
Solutions:

Solution 1: Switch to a faster model
Fastest: Groq (any model)
Fast: OpenAI gpt-4o-mini
Medium: Anthropic claude-3-5-haiku
Slow: Anthropic claude-3-5-sonnet
Switch in Settings → Models
Solution 2: Reduce the context size
1. Chat → Select Sources
2. Uncheck sources you don't need
3. Or switch background sources to "Summary Only"
4. Save and try again
Solution 3: Increase the timeout

```bash
# In .env:
API_CLIENT_TIMEOUT=600   # 10 minutes

# Restart:
docker compose restart
```
Solution 4: Check whether the API is overloaded

```bash
# Watch container resource usage:
docker stats

# If CPU > 80% or memory > 90%, reduce concurrency in .env:
# SURREAL_COMMANDS_MAX_TASKS=2
# then restart:
# docker compose restart
```
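The thresholds above, expressed as a small check you could run against the numbers read off `docker stats` (the sample values here are made up):

```bash
cpu=85   # CPU % from `docker stats` (example value)
mem=92   # memory % from `docker stats` (example value)

# Same thresholds as above: CPU > 80% or memory > 90% means overloaded
if [ "$cpu" -gt 80 ] || [ "$mem" -gt 90 ]; then
  status="overloaded"   # reduce SURREAL_COMMANDS_MAX_TASKS and restart
else
  status="ok"
fi
echo "API container: $status"
```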
Symptom: Each message treated as separate, no context between questions
Cause: Chat history not saved, or a new chat was started
Solution:
1. Make sure you're in the same Chat (not a new one)
2. Check the Chat title at the top
3. If it's blank, start a new Chat with a title
4. Each named Chat keeps its own history
5. If you start a new Chat, its history is separate
Symptom: Error: "Rate limit exceeded" or "Too many requests"
Cause: Hit provider's API rate limit
Solutions:
Immediate: Wait 30-60 seconds, then retry.
Short term: Space out your requests, or switch the default chat model to a different provider.
Long term: Upgrade your provider's usage tier, or switch to a free local model via Ollama (e.g., ollama pull mistral).

Check your usage:
OpenAI: https://platform.openai.com/account/usage/overview
Anthropic: https://console.anthropic.com/account/billing/overview
Google: Google Cloud Console

Symptom: Error about too many tokens
Cause: Sources too large for model
Solutions:

Solution 1: Switch to a model with a larger context window
Current: GPT-4o (128K tokens) → switch to Claude (200K tokens)
Current: Claude Haiku (200K) → switch to Gemini (1M tokens)
To change: Settings → Models

Solution 2: Reduce the context
1. Select fewer sources
2. Or use "Summary Only" instead of "Full Content"
3. Or split large documents into smaller pieces
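To guess whether your sources will fit a model's window, a common rule of thumb is roughly 4 characters per token for English text. This is a rough heuristic, not an exact tokenizer; the character count below is an example:

```bash
chars=500000                  # total characters across selected sources (example)
est_tokens=$((chars / 4))     # ~4 characters per token heuristic
echo "Estimated tokens: $est_tokens"

# ~125K tokens: near GPT-4o's 128K limit once the prompt and chat
# history are added, but well inside Gemini's 1M window.
```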
Solution 3: Use a smaller local model

```bash
# Use a smaller model:
ollama pull phi          # very small
# instead of a large one like:
# ollama pull neural-chat
```
Symptom: Generic API error, response times out
Cause: Provider API down, network issue, or slow service
Solutions:

Solution 1: Check the provider's status page
OpenAI: https://status.openai.com/
Anthropic: https://status.anthropic.com/
Google: https://status.cloud.google.com/
Groq: check the provider's website
Solution 2: Wait and retry
1. Wait 30 seconds
2. Try again
Solution 3: Switch providers
1. Go to Settings → Models
2. Select a model from a different provider
3. If OpenAI is down, use Anthropic (or vice versa)
Solution 4: Check your network

```bash
# Verify internet is working:
ping google.com

# Test the API endpoint directly:
curl https://api.openai.com/v1/models \
  -H "Authorization: Bearer YOUR_KEY"
```
Symptom: AI makes up facts that aren't in sources
Cause: Sources not in context, or model guessing
Solutions:

Solution 1: Verify the citations
1. Click a citation in the response
2. Check that the source actually says that
3. If not, the sources weren't in context
4. Add the source to the context and try again

Solution 2: Ask for citations explicitly
Ask: "Answer this with citations to specific pages"
The AI will be more careful when asked for citations

Solution 3: Use a stronger model
Weaker models hallucinate more
Switch to GPT-4o or Claude Sonnet
Symptom: API bills are higher than expected
Cause: Using expensive model, large context, many requests
Solutions:

Solution 1: Switch to a cheaper model
Expensive: gpt-4o
Cheaper: gpt-4o-mini (~10x cheaper)
Expensive: Claude Sonnet
Cheaper: Claude Haiku (~5x cheaper)
Groq: very cheap, but fewer models

Solution 2: Reduce context per request
In Chat:
1. Select fewer sources
2. Use "Summary Only" for background sources
3. Ask more specific questions
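A back-of-the-envelope view of the savings, using the ~10x ratio above (the monthly spend figure is a made-up example):

```bash
monthly_cost=50                       # current monthly spend in dollars (example)
after_switch=$((monthly_cost / 10))   # gpt-4o-mini is ~10x cheaper than gpt-4o
echo "Switching could cut ~\$${monthly_cost}/mo to ~\$${after_switch}/mo"
```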
Solution 3: Switch to free local models (Ollama)

```bash
# Install Ollama, then start the server:
ollama serve

# Download a model:
ollama pull mistral

# Point Open Notebook at it in .env:
OLLAMA_API_BASE=http://localhost:11434
```

Cost: free.
Still stuck? Check the API logs for the underlying error:

```bash
docker compose logs api | grep -i "error"
```