docs/5-CONFIGURATION/ollama.md
Ollama provides free, local AI models that run on your own hardware. This guide covers everything you need to know about setting up Ollama with Open Notebook, including different deployment scenarios and network configurations.
Linux/macOS:
curl -fsSL https://ollama.ai/install.sh | sh
Windows: Download and install from ollama.ai
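Once installed, you can confirm the CLI is available before pulling anything; a quick sanity check looks like this:
# Print the installed Ollama version to confirm the CLI is on your PATH
ollama --version
# List locally installed models (an empty list is normal on a fresh install)
ollama list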
# Language models (choose one or more)
ollama pull qwen3 # Excellent general purpose, 7B parameters
ollama pull gemma3 # Google's model, good performance
ollama pull deepseek-r1 # Advanced reasoning model
ollama pull phi4 # Microsoft's efficient model
# Embedding model (required for search)
ollama pull mxbai-embed-large # Best embedding model for Ollama
Via Settings UI (Recommended):
Legacy (Deprecated) — Environment variables:
# For local installation:
export OLLAMA_API_BASE=http://localhost:11434
# For Docker installation:
export OLLAMA_API_BASE=http://host.docker.internal:11434
Note: The OLLAMA_API_BASE environment variable is deprecated. Configure Ollama via Settings → API Keys instead.
When adding an Ollama credential in Settings → API Keys, you need to enter the correct base URL. The correct URL depends on your deployment scenario:
When both Open Notebook and Ollama run directly on your machine:
Base URL to enter in Settings → API Keys: http://localhost:11434
Alternative: http://127.0.0.1:11434 (use if you have DNS resolution issues with localhost)
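To confirm the URL responds before saving the credential, you can hit Ollama's model-listing endpoint:
# Returns a JSON list of installed models if Ollama is reachable at this address
curl http://localhost:11434/api/tags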
When Open Notebook runs in Docker but Ollama runs on your host machine:
Base URL to enter in Settings → API Keys: http://host.docker.internal:11434
⚠️ CRITICAL: Ollama must accept external connections:
# Start Ollama with external access enabled
export OLLAMA_HOST=0.0.0.0:11434
ollama serve
⚠️ LINUX USERS: Extra configuration required!
On Linux, host.docker.internal doesn't resolve automatically like it does on macOS/Windows. You must add extra_hosts to your docker-compose.yml:
# Add to your docker-compose.yml (requires surrealdb service, see installation guide)
services:
  open_notebook:
    image: lfnovo/open_notebook:v1-latest
    # ... other settings ...
    extra_hosts:
      - "host.docker.internal:host-gateway"
Without this, you'll get connection errors like:
httpcore.ConnectError: [Errno -2] Name or service not known
Why host.docker.internal?
From inside a container, localhost refers to the container itself, not localhost on the host. host.docker.internal is Docker's special hostname for the host machine, and on Linux it only resolves when you add the extra_hosts entry shown above.
Why OLLAMA_HOST=0.0.0.0:11434?
By default Ollama only listens on the host's localhost interface; OLLAMA_HOST=0.0.0.0:11434 allows connections from Docker containers.
When both Open Notebook and Ollama run in the same Docker Compose stack:
Base URL to enter in Settings → API Keys: http://ollama:11434
Docker Compose Example:
# Requires surrealdb service — see full base setup:
# https://github.com/lfnovo/open-notebook/blob/main/docker-compose.yml
services:
  open-notebook:
    image: lfnovo/open_notebook:v1-latest
    pull_policy: always
    ports:
      - "8502:8502"
      - "5055:5055"
    environment:
      - OPEN_NOTEBOOK_ENCRYPTION_KEY=change-me-to-a-secret-string
    volumes:
      - ./notebook_data:/app/data
    depends_on:
      - ollama
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    # Optional: GPU support
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
volumes:
  ollama_data:
When Ollama runs on a different machine in your network:
Base URL to enter in Settings → API Keys: http://192.168.1.100:11434 (replace with your Ollama server's IP)
Security Note: Only use this in trusted networks. Ollama doesn't have built-in authentication.
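A rough sketch of this setup (192.168.1.100 is only a placeholder for your server's IP): start Ollama on the remote machine with external access enabled, then verify connectivity from the machine running Open Notebook:
# On the Ollama server: listen on all interfaces, not just localhost
export OLLAMA_HOST=0.0.0.0:11434
ollama serve
# On the Open Notebook machine: verify the server is reachable
curl http://192.168.1.100:11434/api/tags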
If you've configured Ollama to use a different port:
# Start Ollama on custom port
OLLAMA_HOST=0.0.0.0:8080 ollama serve
Base URL to enter in Settings → API Keys: http://localhost:8080
| Model | Size | Best For | Quality | Speed |
|---|---|---|---|---|
| qwen3 | 7B | General purpose, coding | Excellent | Fast |
| deepseek-r1 | 7B | Reasoning, problem-solving | Exceptional | Medium |
| gemma3 | 7B | Balanced performance | Very Good | Fast |
| phi4 | 14B | Efficiency on small hardware | Good | Very Fast |
| llama3 | 8B | General purpose | Very Good | Medium |
| Model | Best For | Performance |
|---|---|---|
| mxbai-embed-large | General search | Excellent |
| nomic-embed-text | Document similarity | Good |
| all-minilm | Lightweight option | Fair |
# Essential models
ollama pull qwen3 # Primary language model
ollama pull mxbai-embed-large # Search embeddings
# Optional reasoning model
ollama pull deepseek-r1 # Advanced reasoning
# Alternative language models
ollama pull gemma3 # Google's model
ollama pull phi4 # Microsoft's efficient model
NVIDIA GPU (CUDA):
# Install NVIDIA Container Toolkit for Docker
# Then use the Docker Compose example above with GPU support
# For local installation, Ollama auto-detects CUDA
ollama pull qwen3
Apple Silicon (M1/M2/M3):
# Ollama automatically uses Metal acceleration
# No additional setup required
ollama pull qwen3
AMD GPUs:
# ROCm support varies by model and system
# Check Ollama documentation for latest compatibility
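Whatever the hardware, you can check whether a loaded model actually ended up on the GPU. This is a minimal check, assuming you've already pulled qwen3 and are running a recent Ollama version that reports placement in ollama ps:
# Run a quick prompt so the model is loaded into memory
ollama run qwen3 "hello"
# Inspect placement of the loaded model, e.g. "100% GPU" vs "CPU"
ollama ps
# On NVIDIA systems, VRAM usage confirms GPU offload
nvidia-smi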
⚠️ IMPORTANT: Model names must exactly match the output of ollama list
This is the most common cause of "Failed to send message" errors. Open Notebook requires the exact model name as it appears in Ollama.
Step 1: Get the exact model name
ollama list
Example output:
NAME ID SIZE MODIFIED
mxbai-embed-large:latest 468836162de7 669 MB 7 minutes ago
gemma3:12b f4031aab637d 8.1 GB 2 months ago
qwen3:32b 030ee887880f 20 GB 9 days ago
Step 2: Use the exact name when adding the model in Open Notebook
| ✅ Correct | ❌ Wrong |
|---|---|
| gemma3:12b | gemma3 (missing tag) |
| qwen3:32b | qwen3-32b (wrong format) |
| mxbai-embed-large:latest | mxbai-embed-large (missing tag) |
Note: Some models use :latest as the default tag. If ollama list shows model:latest, you must use model:latest in Open Notebook, not just model.
Step 3: Configure in Open Notebook
In Settings → Models, add the model using the exact name from ollama list, set the provider to ollama, and choose the type language (for chat) or embedding (for search).
1. "Ollama unavailable" in Open Notebook
Check Ollama is running:
curl http://localhost:11434/api/tags
Verify credential is configured: Check Settings → API Keys for an Ollama credential with the correct base URL.
⚠️ IMPORTANT: Enable external connections (most common fix):
# If Open Notebook runs in Docker or on a different machine,
# Ollama must bind to all interfaces, not just localhost
export OLLAMA_HOST=0.0.0.0:11434
ollama serve
Why this is needed: By default, Ollama only accepts connections from localhost (127.0.0.1). When Open Notebook runs in Docker or on a different machine, it can't reach Ollama unless you configure OLLAMA_HOST=0.0.0.0:11434 to accept external connections.
Restart Ollama:
# Linux/macOS
sudo systemctl restart ollama
# or
ollama serve
# Windows
# Restart from system tray or Services
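If Ollama runs as a systemd service on Linux, exporting OLLAMA_HOST in your shell won't affect the service. A sketch of making the setting persistent, following Ollama's usual systemd setup (adjust if your install differs):
# Create a service override for Ollama
sudo systemctl edit ollama
# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_HOST=0.0.0.0:11434"
# Then apply the change
sudo systemctl daemon-reload
sudo systemctl restart ollama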
2. Docker networking issues
From inside Open Notebook container, test Ollama:
# Get into container
docker exec -it open-notebook bash
# Test connection
curl http://host.docker.internal:11434/api/tags
If this fails on Linux with "Name or service not known", you need to add extra_hosts to your docker-compose.yml. See the Docker-Specific Troubleshooting section below.
3. Models not downloading
Check disk space:
df -h
Manual model pull:
ollama pull qwen3 --verbose
Clear failed downloads:
ollama rm qwen3
ollama pull qwen3
4. Slow performance
Check model size vs available RAM:
ollama ps # Show running models
free -h # Check available memory
Use smaller models:
ollama pull phi4 # Instead of larger models
ollama pull gemma3:1b # 1B parameter variant (smallest gemma3)
5. Port conflicts
Check what's using port 11434:
lsof -i :11434
netstat -tulpn | grep 11434
Use custom port:
OLLAMA_HOST=0.0.0.0:8080 ollama serve
Then update the base URL in Settings → API Keys to http://localhost:8080
6. "Failed to send message" in Chat
Symptom: Chat shows "Failed to send message" toast notification. Logs may show:
Error executing chat: Model is not a LanguageModel: None
Causes (in order of likelihood): the model name configured in Open Notebook doesn't exactly match ollama list, or no default models are set.
Solutions:
Check 1: Verify model names match exactly
# Get exact model names from Ollama
ollama list
# Compare with what's configured in Open Notebook
# Go to Settings → Models and verify the names match EXACTLY
Check 2: Verify default models are set
Go to Settings → Models and make sure a default chat model and a default embedding model are selected.
Check 3: Refresh after changes
If you've added or removed models in Ollama, run ollama list again and update the model entries in Open Notebook to match.
Check 4: Test the model directly
# Verify Ollama can use the model
ollama run gemma3:12b "Hello, world"
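If ollama run works but chat still fails, it can help to exercise the same HTTP API that Open Notebook talks to; a minimal request (substitute your own model name from ollama list) looks like this:
# Ask Ollama for a short completion over the HTTP API
curl http://localhost:11434/api/generate -d '{
  "model": "gemma3:12b",
  "prompt": "Say hello in one sentence",
  "stream": false
}'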
1. Linux: host.docker.internal not resolving (Most Common)
If you see Name or service not known errors on Linux, add extra_hosts to your docker-compose.yml:
# Add to your docker-compose.yml (requires surrealdb service, see installation guide)
services:
  open_notebook:
    image: lfnovo/open_notebook:v1-latest
    extra_hosts:
      - "host.docker.internal:host-gateway"
    environment:
      # ... rest of your config
Then in Settings → API Keys, use base URL: http://host.docker.internal:11434
This maps host.docker.internal to your host machine's IP. macOS/Windows Docker Desktop does this automatically, but Linux requires explicit configuration.
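After restarting the stack, you can confirm the mapping works from inside the container (the service name open_notebook matches the example above; adjust if yours differs, and note this assumes curl is available in the image):
# Check that the hostname resolves inside the container
docker compose exec open_notebook getent hosts host.docker.internal
# Check that Ollama answers through it
docker compose exec open_notebook curl http://host.docker.internal:11434/api/tags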
2. Host networking on Linux (alternative):
# Use host networking if host.docker.internal doesn't work
docker run --network host lfnovo/open_notebook:v1-latest # for quick testing only
Then in Settings → API Keys, use base URL: http://localhost:11434
3. Custom bridge network:
version: '3.8'
networks:
  ollama_network:
    driver: bridge
services:
  open-notebook:
    image: lfnovo/open_notebook:v1-latest
    networks:
      - ollama_network
    environment:
      # ... rest of your config ...
  ollama:
    image: ollama/ollama:latest
    networks:
      - ollama_network
Then in Settings → API Keys, use base URL: http://ollama:11434
4. Firewall issues:
# Allow Ollama port through firewall
sudo ufw allow 11434
# or
sudo firewall-cmd --add-port=11434/tcp --permanent
List installed models:
ollama list
Remove unused models:
ollama rm model_name
Show running models:
ollama ps
Preload models for faster startup:
# Keep model in memory
curl http://localhost:11434/api/generate -d '{
"model": "qwen3",
"prompt": "test",
"keep_alive": -1
}'
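Conversely, the same endpoint can unload a model to free memory right away; as a sketch, sending keep_alive: 0 asks Ollama to evict the model immediately:
# Unload the model from memory immediately
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3",
  "keep_alive": 0
}'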
Linux: Increase file limits:
echo "* soft nofile 65536" >> /etc/security/limits.conf
echo "* hard nofile 65536" >> /etc/security/limits.conf
macOS: Limit loaded models and parallel requests:
# Add to ~/.zshrc or ~/.bash_profile
export OLLAMA_MAX_LOADED_MODELS=2
export OLLAMA_NUM_PARALLEL=4
Docker: Resource allocation:
services:
  ollama:
    deploy:
      resources:
        limits:
          memory: 8G
          cpus: '4'
# Ollama server configuration
export OLLAMA_HOST=0.0.0.0:11434 # Bind to all interfaces
export OLLAMA_KEEP_ALIVE=5m # Keep models in memory
export OLLAMA_MAX_LOADED_MODELS=3 # Max concurrent models
export OLLAMA_MAX_QUEUE=512 # Request queue size
export OLLAMA_NUM_PARALLEL=4 # Parallel request handling
export OLLAMA_FLASH_ATTENTION=1 # Enable flash attention (if supported)
# Open Notebook configuration (configure via Settings → API Keys instead)
# OLLAMA_API_BASE=http://localhost:11434 # Deprecated — use Settings UI
If you're running Ollama behind a reverse proxy with self-signed SSL certificates (e.g., Caddy, nginx with custom certs), you may encounter SSL verification errors:
[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate
Solutions:
Option 1: Use a custom CA bundle (recommended)
# Point to your CA certificate file
export ESPERANTO_SSL_CA_BUNDLE=/path/to/your/ca-bundle.pem
Option 2: Disable SSL verification (development only)
# WARNING: Only use in trusted development environments
export ESPERANTO_SSL_VERIFY=false
Docker Compose example with SSL configuration:
# Add to your docker-compose.yml (requires surrealdb service, see installation guide)
services:
  open-notebook:
    image: lfnovo/open_notebook:v1-latest
    pull_policy: always
    environment:
      - OPEN_NOTEBOOK_ENCRYPTION_KEY=change-me-to-a-secret-string
      # Option 1: Custom CA bundle (if Ollama uses self-signed SSL)
      - ESPERANTO_SSL_CA_BUNDLE=/certs/ca-bundle.pem
      # Option 2: Disable verification (dev only)
      # - ESPERANTO_SSL_VERIFY=false
    volumes:
      - /path/to/your/ca-bundle.pem:/certs/ca-bundle.pem:ro
Security Note: Disabling SSL verification exposes you to man-in-the-middle attacks. Always prefer using a custom CA bundle in production environments.
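To rule out certificate problems independently of Open Notebook, you can test the proxied endpoint with the same CA bundle (ollama.example.com is a placeholder for your proxy's hostname):
# Should succeed without certificate errors if the CA bundle matches the proxy's certificate
curl --cacert /path/to/your/ca-bundle.pem https://ollama.example.com/api/tags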
Import custom models:
# Create Modelfile
cat > Modelfile << EOF
FROM qwen3
PARAMETER temperature 0.7
PARAMETER top_p 0.9
SYSTEM "You are a helpful research assistant."
EOF
# Create custom model
ollama create my-research-model -f Modelfile
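Before adding it in Open Notebook, it's worth confirming the custom model exists and responds:
# The new model should appear in the list
ollama list
# And answer a test prompt with the custom system prompt applied
ollama run my-research-model "What can you help me with?"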
Use in Open Notebook:
Add my-research-model in Settings → Models just like any other Ollama model.
Monitor Ollama logs:
# Linux (systemd)
journalctl -u ollama -f
# Docker
docker logs -f ollama
# Manual run with verbose logging
OLLAMA_DEBUG=1 ollama serve
Resource monitoring:
# CPU and memory usage
htop
# GPU usage (NVIDIA)
nvidia-smi -l 1
# Model-specific metrics
ollama ps
import requests
import os
# Test Ollama connection
ollama_base = os.environ.get('OLLAMA_API_BASE', 'http://localhost:11434')
response = requests.get(f'{ollama_base}/api/tags')
print(f"Available models: {response.json()}")
# Generate text
payload = {
"model": "qwen3",
"prompt": "Explain quantum computing",
"stream": False
}
response = requests.post(f'{ollama_base}/api/generate', json=payload)
print(response.json()['response'])
#!/bin/bash
# ollama-health-check.sh
OLLAMA_API_BASE=${OLLAMA_API_BASE:-"http://localhost:11434"}
echo "Checking Ollama health..."
if curl -s "${OLLAMA_API_BASE}/api/tags" > /dev/null; then
echo "✅ Ollama is running"
echo "Available models:"
curl -s "${OLLAMA_API_BASE}/api/tags" | jq -r '.models[].name'
else
echo "❌ Ollama is not accessible at ${OLLAMA_API_BASE}"
exit 1
fi
Similar performance models:
qwen3 or deepseek-r1
gemma3 or phi4
mxbai-embed-large
Cost comparison:
Claude replacement suggestions:
deepseek-r1 (reasoning)
phi4 (speed)
Network Security:
Model Verification:
Resource Limits:
Model Selection:
Resource Management:
Network Optimization:
Community Resources:
Debugging Resources:
This comprehensive guide should help you successfully deploy and optimize Ollama with Open Notebook. Start with the Quick Start section and refer to specific scenarios as needed.