docs/Embedding Model Switching.md
When using different embedding models (like switching from OpenAI to Ollama models), you may encounter dimension mismatch issues. This happens because different models produce embedding vectors with different dimensions:
| Model | Dimension |
|---|---|
| OpenAI text-embedding-ada-002 | 1536 |
| OpenAI text-embedding-3-small | 1536 |
| OpenAI text-embedding-3-large | 3072 |
| Ollama snowflake-arctic-embed | 768 |
| Ollama nomic-embed-text | 768 |
| Ollama mxbai-embed-large | 1024 |
Second Me now detects and handles embedding dimension mismatches automatically. When you switch between embedding models with different dimensions, the system detects the mismatch, logs a warning, and reinitializes the ChromaDB collections with the new dimension.
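The detect-and-reinitialize behavior can be sketched in a few lines. This is only an illustration under stated assumptions, not the code in the repository: `ToyCollection`, `ToyStore`, and `ensure_dimension` are hypothetical names, and an in-memory dictionary stands in for ChromaDB.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCollection:
    """Hypothetical stand-in for a vector collection with a fixed dimension."""
    name: str
    dimension: int
    vectors: list = field(default_factory=list)

    def add(self, vector):
        # Reject vectors whose length differs from the collection's dimension.
        if len(vector) != self.dimension:
            raise ValueError(
                f"expected dimension {self.dimension}, got {len(vector)}"
            )
        self.vectors.append(list(vector))


class ToyStore:
    """Hypothetical in-memory store that only mimics named collections."""

    def __init__(self):
        self.collections = {}

    def create(self, name, dimension):
        # Creating under an existing name replaces the old collection.
        self.collections[name] = ToyCollection(name, dimension)
        return self.collections[name]


def ensure_dimension(store, name, required):
    """Return (collection, reinitialized) for the required dimension.

    In this sketch, recreating a collection drops its stored vectors.
    """
    existing = store.collections.get(name)
    if existing is None:
        return store.create(name, required), False
    if existing.dimension == required:
        return existing, False
    print(
        f"Warning: Existing '{name}' collection has dimension "
        f"{existing.dimension}, but current model requires {required}"
    )
    return store.create(name, required), True


store = ToyStore()
store.create("documents", 1536).add([0.0] * 1536)
collection, reinitialized = ensure_dimension(store, "documents", 768)
collection.add([0.0] * 768)  # accepted once the collection is recreated
```

In this toy version the old vectors are discarded when the collection is recreated, so anything stored under the previous model would have to be embedded again.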
When switching between embedding models with different dimensions, follow these steps:
`data/chroma_db` directory

The system now automatically handles dimension mismatches when switching between embedding models. You'll see log messages like:
Warning: Existing 'documents' collection has dimension X, but current model requires Y
Automatically reinitializing ChromaDB collections with the new dimension...
Successfully reinitialized ChromaDB collections with the new dimension
This indicates that the system has detected and resolved a dimension mismatch automatically.

If you still encounter issues after the automatic handling:

`data/chroma_db` directory

The dimension mismatch handling is implemented in:
- `lpm_kernel/file_data/chroma_utils.py`: Contains utilities for detecting model dimensions and reinitializing collections
- `lpm_kernel/file_data/embedding_service.py`: Handles dimension checking during initialization
- `docker/app/init_chroma.py`: Performs dimension validation during initial setup

The system maintains a mapping of known embedding models to their dimensions and defaults to 1536 (the dimension of OpenAI's text-embedding-ada-002 and text-embedding-3-small) for unknown models.
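A lookup of this kind might look like the sketch below. The names `KNOWN_DIMENSIONS` and `get_embedding_dimension` are hypothetical, the tag-stripping step is an assumption, and the values are copied from the table in this document rather than from the codebase.

```python
# Dimensions copied from the table in this document; keys are illustrative.
KNOWN_DIMENSIONS = {
    "text-embedding-ada-002": 1536,
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "snowflake-arctic-embed": 768,
    "nomic-embed-text": 768,
    "mxbai-embed-large": 1024,
}
DEFAULT_DIMENSION = 1536


def get_embedding_dimension(model_name: str) -> int:
    """Hypothetical helper: look up a model's dimension, defaulting to 1536."""
    # Assumption: drop an Ollama-style tag such as ":latest" before the lookup.
    base_name = model_name.strip().split(":", 1)[0]
    return KNOWN_DIMENSIONS.get(base_name, DEFAULT_DIMENSION)


print(get_embedding_dimension("nomic-embed-text"))
print(get_embedding_dimension("some-unknown-model"))
```

One consequence of a fallback like this: a model that is not in the mapping is treated as 1536-dimensional even if its real output size differs, which is worth checking first when a mismatch persists with a custom model.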