packages/kilo-docs/pages/customize/context/codebase-indexing.md
Codebase Indexing enables semantic code search across your entire project using AI embeddings. Instead of searching for exact text matches, it understands the meaning of your queries, helping Kilo Code find relevant code even when you don't know specific function names or file locations.
{% callout type="info" title="Opt-in indexing" %} Codebase Indexing is disabled by default. It starts only after you enable indexing globally or for an individual project. Configuring an embedding provider without enabling one of those toggles does not start indexing. {% /callout %}
When enabled, the indexing system:
- Provides the `semantic_search` tool to Kilo Code for intelligent code discovery

This enables natural language queries like "user authentication logic" or "database connection handling" to find relevant code across your entire project.
{% tabs %} {% tab label="VSCode" %}
- Choose a vector store (Qdrant or LanceDB) and configure it.

You can also edit the `indexing` section in `kilo.jsonc` directly:
```jsonc
{
  "indexing": {
    "enabled": true,
    "provider": "openai",
    "model": "text-embedding-3-small",
    "vectorStore": "lancedb",
    "openai": { "apiKey": "sk-..." },
    "lancedb": {}
  }
}
```
| Provider | How to use | Notes |
|---|---|---|
| OpenAI | API key | Default model: text-embedding-3-small. Use text-embedding-3-large for higher accuracy. |
| Ollama | Local base URL | No API costs. Runs fully offline. |
| OpenAI-Compatible | Base URL + API key | For self-hosted or third-party OpenAI-compatible endpoints. |
| Gemini | Google AI API key | Supports gemini-embedding-001 and other Gemini embedding models. |
| Mistral | API key from La Plateforme | Use a standard Mistral API key. The Codestral-specific keys from the Mistral autocomplete setup guide are not interchangeable — those only work for completion. |
| Vercel AI Gateway | API key | Routes requests through Vercel AI Gateway. |
| AWS Bedrock | AWS region + profile | Uses the AWS SDK credential chain. |
| OpenRouter | API key (optional specific provider) | Routes through OpenRouter. |
| Voyage | API key | Voyage voyage-code-3 is tuned for code. |
{% callout type="tip" %} For a fully local, zero-cost setup, combine Ollama (embeddings) with LanceDB (vector store — no separate server needed). {% /callout %}
The prompt input panel shows a compact indexing status indicator that reflects the current state (Standby / In Progress / Complete / Error) along with progress when scanning or embedding.
{% /tab %} {% tab label="CLI" %}
The /indexing command (and aliases /index, /embedding) is available when the indexing plugin is installed. Indexing remains disabled until it is enabled globally or for the current project.
Open a Kilo TUI session and run:
/indexing
(aliases: /index, /embedding)
This opens an interactive configuration dialog where you can:
- Choose a vector store (Qdrant or LanceDB) and configure its connection

All changes are written to your `kilo.jsonc` config and take effect immediately.
You can also edit the indexing section directly. This is the full shape of the section:
```jsonc
{
  "indexing": {
    "enabled": true,
    "provider": "voyage",
    "model": "voyage-code-3",
    "dimension": 1024,
    "vectorStore": "qdrant",
    "voyage": {
      "apiKey": "pa-..."
    },
    "qdrant": {
      "url": "http://localhost:6333",
      "apiKey": ""
    },
    "searchMinScore": 0.4,
    "searchMaxResults": 50,
    "embeddingBatchSize": 60,
    "scannerMaxBatchRetries": 3
  }
}
```
| Provider | Config key | Settings | Notes |
|---|---|---|---|
| OpenAI | openai | { apiKey } | Default: text-embedding-3-small. |
| Ollama | ollama | { baseUrl } | No API costs. Runs fully offline. |
| OpenAI-Compatible | openai-compatible | { baseUrl, apiKey } | For self-hosted or third-party endpoints. |
| Gemini | gemini | { apiKey } | Supports gemini-embedding-001. |
| Mistral | mistral | { apiKey } | Use a La Plateforme key — the Codestral-specific keys from the autocomplete setup guide don't work for embeddings. |
| Vercel AI Gateway | vercel-ai-gateway | { apiKey } | Routes through Vercel AI Gateway. |
| AWS Bedrock | bedrock | { region, profile } | Uses AWS SDK credential chain. |
| OpenRouter | openrouter | { apiKey, specificProvider? } | Routes through OpenRouter. |
| Voyage | voyage | { apiKey } | voyage-code-3 is tuned for code. |
- `qdrant`: `{ url?, apiKey? }` (default). See Setting Up Qdrant.
- `lancedb`: `{ directory? }`. Embedded and file-based, with no server to run. Uses a default Kilo data directory when omitted.

{% callout type="tip" %} For a fully local, zero-cost setup, combine Ollama (embeddings) with LanceDB (vector store, no separate server needed). {% /callout %}
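As a rough sketch of the fully local setup described in the tip, the fragment below combines the `ollama` provider with the `lancedb` vector store. The base URL assumes Ollama is running on its default local port, and `nomic-embed-text` is only one possible model choice. This page does not say whether `dimension` must be set for Ollama models, so it is left out here.

```jsonc
{
  "indexing": {
    "enabled": true,
    "provider": "ollama",
    // Assumption: this model has already been pulled into your local Ollama.
    "model": "nomic-embed-text",
    "vectorStore": "lancedb",
    // Assumption: Ollama is listening on its default local address.
    "ollama": { "baseUrl": "http://localhost:11434" },
    // Empty object: no "directory" given, so the default Kilo data directory is used.
    "lancedb": {}
  }
}
```

With this shape, no API key is stored in the config and no separate vector database server needs to be running.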
When indexing is enabled, the CLI shows an indexing status badge at the bottom of the TUI in the form IDX <state> (for example IDX In Progress 40% 120/300, IDX Complete, IDX Standby, or IDX Error <message>).
{% /tab %} {% tab label="VSCode (Legacy)" %}
The legacy extension uses its own Codebase Indexing settings panel.
{% image src="/docs/img/codebase-indexing/codebase-indexing.png" alt="Codebase Indexing Settings" width="800" caption="Codebase Indexing Settings (legacy)" /%}
The legacy extension supports a smaller set of providers:
| Provider | How to use | Notes |
|---|---|---|
| OpenAI | API key | Default: text-embedding-3-small. |
| Gemini | Google AI API key | Supports Gemini embedding models including gemini-embedding-001. |
| Ollama (local) | Local base URL | No API costs. |
The legacy extension only supports Qdrant. See Setting Up Qdrant.
{% /tab %} {% /tabs %}
If you choose Qdrant as your vector store, you need a running Qdrant server.
Using Docker:
```bash
docker run -p 6333:6333 qdrant/qdrant
```
Using Docker Compose:
```yaml
version: "3.8"
services:
  qdrant:
    image: qdrant/qdrant
    ports:
      - "6333:6333"
    volumes:
      - qdrant_storage:/qdrant/storage
volumes:
  qdrant_storage:
```
For team or production use:
The interface shows real-time status:
- Indexing progress (processed/total count)

The indexer automatically excludes:

- `.git` folders
- `node_modules`, `vendor`, and similar directories
- Files matching `.gitignore` and `.kilocodeignore` patterns

These advanced settings live under the `indexing` key and are exposed in the CLI's `/indexing` → Tuning Parameters menu and the VS Code extension's Indexing settings:
| Setting | Default | Description |
|---|---|---|
| searchMinScore | 0.4 | Minimum cosine similarity (0-1) for a result to be returned. |
| searchMaxResults | 50 | Maximum number of results returned per search. |
| embeddingBatchSize | 60 | Number of code segments per embedding batch. Lower this if your embedding endpoint has strict rate limits. |
| scannerMaxBatchRetries | 3 | Maximum retry attempts for a failed embedding batch. |
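To illustrate how the search settings in the table might be adjusted together, the fragment below sketches a stricter search profile. The specific values are illustrative assumptions, not recommendations. The other `indexing` keys (`enabled`, `provider`, and so on) are omitted for brevity and would sit alongside these in the same section.

```jsonc
{
  "indexing": {
    // Illustrative values only; the defaults are 0.4 and 50.
    "searchMinScore": 0.55,   // require closer semantic matches
    "searchMaxResults": 20    // return fewer results per search
  }
}
```

Raising the score threshold and lowering the result cap both narrow what a search returns, so it may be worth changing one at a time to see which has the effect you want.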
OpenAI:

- `text-embedding-3-small`: Best balance of performance and cost
- `text-embedding-3-large`: Higher accuracy, 5x more expensive
- `text-embedding-ada-002`: Legacy model, lower cost

Ollama:

- `mxbai-embed-large`: The largest and highest-quality embedding model
- `nomic-embed-text`: Best balance of performance and embedding quality
- `all-minilm`: Compact model with lower quality but faster performance

Voyage:

- `voyage-code-3`: Code-tuned embeddings; strong default for source-heavy repos

API keys are stored in your `kilo.jsonc` config. Treat that file as a secret in shared environments.

If your local embedding server is based on llama.cpp (including Ollama), indexing can fail with errors about `n_ubatch` or `GGML_ASSERT`. Ensure both batch size (`-b`) and micro-batch size (`-ub`) are set to the same value for embedding models, then restart the server. For Ollama, configure `num_batch` in your Modelfile or request options to match the same effective value.
- Confirm `indexing.enabled` is `true` in your `kilo.jsonc`

Lower `embeddingBatchSize` under `indexing` (default 60). Smaller batches send fewer segments per request and are less likely to hit per-request or per-minute rate limits.
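A minimal sketch of that change is shown below, assuming the rest of your `indexing` section stays as it is. The value 20 is an arbitrary illustration; the right number depends on your provider's limits.

```jsonc
{
  "indexing": {
    // Illustrative value only; the default is 60.
    "embeddingBatchSize": 20
  }
}
```

If batches still fail intermittently, `scannerMaxBatchRetries` (default 3) controls how many times a failed batch is retried.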
Once indexed, Kilo Code can use the semantic_search tool to find relevant code:
Example Queries:
The tool provides Kilo Code with:
- A bounded set of results (up to `searchMaxResults`)

Tune result volume and quality via:

- `searchMaxResults`: default 50. Lower for faster, more focused responses; higher for more context.
- `searchMinScore`: default 0.4. Raise to require closer matches; lower to include more tangentially related code.

Related: `kilo.jsonc` configuration and `.kilocodeignore`.