doc/user/gitlab_duo/semantic_code_search.md
Semantic code search uses AI to find relevant code snippets in your repository based on meaning rather than keyword matching.
Semantic code search converts your codebase into vector embeddings and stores these embeddings in a vector database. Your search query is also converted into an embedding and then compared against your code embeddings to find the most semantically similar results. This approach finds relevant code even when keywords do not match.
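The comparison step described above can be illustrated with a toy sketch. It assumes cosine similarity as the distance measure, and the three-dimensional vectors and snippet strings are invented stand-ins for real embeddings. It is not how GitLab computes or stores embeddings.

```ruby
# Toy illustration of embedding-based ranking. The three-dimensional vectors
# are made-up stand-ins for real embeddings, which a model would produce.

# Cosine similarity between two equal-length numeric vectors.
def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  dot / (norm_a * norm_b)
end

# Hypothetical code snippets with hand-picked embeddings.
CODE_EMBEDDINGS = {
  "def authenticate(user, password)" => [0.9, 0.1, 0.0],
  "def render_invoice_pdf(invoice)" => [0.0, 0.2, 0.9],
  "def verify_login_token(token)" => [0.8, 0.3, 0.1]
}.freeze

# Rank snippets by similarity to the query embedding, most similar first.
def semantic_search(query_embedding, limit: 2)
  CODE_EMBEDDINGS
    .map { |snippet, embedding| [snippet, cosine_similarity(query_embedding, embedding)] }
    .sort_by { |_, score| -score }
    .first(limit)
end

# Pretend this vector is the embedding of the query "how do users sign in?".
results = semantic_search([0.9, 0.15, 0.0])
results.each { |snippet, score| puts format("%.3f  %s", score, snippet) }
```

The query shares no keyword with `authenticate` or `verify_login_token`, yet both rank above the invoice snippet because their vectors point in a similar direction.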
Improvements to this feature are proposed in epic 18018 and epic 20110.
`pgvector` extension.

If your GitLab instance uses Elasticsearch or OpenSearch for advanced search, you can enable semantic code search by connecting to the same cluster:
To create a custom vector store connection for Elasticsearch, OpenSearch, or PostgreSQL,
in the Rails console, create a connection with the adapter class and options for your vector store.
> [!note]
> You should use Elasticsearch or OpenSearch for medium to large repositories. Use PostgreSQL with
> `pgvector` only for setups with a few small repositories. Indexing and querying performance might be limited with `pgvector`.
```ruby
connection = Ai::ActiveContext::Connection.create!(
  name: "elasticsearch",
  options: options,
  adapter_class: "ActiveContext::Databases::Elasticsearch::Adapter"
)
connection.activate!
```
Connection options:
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | array of strings | Yes | None | Array of URLs for your Elasticsearch cluster (for example, `["http://localhost:9200"]`). |
| `client_adapter` | string | No | `typhoeus` | HTTP adapter to use. Possible values are `typhoeus` and `net_http`. |
| `client_request_timeout` | integer | No | `30` | Request timeout in seconds. |
| `retry_on_failure` | integer | No | `0` | Number of retries on failure. |
| `debug` | boolean | No | `false` | Enables debug logging. |
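As a rough sketch, the `options` value passed to `create!` could be built as follows. This assumes `options` is a plain Ruby hash keyed by the option names in the table. All values are illustrative, and the symbol keys are an assumption rather than a confirmed requirement.

```ruby
# Illustrative values only; replace them with the details of your own cluster.
# Assumption: options are passed as a Ruby hash keyed by the option names
# from the table (symbol keys are a guess, not confirmed by the docs).
options = {
  url: ["http://localhost:9200"], # required
  client_adapter: "net_http",     # default is "typhoeus"
  client_request_timeout: 60,     # seconds; default is 30
  retry_on_failure: 3,            # default is 0
  debug: false                    # default is false
}

# `url` is the only required option, so fail early if it is missing or empty.
unless options[:url].is_a?(Array) && options[:url].any?
  raise ArgumentError, "url must be a non-empty array of strings"
end
```

Only `url` is required, so a hash with that single key would also match the table.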
```ruby
connection = Ai::ActiveContext::Connection.create!(
  name: "opensearch",
  options: options,
  adapter_class: "ActiveContext::Databases::Opensearch::Adapter"
)
connection.activate!
```
Connection options:
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| `url` | array of strings | Yes | None | Array of URLs for your OpenSearch cluster (for example, `["http://localhost:9200"]`). |
| `client_adapter` | string | No | `typhoeus` | HTTP adapter to use. Possible values are `typhoeus` and `net_http`. |
| `client_request_timeout` | integer | No | `30` | Request timeout in seconds. |
| `retry_on_failure` | integer | No | `0` | Number of retries on failure. |
| `debug` | boolean | No | `false` | Enables debug logging. |
| `aws` | boolean | No | `false` | Enables AWS Signature Version 4 signing. |
| `aws_region` | string | No | None | AWS region for your OpenSearch domain. |
| `aws_access_key` | string | No | None | AWS access key ID. |
| `aws_secret_access_key` | string | No | None | AWS secret access key. |
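For an AWS-hosted domain, the AWS options might be combined as in the sketch below. The endpoint, region, and environment variable names are hypothetical, and the hash shape (symbol keys) is an assumption.

```ruby
# Hypothetical endpoint and region; substitute your own OpenSearch domain.
# Reading credentials from environment variables is one way to avoid typing
# secrets into the console. The variable names here are an assumption.
options = {
  url: ["https://opensearch.example.com:9200"],
  aws: true, # turn on AWS Signature Version 4 signing
  aws_region: "us-east-1",
  aws_access_key: ENV.fetch("AWS_ACCESS_KEY_ID", nil),
  aws_secret_access_key: ENV.fetch("AWS_SECRET_ACCESS_KEY", nil)
}.compact # drop optional keys whose value is nil
```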
`pgvector`
For PostgreSQL, use the `pgvector` extension:

1. In the PostgreSQL database, create the extension:

   ```sql
   CREATE EXTENSION vector;
   ```

2. In the Rails console, create the connection:

   ```ruby
   connection = Ai::ActiveContext::Connection.create!(
     name: "postgres",
     options: options,
     adapter_class: "ActiveContext::Databases::Postgresql::Adapter"
   )
   connection.activate!
   ```
Connection options:
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| `host` | string | Yes | None | PostgreSQL host. |
| `port` | integer | No | None | PostgreSQL port. |
| `database` | string | No | None | Database name. |
| `user` | string | No | None | PostgreSQL user. |
| `password` | string | No | None | PostgreSQL password. |
| `connect_timeout` | integer | No | `5` | Connection timeout in seconds. |
| `pool_size` | integer | No | `5` | Connection pool size. |
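A PostgreSQL `options` hash following the table might look like this sketch. The database name, user, and environment variable are hypothetical, and the symbol keys are an assumption.

```ruby
# Illustrative values only. The database name, user, and environment
# variable name are hypothetical placeholders.
options = {
  host: "localhost",           # required
  port: 5432,                  # PostgreSQL's conventional port
  database: "gitlab_vectors",  # hypothetical database name
  user: "gitlab",              # hypothetical user
  password: ENV.fetch("PGVECTOR_PASSWORD", nil),
  connect_timeout: 5,          # seconds; matches the documented default
  pool_size: 5                 # matches the documented default
}.compact # drop the password key if the variable is unset
```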
Semantic code search is available as a GitLab MCP server tool.
For more information about how to use this tool, see
`semantic_code_search`.
When you first use semantic code search in a GitLab project:
- Initial indexing might take a while, depending on your repository size.