Advanced search development guidelines

This page includes information about developing and working with Advanced search, which is powered by Elasticsearch.

Information on how to enable Advanced search and perform the initial indexing is in the Elasticsearch integration documentation.

Deep dive resources

These recordings and presentations provide in-depth knowledge about the Advanced search implementation:

Date	Topic	Presenter	Resources	GitLab Version
July 2024	Advanced search basics, integration, indexing, and search	Terri Chu	<i class="fa-youtube-play" aria-hidden="true"></i>Recording on YouTube (GitLab team members only)
Google slides (GitLab team members only)	GitLab 17.0
June 2021	GitLabs data migration process for Advanced search	Dmitry Gruzd	Blog post	GitLab 13.12
August 2020	GitLab-specific architecture for multi-indices support	Mark Chao	Recording on YouTube
Google slides	GitLab 13.3
June 2019	GitLab Elasticsearch integration	Mario de la Ossa	<i class="fa-youtube-play" aria-hidden="true"></i>Recording on YouTube
Google slides
PDF	GitLab 12.0

Elasticsearch configuration

Supported versions

See Version Requirements.

Developers making significant changes to Elasticsearch queries should test their features against all our supported versions.

Setting up your development environment

See the Elasticsearch GDK setup instructions
Ensure Elasticsearch is running:
shell
```
curl "http://localhost:9200"
```

Run Kibana to interact with your local Elasticsearch cluster. Alternatively, you can use Cerebro or a similar tool.

To tail the logs for Elasticsearch, run this command:
shell
```
tail -f log/elasticsearch.log
```
If you run in SaaS mode, you should limit the amount of namespace and project data to index to mimic how advanced search is configured on GitLab.com.

If namespace limiting is not enabled, advanced search is enabled by default for all namespaces (including free namespaces).

Helpful Rake tasks

gitlab:elastic:test:index_size: Tells you how much space the current index is using, as well as how many documents are in the index.
gitlab:elastic:test:index_size_change: Outputs index size, reindexes, and outputs index size again. Useful when testing improvements to indexing size.

Additionally, if you need large repositories or multiple forks for testing, consider following these instructions

Development workflow

Development tips

Debugging & troubleshooting

Debugging Elasticsearch queries

The ELASTIC_CLIENT_DEBUG environment variable enables the debug option for the Elasticsearch client in development or test environments. If you need to debug Elasticsearch HTTP queries generated from code or tests, it can be enabled before running specs or starting the Rails console:

console

ELASTIC_CLIENT_DEBUG=1 bundle exec rspec ee/spec/workers/search/elastic/trigger_indexing_worker_spec.rb

export ELASTIC_CLIENT_DEBUG=1
rails console

Getting `flood stage disk watermark [95%] exceeded`

You might get an error such as

plaintext

[2018-10-31T15:54:19,762][WARN ][o.e.c.r.a.DiskThresholdMonitor] [pval5Ct]
   flood stage disk watermark [95%] exceeded on
   [pval5Ct7SieH90t5MykM5w][pval5Ct][/usr/local/var/lib/elasticsearch/nodes/0] free: 56.2gb[3%],
   all indices on this node will be marked read-only

This is because you've exceeded the disk space threshold - it thinks you don't have enough disk space left, based on the default 95% threshold.

In addition, the read_only_allow_delete setting will be set to true. It will block indexing, forcemerge, etc

shell

curl "http://localhost:9200/gitlab-development/_settings?pretty"

Add this to your elasticsearch.yml file:

yaml

# turn off the disk allocator
cluster.routing.allocation.disk.threshold_enabled: false

yaml

# set your own limits
cluster.routing.allocation.disk.threshold_enabled: true
cluster.routing.allocation.disk.watermark.flood_stage: 5gb   # ES 6.x only
cluster.routing.allocation.disk.watermark.low: 15gb
cluster.routing.allocation.disk.watermark.high: 10gb

Restart Elasticsearch, and the read_only_allow_delete will clear on its own.

from "Disk-based Shard Allocation | Elasticsearch Reference" 5.6 and 6.x

Performance monitoring

Prometheus

GitLab exports Prometheus metrics relating to the number of requests and timing for all web/API requests and Sidekiq jobs, which can help diagnose performance trends and compare how Elasticsearch timing is impacting overall performance relative to the time spent doing other things.

Indexing queues

GitLab also exports Prometheus metrics for indexing queues, which can help diagnose performance bottlenecks and determine whether your GitLab instance or Elasticsearch server can keep up with the volume of updates.

Logs

All indexing happens in Sidekiq, so much of the relevant logs for the Elasticsearch integration can be found in sidekiq.log. In particular, all Sidekiq workers that make requests to Elasticsearch in any way will log the number of requests and time taken querying/writing to Elasticsearch. This can be useful to understand whether or not your cluster is keeping up with indexing.

Searching Elasticsearch is done via ordinary web workers handling requests. Any requests to load a page or make an API request, which then make requests to Elasticsearch, will log the number of requests and the time taken to production_json.log. These logs will also include the time spent on Database and Gitaly requests, which may help to diagnose which part of the search is performing poorly.

There are additional logs specific to Elasticsearch that are sent to elasticsearch.log that may contain information to help diagnose performance issues.

Performance Bar

Elasticsearch requests will be displayed in the Performance Bar, which can be used both locally in development and on any deployed GitLab instance to diagnose poor search performance. This will show the exact queries being made, which is useful to diagnose why a search might be slow.

Correlation ID and `X-Opaque-Id`

Our correlation ID is forwarded by all requests from Rails to Elasticsearch as the X-Opaque-Id header which allows us to track any tasks in the cluster back the request in GitLab.

Architecture

The framework used to communicate to Elasticsearch is in the process of a refactor tracked in this epic.

Indexing Overview

Advanced search selectively indexes data. Each data type follows a specific indexing pipeline:

Data type	How is it queued	Where is it queued	Where does indexing occur
Database records	Record changes through ActiveRecord callbacks and `Gitlab::EventStore`	Redis ZSET	`ElasticIndexInitialBulkCronWorker`, `ElasticIndexBulkCronWorker`
Git repository data	Branch push service and default branch change worker	Sidekiq	`Search::Elastic::CommitIndexerWorker`, `ElasticWikiIndexerWorker`

Indexing Components

External Indexer

For repository content, GitLab uses a dedicated indexer written in Go to efficiently process files.

Rails Indexing Lifecycle

Initial Indexing: Administrators trigger the first complete index via the Admin UI or a Rake task
Ongoing Updates: After initial setup, GitLab maintains index currency through:
- Model callbacks (after_create, after_update, after_destroy) defined in /ee/app/models/concerns/elastic/application_versioned_search.rb
- A Redis ZSET that tracks all pending changes
- Scheduled Sidekiq workers that process these queues in batches using Elasticsearch's Bulk Request API

Search and Security

The query builder framework generates search queries and handles access control logic. This portion of the codebase requires particular attention during development and code review, as it has historically been a source of security vulnerabilities.

The final step in returning search results is to redact unauthorized results for the current user to catch problems with the queries or race conditions.

Migration framework

GitLabs Advanced search includes a robust migration framework that streamlines index maintenance and updates. This system provides significant benefits:

Selective Reindexing: Only updates specific document types when needed, avoiding full re-indexes
Automated Maintenance: Updates proceed without requiring human intervention
Consistent Experience: Provides the same migration path for both GitLab.com and GitLab Self-Managed instances

Framework Components

The migration system consists of:

Migration Runner: A cron worker that executes every 5 minutes to check for and process pending migrations.
Migration Files: Similar to database migrations, these Ruby files define the migration steps with accompanying YAML documentation
Migration Status Tracking: All migration states are stored in a dedicated Elasticsearch index
Migration Lifecycle States: Each migration progresses through stages: pending → in progress → complete (or halted if issues arise)

Configuration Options

Migrations can be fine-tuned with various parameters:

Batching: Control the document batch size for optimal performance
Throttling: Adjust indexing speed to balance between migration speed and system load
Space Requirements: Verify sufficient disk space before migrations begin to prevent interruptions
Skip condition: Define a condition for skipping the migration

This framework makes index schema changes, field updates, and data migrations reliable and unobtrusive for all GitLab installations.

Search DSL

This section covers the Search DSL (Domain Specific Language) supported by GitLab, which is compatible with both Elasticsearch and OpenSearch implementations.

Custom routing

Custom routing is used in Elasticsearch for document types. The routing format is usually project_<project_id> for project associated data and group_<root_namespace_id> for group associated data. Routing is set during indexing and searching operations and tells Elasticsearch what shards to put the data into. Some of the benefits and tradeoffs to using custom routing are:

Project and group scoped searches are much faster since not all shards have to be hit.
Routing is not used if too many shards would be hit for global and group scoped searches.
Shard size imbalance might occur.

Existing analyzers and tokenizers

The following analyzers and tokenizers are defined in ee/lib/elastic/latest/config.rb.

Analyzers

`path_analyzer`

Used when indexing blobs' paths. Uses the path_tokenizer and the lowercase and asciifolding filters.

See the path_tokenizer explanation below for an example.

`sha_analyzer`

Used in blobs and commits. Uses the sha_tokenizer and the lowercase and asciifolding filters.

See the sha_tokenizer explanation later below for an example.

`code_analyzer`

Used when indexing a blob's filename and content. Uses the whitespace tokenizer and the word_delimiter_graph, lowercase, and asciifolding filters.

The whitespace tokenizer was selected to have more control over how tokens are split. For example the string Foo::bar(4) needs to generate tokens like Foo and bar(4) to be properly searched.

See the code filter for an explanation on how tokens are split.

Tokenizers

`sha_tokenizer`

This is a custom tokenizer that uses the edgeNGram tokenizer to allow SHAs to be searchable by any sub-set of it (minimum of 5 chars).

Example:

240c29dc7e becomes:

240c2
240c29
240c29d
240c29dc
240c29dc7
240c29dc7e

`path_tokenizer`

This is a custom tokenizer that uses the path_hierarchy tokenizer with reverse: true to allow searches to find paths no matter how much or how little of the path is given as input.

Example:

'/some/path/application.js' becomes:

'/some/path/application.js'
'some/path/application.js'
'path/application.js'
'application.js'

Common gotchas

Searches can have their own analyzers. Remember to check when editing analyzers.
Character filters (as opposed to token filters) always replace the original character. These filters can hinder exact searches.

Implementation guide

Add a new document type to Elasticsearch

If data cannot be added to one of the existing indices in Elasticsearch, follow these instructions to set up a new index and populate it.

Create the index

All new indexes must have:

project_id and namespace_id fields (if available). One of the fields must be used for custom routing.
A traversal_ids field for efficient global and group search. Populate the field with object.namespace.elastic_namespace_ancestry
Fields for authorization:
- For project data - visibility_level
- For group data - namespace_visibility_level
- Any required access level fields. These correspond to project feature access levels such as issues_access_level or repository_access_level
A schema_version integer field in a YYVV (year/version) format. YY is the two-digit year, VV is a rolling counter (01-99) within that year. The schema version must be defined in a constant (SCHEMA_VERSION) in the reference index class (Search::Elastic::References::<IndexedData> or Elastic::Latest::<IndexedData>InstanceProxy). This field is used to track which version of the document structure is indexed and enables data migrations. It must be incremented when the index mapping changes and may be incremented when field content changes.

Create a Search::Elastic::Types:: class in ee/lib/search/elastic/types/.
Define the following class methods:
- index_name: in the format gitlab-<env>-<type> (for example, gitlab-production-work_items).
- mappings: a hash containing the index schema such as fields, data types, and analyzers.
- settings: a hash containing the index settings such as replicas and tokenizers. The default is good enough for most cases.
Add a new advanced search migration to create the index by executing scripts/elastic-migration and following the instructions. The migration name must be in the format Create<Name>Index.
Use the Search::Elastic::MigrationCreateIndexHelper helper and the 'migration creates a new index' shared example for the specification file created.
Add the target class to Gitlab::Elastic::Helper::ES_SEPARATE_CLASSES.
To test the index creation, run Elastic::MigrationWorker.new.perform in a console and check that the index has been created with the correct mappings and settings:
shell
```
curl "http://localhost:9200/gitlab-development-<type>/_mappings" | jq .`
```
shell
```
curl "http://localhost:9200/gitlab-development-<type>/_settings" | jq .`
```

PostgreSQL to Elasticsearch mappings

Data types for primary and foreign keys must match the column type in the database. For example, the database column type integer maps to integer and bigint maps to long in the mapping.

[!warning] Nested fields introduce significant overhead. A flattened multi-value approach is recommended instead.

PostgreSQL type	Elasticsearch mapping
bigint	long
smallint	short
integer	integer
boolean	boolean
array	keyword
timestamp	date
character varying, text	Depends on query requirements. Use `text` for full-text search and `keyword` for term queries, sorting, or aggregations

Validate expected queries

Before creating a new index, it's crucial to validate that the planned mappings will support your expected queries. Verifying mapping compatibility upfront helps avoid issues that would require index rebuilding later.

Create a new Elastic Reference

Create a Search::Elastic::References:: class in ee/lib/search/elastic/references/.

The reference is used to perform bulk operations in Elasticsearch. The file must inherit from Search::Elastic::Reference and define the following constant and methods:

ruby

include Search::Elastic::Concerns::DatabaseReference # if there is a corresponding database record for every document

SCHEMA_VERSION = 24_46 # integer in YYVV format

override :serialize
def self.serialize(record)
   # a string representation of the reference
end

override :instantiate
def self.instantiate(string)
   # deserialize the string and call initialize
end

override :preload_indexing_data
def self.preload_indexing_data(refs)
   # remove this method if `Search::Elastic::Concerns::DatabaseReference` is included
   # otherwise return refs
end

def initialize
   # initialize with instance variables
end

override :identifier
def identifier
   # a way to identify the reference
end

override :routing
def routing
   # Optional: an identifier to route the document in Elasticsearch
end

override :operation
def operation
   # one of `:index`, `:upsert` or `:delete`
end

override :serialize
def serialize
   # a string representation of the reference
end

override :as_indexed_json
def as_indexed_json
   # a hash containing the document representation for this reference
end

override :index_name
def index_name
   # index name
end

def model_klass
   # set to the model class if `Search::Elastic::Concerns::DatabaseReference` is included
end

To add data to the index, an instance of the new reference class is called in Elastic::ProcessBookkeepingService.track!() to add the data to a queue of references for indexing. A cron worker pulls queued references and bulk-indexes the items into Elasticsearch.

To test that the indexing operation works, call Elastic::ProcessBookkeepingService.track!() with an instance of the reference class and run Elastic::ProcessBookkeepingService.new.execute. The logs show the updates. To check the document in the index, run this command:

shell

curl "http://localhost:9200/gitlab-development-<type>/_search"

Common gotchas

Index operations actually perform an upsert. If the document exists, it performs a partial update by merging fields sent with the existing document fields. If you want to explicitly remove fields or set them to empty, the as_indexed_json must send nil or an empty array.

Data consistency

Now that we have an index and a way to bulk index the new document type into Elasticsearch, we need to add data into the index. This consists of doing a backfill and doing continuous updates to ensure the index data is up to date.

The backfill is done by calling Elastic::ProcessInitialBookkeepingService.track!() with an instance of Search::Elastic::Reference for every document that should be indexed.

The continuous update is done by calling Elastic::ProcessBookkeepingService.track!() with an instance of Search::Elastic::Reference for every document that should be created/updated/deleted.

Backfilling data

Add a new advanced search migration to backfill data by executing scripts/elastic-migration and following the instructions.

Use the MigrationDatabaseBackfillHelper. The BackfillWorkItems migration can be used as an example.

To test the backfill, run Elastic::MigrationWorker.new.perform in a console a couple of times and see that the index was populated.

Tail the logs to see the progress of the migration:

shell

tail -f log/elasticsearch.log

Continuous updates

For ActiveRecord objects, the ApplicationVersionedSearch concern can be included on the model to index data based on callbacks. If that's not suitable, call Elastic::ProcessBookkeepingService.track!() with an instance of Search::Elastic::Reference whenever a document should be indexed.

Always check for Gitlab::CurrentSettings.elasticsearch_indexing? and use_elasticsearch? because some GitLab Self-Managed instances do not have Elasticsearch enabled and namespace limiting can be enabled.

Also check that the index is able to handle the index request. For example, check that the index exists if it was added in the current major release by verifying that the migration to add the index was completed: Elastic::DataMigrationService.migration_has_finished?.

Transfers and deletes

Project and group transfers and deletes must make updates to the index to avoid orphaned data. Orphaned data may occur when custom routing changes due to a transfer. Data in the old shard must be cleaned up. Elasticsearch updates for transfers are handled in the Projects::TransferService and Groups::TransferService.

Indexes that contain a project_id field must use the Search::Elastic::DeleteWorker. Indexes that contain a namespace_id field and no project_id field must use Search::ElasticGroupAssociationDeletionWorker.

Add the indexed class to excluded_classes in ElasticDeleteProjectWorker
Create a new service in the ::Search::Elastic::Delete namespace to delete documents from the index
Update the worker to use the new service

Implementing search for a new document type

Search data is available in SearchController and Search API. Both use the SearchService to return results. The SearchService can be used to return results outside the SearchController and Search API.

Recommended process for implementing search for a new document type

Create the following MRs and have them reviewed by a member of the Global Search team:

Enable the new scope.
Create a query builder.
Implement all model requirements.
Add the new scope to Gitlab::Elastic::SearchResults behind a feature flag.
Add support for the scope in Search::API (if applicable)
Add specs which must include permissions tests
Test the new scope
Update documentation for Advanced search, Search API and, Roles and permissions (if applicable)

Search scopes

The SearchService exposes searching at global, group, and project levels.

New scopes must be added to the following constants:

ALLOWED_SCOPES (or override allowed_scopes method) in each EE SearchService file
ALLOWED_SCOPES in Gitlab::Search::AbuseDetection
search_tab_ability_map method in Search::Navigation. Override in the EE version if needed

[!note] Global search can be disabled for a scope. You can do the following changes for disabling global search:

Add an application setting named global_search_SCOPE_enabled that defaults to true under the search jsonb accessor in app/models/application_setting.rb.
Add an entry in JSON schema validator file application_setting_search.json
Add the setting checkbox in the Admin UI by creating an entry in global_search_settings_checkboxes method in ApplicationSettingsHelper.
Add it to the global_search_enabled_for_scope? method in SearchService.
Remember that EE-only settings should be added in the EE versions of the files

Results classes

The search results class available are:

Search type	Search level	Class
Basic search	global	`Gitlab::SearchResults`
Basic search	group	`Gitlab::GroupSearchResults`
Basic search	project	`Gitlab::ProjectSearchResults`
Advanced search	global	`Gitlab::Elastic::SearchResults`
Advanced search	group	`Gitlab::Elastic::GroupSearchResults`
Advanced search	project	`Gitlab::Elastic::ProjectSearchResults`
Exact code search	global	`Search::Zoekt::SearchResults`
Exact code search	group	`Search::Zoekt::SearchResults`
Exact code search	project	`Search::Zoekt::SearchResults`
All search types	All levels	`Search::EmptySearchResults`

The result class returns the following data:

objects - paginated from Elasticsearch transformed into database records or POROs
formatted_count - document count returned from Elasticsearch
highlight_map - map of highlighted fields from Elasticsearch
failed? - if a failure occurred
error - error message returned from Elasticsearch
aggregations - (optional) aggregations from Elasticsearch

New scopes must add support to these methods within Gitlab::Elastic::SearchResults class:

objects
formatted_count
highlight_map
failed?
error

Updating an existing scope

Updates may include adding and removing document fields or changes to authorization. To update an existing scope, find the code used to generate queries and JSON for indexing.

Queries are generated in QueryBuilder classes
Indexed documents are built in Reference classes

We also support a legacy Proxy framework:

Queries are generated in ClassProxy classes
Indexed documents are built in InstanceProxy classes

Always aim to create new search filters in the QueryBuilder framework, even if they are used in the legacy framework.

Adding a field

Add the field to the index

Add the field to the index mapping to add it newly created indices and create a migration to add the field to existing indices in the same MR to avoid mapping schema drift. Use the MigrationUpdateMappingsHelper
Populate the new field in the document JSON. The code must check the migration is complete using ::Elastic::DataMigrationService.migration_has_finished?
Bump the SCHEMA_VERSION for the document JSON. The format is year and version number: YYVV
Create a migration to backfill the field in the index. If it's a not-nullable field, use MigrationBackfillHelper, or MigrationReindexBasedOnSchemaVersion if it's a nullable field.

If the new field is an associated record

Update specs for Elastic::ProcessBookkeepingService create associated records
Update N+1 specs for preload_search_data to create associated data records
Review Updating dependent associations in the index

Expose the field to the search service

Add the filter to the Search::Filter concern. The concern is used in the Search::GlobalService, Search::GroupService and Search::ProjectService.
Pass the field for the scope by updating the scope_options method. The method is defined in Gitlab::Elastic::SearchResults with overrides in Gitlab::Elastic::GroupSearchResults and Gitlab::Elastic::ProjectSearchResults.
Use the field in the query builder by adding an existing filter or creating a new one.
Track the filter usage in searches in the SearchController

Changing mapping of an existing field

Update the field type in the index mapping to change it for newly created indices
Bump the SCHEMA_VERSION for the document JSON. The format is year and version number: YYVV
Create a migration to reindex all documents using Zero downtime reindexing. Use the Search::Elastic::MigrationReindexTaskHelper

Changing field content

Update the field content in the document JSON
Bump the SCHEMA_VERSION for the document JSON. The format is year and version number: YYVV
Create a migration to update documents. Use the MigrationReindexBasedOnSchemaVersion

Cleaning up documents from an index

This may be used if documents are split from one index into separate indices or to remove data left in the index due to bugs.

Bump the SCHEMA_VERSION for the document JSON. The format is year and version number: YYVV
Create a migration to index all records. Use the MigrationDatabaseBackfillHelper
Create a migration to remove all documents with the previous SCHEMA_VERSION. Use the MigrationDeleteBasedOnSchemaVersion

Removing a field

The removal must be split across multiple milestones to support multi-version compatibility. To avoid dynamic mapping errors, the field must be removed from all documents before a Zero downtime reindexing.

Milestone M:

Remove the field from the index mapping to remove it from newly created indices
Stop populating the field in the document JSON
Bump the SCHEMA_VERSION for the document JSON. The format is year and version number: YYVV
Remove any filters which use the field from the query builder
Update the scope_options method to remove the field for the scope you are updating. The method is defined in Gitlab::Elastic::SearchResults with overrides in Gitlab::Elastic::GroupSearchResults and Gitlab::Elastic::ProjectSearchResults.

If the field is not used by other scopes:

Remove the field from the Search::Filter concern. The concern is used in the Search::GlobalService, Search::GroupService, and Search::ProjectService.
Remove filter tracking in searches in the SearchController

Milestone M+1:

Create a migration to remove the field from all documents in the index. Use the MigrationRemoveFieldsHelper
Create a migration to reindex all documents with the field removed using Zero downtime reindexing. Use the Search::Elastic::MigrationReindexTaskHelper

Updating authorization

In the QueryBuilder framework, authorization is handled at the project level with the by_search_level_and_membership filter and at the group level with the by_search_level_and_group_membership filter.

In the legacy Proxy framework, the authorization is handled inside the class.

Both frameworks use Search::GroupsFinder and Search::ProjectsFinder to query the groups and projects a user has direct access to search. Search relies upon group and project visibility level and feature access level settings for each scope. See roles and permissions documentation for more information.

Query builder framework

The query builder framework is used to build Elasticsearch queries. We also support a legacy query framework implemented in the Elastic::Latest::ApplicationClassProxy class and classes that inherit it.

[!note] New document types must use the query builder framework.

Creating a query

A query is built using:

a query from Search::Elastic::Queries
one or more filters from ::Search::Elastic::Filters
(optional) aggregations from ::Search::Elastic::Aggregations
one or more formats from ::Search::Elastic::Formats

New scopes must create a new query builder class that inherits from Search::Elastic::QueryBuilder.

The query builder framework provides a collection of pre-built filters to handle common search scenarios. These filters simplify the process of constructing complex query conditions without having to write raw Elasticsearch query DSL.

Creating a filter

Filters are essential components in building effective Elasticsearch queries. They help narrow down search results without affecting the relevance scoring.

All filters must be documented.
Filters are created as class level methods in Search::Elastic::Filters
The method should start with by_.
The method must take query_hash and options parameters only.

query_hash is expected to contain a hash with this format.

json

 { "query":
   { "bool":
     {
       "must": [],
       "must_not": [],
       "should": [],
       "filters": [],
       "minimum_should_match": null
     }
   }
 }

Use add_filter to add the filter to the query hash. Filters should add to the filters to avoid calculating score. The score calculation is done by the query itself.

Use context.name(:filters) around the filter to add a name to the filter. This helps identify which part of a query and filter have allowed a result to be returned by the search

ruby

  def by_new_filter_type(query_hash:, options:)
      filter_selected_value = options[:field_value]

      context.name(:filters) do
        add_filter(query_hash, :query, :bool, :filter) do
          { term: { field_name: { _name: context.name(:field_name), value: filter_selected_value } } }
        end
      end
  end

Understanding Queries vs Filters

Queries in Elasticsearch serve two key purposes: filtering documents and calculating relevance scores. When building search functionality:

Queries are essential when relevance scoring is required to rank results by how well they match search criteria. They use the Boolean query's must, should, and must_not clauses, all of which influence the document's final relevance score.
Filters (within query context) determine whether documents appear in search results without affecting their score. For search operations where results only need to be included/excluded without ranking by relevance, using filters alone is more efficient and performs better at scale.

Choose the appropriate approach based on your search requirements - use queries with scoring clauses for ranked results, and rely on filters for simple inclusion/exclusion logic.

Filter Requirements and Usage

To use any filter:

The index mapping must include all required fields specified in each filter's documentation
Pass the appropriate parameters via the options hash when calling the filter
Each filter will generate the appropriate JSON structure and add it to your query_hash

Filters can be composed together to create sophisticated search queries while maintaining readable and maintainable code.

Sending queries to Elasticsearch

The queries are sent to ::Gitlab::Search::Client from Gitlab::Elastic::SearchResults. Results are parsed through a Search::Elastic::ResponseMapper to translate the response from Elasticsearch.

Model requirements

The model must respond to the to_ability_name method so that the redaction logic can check if it has Ability.allowed?(current_user, :"read_#{object.to_ability_name}", object)?. The method must be added if it does not exist.

The model must define a preload_search_data scope to avoid N+1s.

Available Queries

All query builders must return a standardized query_hash structure that conforms to Elasticsearch's Boolean query syntax. The Search::Elastic::BoolExpr class provides an interface for constructing Boolean queries.

The required query hash structure is:

json

{
  "query": {
    "bool": {
      "must": [],
      "must_not": [],
      "should": [],
      "filters": [],
      "minimum_should_match": null
    }
  }
}

`by_iid`

Query by iid field and document type. Requires type and iid fields.

json

{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "iid": {
              "_name": "milestone:related:iid",
              "value": 1
            }
          }
        },
        {
          "term": {
            "type": {
              "_name": "doc:is_a:milestone",
              "value": "milestone"
            }
          }
        }
      ]
    }
  }
}

`by_full_text`

Performs a full text search. This query will use by_multi_match_query or by_simple_query_string if Advanced search syntax is used in the query string.

`by_multi_match_query`

Uses multi_match Elasticsearch API. Can be customized with the following options:

count_only - uses the Boolean query clause filter. Scoring and highlighting are not performed.
query - if no query is passed, uses match_all Elasticsearch API
keyword_match_clause - if :should is passed, uses the Boolean query clause should. Default: must clause

json

{
  "query": {
    "bool": {
      "must": [
        {
          "bool": {
            "must": [],
            "must_not": [],
            "should": [
              {
                "multi_match": {
                  "_name": "project:multi_match:and:search_terms",
                  "fields": [
                    "name^10",
                    "name_with_namespace^2",
                    "path_with_namespace",
                    "path^9",
                    "description"
                  ],
                  "query": "search",
                  "operator": "and",
                  "lenient": true
                }
              },
              {
                "multi_match": {
                  "_name": "project:multi_match_phrase:search_terms",
                  "type": "phrase",
                  "fields": [
                    "name^10",
                    "name_with_namespace^2",
                    "path_with_namespace",
                    "path^9",
                    "description"
                  ],
                  "query": "search",
                  "lenient": true
                }
              }
            ],
            "filter": [],
            "minimum_should_match": 1
          }
        }
      ],
      "must_not": [],
      "should": [],
      "filter": [],
      "minimum_should_match": null
    }
  }
}

`by_simple_query_string`

Uses simple_query_string Elasticsearch API. Can be customized with the following options:

count_only - uses the Boolean query clause filter. Scoring and highlighting are not performed.
query - if no query is passed, uses match_all Elasticsearch API
keyword_match_clause - if :should is passed, uses the Boolean query clause should. Default: must clause

json

{
  "query": {
    "bool": {
      "must": [
        {
          "simple_query_string": {
            "_name": "project:match:search_terms",
            "fields": [
              "name^10",
              "name_with_namespace^2",
              "path_with_namespace",
              "path^9",
              "description"
            ],
            "query": "search",
            "lenient": true,
            "default_operator": "and"
          }
        }
      ],
      "must_not": [],
      "should": [],
      "filter": [],
      "minimum_should_match": null
    }
  }
}

Available Filters

The following sections detail each available filter, its required fields, supported options, and example output.

`by_type`

Requires type field. Query with doc_type in options.

json

{
  "term": {
    "type": {
      "_name": "filters:doc:is_a:milestone",
      "value": "milestone"
    }
  }
}

`by_group_level_confidentiality`

Requires current_user and group_ids fields. Query based on the permissions to user to read confidential group entities.

json

{
  "bool": {
    "must": [
      {
        "term": {
          "confidential": {
            "value": true,
            "_name": "confidential:true"
          }
        }
      },
      {
        "terms": {
          "namespace_id": [
            1
          ],
          "_name": "groups:can:read_confidential_work_items"
        }
      }
    ]
  },
  "should": {
    "term": {
      "confidential": {
        "value": false,
        "_name": "confidential:false"
      }
    }
  }
}

`by_project_confidentiality`

Requires confidential, author_id, assignee_id, project_id fields. Query with confidential in options.

json

{
  "bool": {
    "should": [
      {
        "term": {
          "confidential": {
            "_name": "filters:confidentiality:projects:non_confidential",
            "value": false
          }
        }
      },
      {
        "bool": {
          "must": [
            {
              "term": {
                "confidential": {
                  "_name": "filters:confidentiality:projects:confidential",
                  "value": true
                }
              }
            },
            {
              "bool": {
                "should": [
                  {
                    "term": {
                      "author_id": {
                        "_name": "filters:confidentiality:projects:confidential:as_author",
                        "value": 1
                      }
                    }
                  },
                  {
                    "term": {
                      "assignee_id": {
                        "_name": "filters:confidentiality:projects:confidential:as_assignee",
                        "value": 1
                      }
                    }
                  },
                  {
                    "terms": {
                      "_name": "filters:confidentiality:projects:confidential:project:membership:id",
                      "project_id": [
                        12345
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}

`by_combined_confidentiality`

Requires search_level field and at least one of use_group_authorization or use_project_authorization. Query with confidential in options. This filter combines by_project_confidentiality and by_group_level_confidentiality into one query if both use_group_authorization and use_project_authorization are provided. See those methods for required fields.

json

[
  {
    "bool": {
      "should": [
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "term": {
                        "confidential": {
                          "_name": "filters:confidentiality:projects:non_confidential",
                          "value": false
                        }
                      }
                    },
                    {
                      "bool": {
                        "must": [
                          {
                            "term": {
                              "confidential": {
                                "_name": "filters:confidentiality:projects:confidential",
                                "value": true
                              }
                            }
                          },
                          {
                            "bool": {
                              "should": [
                                {
                                  "term": {
                                    "author_id": {
                                      "_name": "filters:confidentiality:projects:confidential:as_author",
                                      "value": 278964
                                    }
                                  }
                                },
                                {
                                  "term": {
                                    "assignee_id": {
                                      "_name": "filters:confidentiality:projects:confidential:as_assignee",
                                      "value": 278964
                                    }
                                  }
                                },
                                {
                                  "terms": {
                                    "_name": "filters:confidentiality:projects:confidential:project:membership:id",
                                    "project_id": []
                                  }
                                }
                              ]
                            }
                          }
                        ]
                      }
                    }
                  ]
                }
              }
            ]
          }
        },
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "bool": {
                        "_name": "filters:confidentiality:groups:non_confidential:public",
                        "must": [
                          {
                            "term": {
                              "confidential": {
                                "value": false
                              }
                            }
                          },
                          {
                            "term": {
                              "namespace_visibility_level": {
                                "value": 20
                              }
                            }
                          }
                        ]
                      }
                    },
                    {
                      "bool": {
                        "_name": "filters:confidentiality:groups:non_confidential:internal",
                        "must": [
                          {
                            "term": {
                              "confidential": {
                                "value": false
                              }
                            }
                          },
                          {
                            "term": {
                              "namespace_visibility_level": {
                                "value": 10
                              }
                            }
                          }
                        ]
                      }
                    },
                    {
                      "bool": {
                        "_name": "filters:confidentiality:groups:non_confidential:private",
                        "must": [
                          {
                            "term": {
                              "confidential": {
                                "value": false
                              }
                            }
                          }
                        ],
                        "should": [
                          {
                            "prefix": {
                              "traversal_ids": {
                                "_name": "filters:confidentiality:groups:non_confidential:private:ancestry_filter:descendants",
                                "value": "9970-"
                              }
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    },
                    {
                      "bool": {
                        "_name": "filters:confidentiality:groups:non_confidential:private",
                        "must": [
                          {
                            "term": {
                              "confidential": {
                                "value": false
                              }
                            }
                          },
                          {
                            "terms": {
                              "_name": "filters:confidentiality:groups:non_confidential:private:project:membership",
                              "namespace_id": [
                                9971
                              ]
                            }
                          }
                        ]
                      }
                    },
                    {
                      "bool": {
                        "_name": "filters:confidentiality:groups:confidential:private",
                        "must": [
                          {
                            "term": {
                              "confidential": {
                                "value": true
                              }
                            }
                          }
                        ],
                        "should": [
                          {
                            "prefix": {
                              "traversal_ids": {
                                "_name": "filters:confidentiality:groups:confidential:private:ancestry_filter:descendants",
                                "value": "9970-"
                              }
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
]

`by_note_confidentiality`

Applies confidentiality filters for notes. Notes have two levels of confidentiality:

Note's own confidentiality (confidential field)
Issue's confidentiality (issue.confidential, issue.author_id, issue.assignee_id)

Requires confidential, issue.confidential, issue.author_id, issue.assignee_id, project_id, and traversal_ids fields.

A note is visible if ANY of these conditions are met:

It's on a non-confidential issue AND the note isn't confidential
It's on a confidential issue but user is author/assignee/has project access via project_id or traversal_ids
The note is confidential but user has project access via project_id or traversal_ids

This filter uses both project_id terms and traversal_ids-based authorization for efficient group-level searches.

json

{
  "bool": {
    "minimum_should_match": 1,
    "should": [
      {
        "bool": {
          "filter": [
            {
              "bool": {
                "_name": "filters:confidentiality:notes:not_on_issue_or_not_confidential",
                "should": [
                  {
                    "bool": {
                      "_name": "filters:confidentiality:notes:not_on_issue",
                      "must_not": [{ "exists": { "field": "issue" } }]
                    }
                  },
                  {
                    "term": {
                      "issue.confidential": {
                        "_name": "filters:confidentiality:notes:non_confidential_issue",
                        "value": false
                      }
                    }
                  }
                ]
              }
            },
            {
              "bool": {
                "_name": "filters:confidentiality:notes:not_confidential",
                "should": [
                  { "bool": { "must_not": [{ "exists": { "field": "confidential" } }] } },
                  { "term": { "confidential": false } }
                ]
              }
            }
          ]
        }
      },
      {
        "bool": {
          "filter": [
            {
              "term": {
                "issue.confidential": {
                  "_name": "filters:confidentiality:notes:issue:confidential",
                  "value": true
                }
              }
            },
            {
              "bool": {
                "_name": "filters:confidentiality:notes:not_confidential",
                "should": [
                  { "bool": { "must_not": [{ "exists": { "field": "confidential" } }] } },
                  { "term": { "confidential": false } }
                ]
              }
            },
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  {
                    "term": {
                      "issue.author_id": {
                        "_name": "filters:confidentiality:notes:confidential:as_author",
                        "value": 1
                      }
                    }
                  },
                  {
                    "term": {
                      "issue.assignee_id": {
                        "_name": "filters:confidentiality:notes:confidential:as_assignee",
                        "value": 1
                      }
                    }
                  },
                  {
                    "terms": {
                      "_name": "filters:confidentiality:notes:private:project:member",
                      "project_id": [1]
                    }
                  },
                  {
                    "bool": {
                      "minimum_should_match": 1,
                      "should": [
                        {
                          "prefix": {
                            "traversal_ids": {
                              "_name": "filters:confidentiality:notes:private:ancestry_filter:descendants",
                              "value": "123-"
                            }
                          }
                        }
                      ]
                    }
                  }
                ]
              }
            }
          ]
        }
      },
      {
        "bool": {
          "filter": [
            {
              "term": {
                "confidential": {
                  "_name": "filters:confidentiality:notes:confidential",
                  "value": true
                }
              }
            }
          ],
          "minimum_should_match": 1,
          "should": [
            {
              "terms": {
                "_name": "filters:confidentiality:notes:private:project:member",
                "project_id": [1]
              }
            },
            {
              "bool": {
                "minimum_should_match": 1,
                "should": [
                  {
                    "prefix": {
                      "traversal_ids": {
                        "_name": "filters:confidentiality:notes:private:ancestry_filter:descendants",
                        "value": "123-"
                      }
                    }
                  }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}

`by_label_ids`

Requires label_ids field. Query with label_names in options.

json

{
  "bool": {
    "must": [
      {
        "terms": {
          "_name": "filters:label_ids",
          "label_ids": [
            1
          ]
        }
      }
    ]
  }
}

`by_archived`

Requires archived field. Query with search_level and include_archived in options.

json

{
  "bool": {
    "_name": "filters:non_archived",
    "should": [
      {
        "bool": {
          "filter": {
            "term": {
              "archived": {
                "value": false
              }
            }
          }
        }
      },
      {
        "bool": {
          "must_not": {
            "exists": {
              "field": "archived"
            }
          }
        }
      }
    ]
  }
}

`by_state`

Requires state field. Supports values: all, opened, closed, and merged. Query with state in options.

json

{
  "match": {
    "state": {
      "_name": "filters:state",
      "query": "opened"
    }
  }
}

`by_not_hidden`

Requires hidden field. Not applied for admins.

json

{
  "term": {
    "hidden": {
      "_name": "filters:not_hidden",
      "value": false
    }
  }
}

`by_work_item_type_ids`

Requires work_item_type_id field. Query with work_item_type_ids or not_work_item_type_ids in options.

json

{
  "bool": {
    "must_not": {
      "terms": {
        "_name": "filters:not_work_item_type_ids",
        "work_item_type_id": [
          8
        ]
      }
    }
  }
}

`by_author`

Requires author_id field. Query with author_username or not_author_username in options.

json

{
  "bool": {
    "should": [
      {
        "term": {
          "author_id": {
            "_name": "filters:author",
            "value": 1
          }
        }
      }
    ],
    "minimum_should_match": 1
  }
}

`by_target_branch`

Requires target_branch field. Query with target_branch or not_target_branch in options.

json

{
  "bool": {
    "should": [
      {
        "term": {
          "target_branch": {
            "_name": "filters:target_branch",
            "value": "master"
          }
        }
      }
    ],
    "minimum_should_match": 1
  }
}

`by_source_branch`

Requires source_branch field. Query with source_branch or not_source_branch in options.

json

{
  "bool": {
    "should": [
      {
        "term": {
          "source_branch": {
            "_name": "filters:source_branch",
            "value": "master"
          }
        }
      }
    ],
    "minimum_should_match": 1
  }
}

`by_search_level_and_group_membership`

Requires current_user, group_ids, traversal_id, search_level fields. Query with search_level and filter on namespace_visibility_level based on permissions user has for each group.

This filter can be used in place of by_search_level_and_membership if the data being searched does not contain the project_id field.

[!note] Examples are shown for an authenticated user. The JSON may be different for users with authorizations, admins, external, or anonymous users

global

json

{
  "bool": {
    "should": [
      {
        "bool": {
          "filter": [
            {
              "term": {
                "namespace_visibility_level": {
                  "value": 20,
                  "_name": "filters:namespace_visibility_level:public"
                }
              }
            }
          ]
        }
      },
      {
        "bool": {
          "filter": [
            {
              "term": {
                "namespace_visibility_level": {
                  "value": 10,
                  "_name": "filters:namespace_visibility_level:internal"
                }
              }
            }
          ]
        }
      },
      {
        "bool": {
          "filter": [
            {
              "term": {
                "namespace_visibility_level": {
                  "value": 0,
                  "_name": "filters:namespace_visibility_level:private"
                }
              }
            },
            {
              "terms": {
                "namespace_id": [
                  33,
                  22
                ]
              }
            }
          ]
        }
      }
    ],
    "minimum_should_match": 1
  }
}

group

json

[
  {
    "bool": {
      "_name": "filters:level:group",
      "minimum_should_match": 1,
      "should": [
        {
          "prefix": {
            "traversal_ids": {
              "_name": "filters:level:group:ancestry_filter:descendants",
              "value": "22-"
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "should": [
        {
          "bool": {
            "filter": [
              {
                "term": {
                  "namespace_visibility_level": {
                    "value": 20,
                    "_name": "filters:namespace_visibility_level:public"
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "filter": [
              {
                "term": {
                  "namespace_visibility_level": {
                    "value": 10,
                    "_name": "filters:namespace_visibility_level:internal"
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "filter": [
              {
                "term": {
                  "namespace_visibility_level": {
                    "value": 0,
                    "_name": "filters:namespace_visibility_level:private"
                  }
                }
              },
              {
                "terms": {
                  "namespace_id": [
                    22
                  ]
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  },
  {
    "bool": {
      "_name": "filters:level:group",
      "minimum_should_match": 1,
      "should": [
        {
          "prefix": {
            "traversal_ids": {
              "_name": "filters:level:group:ancestry_filter:descendants",
              "value": "22-"
            }
          }
        }
      ]
    }
  }
]

`by_search_level_and_membership`

Requires project_id, traversal_id and project visibility (defaulting to visibility_level but can set with the project_visibility_level_field option) fields. Supports feature *_access_level fields. Query with search_level and optionally project_ids, group_ids, features, and current_user in options.

Filtering is applied for:

search level for global, group, or project
membership for direct membership to groups and projects or shared membership through direct access to a group
any feature access levels passed through features

[!note] Examples are shown for a logged in user. The JSON may be different for users with authorizations, admins, external, or anonymous users

global

json

{
  "bool": {
    "_name": "filters:permissions:global",
    "should": [
      {
        "bool": {
          "must": [
            {
              "terms": {
                "_name": "filters:permissions:global:visibility_level:public_and_internal",
                "visibility_level": [
                  20,
                  10
                ]
              }
            }
          ],
          "should": [
            {
              "terms": {
                "_name": "filters:permissions:global:repository_access_level:enabled",
                "repository_access_level": [
                  20
                ]
              }
            }
          ],
          "minimum_should_match": 1
        }
      },
      {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "terms": {
                      "_name": "filters:permissions:global:repository_access_level:enabled_or_private",
                      "repository_access_level": [
                        20,
                        10
                      ]
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ],
          "should": [
            {
              "prefix": {
                "traversal_ids": {
                  "_name": "filters:permissions:global:ancestry_filter:descendants",
                  "value": "123-"
                }
              }
            },
            {
              "terms": {
                "_name": "filters:permissions:global:project:member",
                "project_id": [
                  456
                ]
              }
            }
          ],
          "minimum_should_match": 1
        }
      }
    ],
    "minimum_should_match": 1
  }
}

group

json

[
  {
    "bool": {
      "_name": "filters:level:group",
      "minimum_should_match": 1,
      "should": [
        {
          "prefix": {
            "traversal_ids": {
              "_name": "filters:level:group:ancestry_filter:descendants",
              "value": "123-"
            }
          }
        }
      ]
    }
  },
  {
    "bool": {
      "_name": "filters:permissions:group",
      "should": [
        {
          "bool": {
            "must": [
              {
                "terms": {
                  "_name": "filters:permissions:group:visibility_level:public_and_internal",
                  "visibility_level": [
                    20,
                    10
                  ]
                }
              }
            ],
            "should": [
              {
                "terms": {
                  "_name": "filters:permissions:group:repository_access_level:enabled",
                  "repository_access_level": [
                    20
                  ]
                }
              }
            ],
            "minimum_should_match": 1
          }
        },
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "terms": {
                        "_name": "filters:permissions:group:repository_access_level:enabled_or_private",
                        "repository_access_level": [
                          20,
                          10
                        ]
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ],
            "should": [
              {
                "prefix": {
                  "traversal_ids": {
                    "_name": "filters:permissions:group:ancestry_filter:descendants",
                    "value": "123-"
                  }
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
]

project

json

[
  {
    "bool": {
      "_name": "filters:level:project",
      "must": {
        "terms": {
          "project_id": [
            456
          ]
        }
      }
    }
  },
  {
    "bool": {
      "_name": "filters:permissions:project",
      "should": [
        {
          "bool": {
            "must": [
              {
                "terms": {
                  "_name": "filters:permissions:project:visibility_level:public_and_internal",
                  "visibility_level": [
                    20,
                    10
                  ]
                }
              }
            ],
            "should": [
              {
                "terms": {
                  "_name": "filters:permissions:project:repository_access_level:enabled",
                  "repository_access_level": [
                    20
                  ]
                }
              }
            ],
            "minimum_should_match": 1
          }
        },
        {
          "bool": {
            "must": [
              {
                "bool": {
                  "should": [
                    {
                      "terms": {
                        "_name": "filters:permissions:project:repository_access_level:enabled_or_private",
                        "repository_access_level": [
                          20,
                          10
                        ]
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ],
            "should": [
              {
                "prefix": {
                  "traversal_ids": {
                    "_name": "filters:permissions:project:ancestry_filter:descendants",
                    "value": "123-"
                  }
                }
              }
            ],
            "minimum_should_match": 1
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
]

`by_combined_search_level_and_membership`

Requires search_level field and at least one of use_group_authorization or use_project_authorization. This filter combines by_search_level_and_membership and by_search_level_and_group_membership into one query if both use_group_authorization and use_project_authorization are provided. See those methods for required fields.

json

[
  {
    "bool": {
      "should": [
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "should": [
                    {
                      "bool": {
                        "should": [
                          {
                            "prefix": {
                              "traversal_ids": {
                                "_name": "filters:permissions:global:private_access:ancestry_filter:descendants",
                                "value": "9970-"
                              }
                            }
                          }
                        ],
                        "filter": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:private_access:issues_access_level:enabled_or_private",
                              "issues_access_level": [
                                20,
                                10
                              ]
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    },
                    {
                      "bool": {
                        "filter": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:private_access:issues_access_level:enabled_or_private",
                              "issues_access_level": [
                                20,
                                10
                              ]
                            }
                          },
                          {
                            "terms": {
                              "_name": "filters:permissions:global:private_access:project:member",
                              "project_id": [
                                278964
                              ]
                            }
                          }
                        ]
                      }
                    },
                    {
                      "bool": {
                        "should": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:issues_access_level:enabled",
                              "issues_access_level": [
                                20
                              ]
                            }
                          }
                        ],
                        "filter": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:project_visibility_level:public_and_internal",
                              "project_visibility_level": [
                                20,
                                10
                              ]
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ]
          }
        },
        {
          "bool": {
            "filter": [
              {
                "bool": {
                  "_name": "filters:permissions:global",
                  "should": [
                    {
                      "bool": {
                        "filter": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:namespace_visibility_level:public_and_internal",
                              "namespace_visibility_level": [
                                20,
                                10
                              ]
                            }
                          }
                        ]
                      }
                    },
                    {
                      "bool": {
                        "must": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:namespace_visibility_level:private",
                              "namespace_visibility_level": [
                                0
                              ]
                            }
                          }
                        ],
                        "should": [
                          {
                            "prefix": {
                              "traversal_ids": {
                                "_name": "filters:permissions:global:ancestry_filter:descendants",
                                "value": "9970-"
                              }
                            }
                          }
                        ],
                        "minimum_should_match": 1
                      }
                    },
                    {
                      "bool": {
                        "must": [
                          {
                            "terms": {
                              "_name": "filters:permissions:global:namespace_visibility_level:private",
                              "namespace_visibility_level": [
                                0
                              ]
                            }
                          },
                          {
                            "terms": {
                              "_name": "filters:permissions:global:project:membership",
                              "namespace_id": [
                                9971
                              ]
                            }
                          }
                        ]
                      }
                    }
                  ],
                  "minimum_should_match": 1
                }
              }
            ]
          }
        }
      ],
      "minimum_should_match": 1
    }
  }
]

`by_user_accessible_namespaces`

Filters documents based on user access to namespaces (groups and projects). This filter is specific to user search queries and handles authorization at the namespace level for global, group, and project search levels.

Required fields:

namespace_ancestry_ids - keyword field populated from Namespace#elastic_namespace_ancestry / Project#elastic_namespace_ancestry (including the p<id> project segment), used for prefix and terms queries in namespace hierarchy filtering
current_user - the user performing the search (for global scope)
search_level - one of :global, :group, or :project

Optional fields:

group_id - group ID for group-level search
project_id - project ID for project-level search
autocomplete - boolean flag for autocomplete searches

Behavior by search level:

Global: Returns all users
- With autocomplete: Returns users accessible through authorized groups and projects. Uses traversal_id_prefixes to match group hierarchies and direct project membership.
Group: Returns users in the specified group and its descendants, plus users in ancestor groups.
Project: Returns users in the specified project and its ancestor groups.

Example for global search:

json

{
  "bool": {
    "should": [
      {
        "prefix": {
          "namespace_ancestry_ids": {
            "_name": "namespace:ancestry_filter:descendants",
            "value": "285-"
          }
        }
      },
      {
        "prefix": {
          "namespace_ancestry_ids": {
            "_name": "namespace:ancestry_filter:descendants",
            "value": "417-418-419-"
          }
        }
      },
      {
        "terms": {
          "namespace_ancestry_ids": [
            "417-418-419-p91-"
          ],
          "_name": "namespace:ancestry_filter:ancestors"
        }
      }
    ],
    "minimum_should_match": 1
  }
}

Example for group search in a subgroup:

json

{
  "bool": {
    "should": [
      {
        "prefix": {
          "namespace_ancestry_ids": {
            "_name": "namespace:ancestry_filter:descendants",
            "value": "807-806-805"
          }
        }
      },
      {
        "terms": {
          "namespace_ancestry_ids": [
            "807-","807-806-"
          ],
          "_name": "namespace:ancestry_filter:ancestors"
        }
      }
    ],
    "minimum_should_match": 1
  }
}

Example for project search:

json

{
  "bool": {
    "should": [
      {
        "terms": {
          "namespace_ancestry_ids": [
            "807-","807-806-", "807-806-805", "807-806-805-p123"
          ],
          "_name": "namespace:ancestry_filter:ancestors"
        }
      }
    ],
    "minimum_should_match": 1
  }
}

`by_noteable_type`

Requires noteable_type field. Query with noteable_type in options. Sets _source to only return noteable_id field.

json

{
  "term": {
    "noteable_type": {
      "_name": "filters:related:issue",
      "value": "Issue"
    }
  }
}

`by_iids`

Filters documents by multiple IID values.

Required fields:

iids - array of IID values to match

json

{
  "bool": {
    "_name": "filters:iids",
    "filter": {
      "terms": {
        "iid": [1, 2, 3]
      }
    }
  }
}

`by_closed_at`

Filters by closed date range. At least one optional field must be provided.

Optional fields:

closed_after - ISO date string for minimum closed date
closed_before - ISO date string for maximum closed date

json

{
  "bool": {
    "_name": "filters:closed_after",
    "must": {
      "range": {
        "closed_at": {
          "gte": "2025-01-01T00:00:00Z"
        }
      }
    }
  }
}

`by_created_at`

Filters by creation date range. At least one optional field must be provided.

Optional fields:

created_after - ISO date string for minimum creation date
created_before - ISO date string for maximum creation date

json

{
  "bool": {
    "_name": "filters:created_after",
    "must": {
      "range": {
        "created_at": {
          "gte": "2025-01-01T00:00:00Z"
        }
      }
    }
  }
}

`by_updated_at`

Filters by update date range. At least one optional field must be provided.

Optional fields:

updated_after - ISO date string for minimum update date
updated_before - ISO date string for maximum update date

json

{
  "bool": {
    "_name": "filters:updated_after",
    "must": {
      "range": {
        "updated_at": {
          "gte": "2025-01-01T00:00:00Z"
        }
      }
    }
  }
}

`by_due_date`

Filters by due date range. At least one optional field must be provided.

Optional fields:

due_after - ISO date string for minimum due date
due_before - ISO date string for maximum due date

json

{
  "bool": {
    "_name": "filters:due_after",
    "must": {
      "range": {
        "due_date": {
          "gte": "2025-01-01T00:00:00Z"
        }
      }
    }
  }
}

`by_milestone`

Filters by milestone title or milestone presence. At least one optional field must be provided. The milestone title filters (milestone_title, not_milestone_title) and milestone presence filters (any_milestones, none_milestones) are mutually exclusive.

Optional fields:

milestone_title - array of milestone titles to include
not_milestone_title - array of milestone titles to exclude
any_milestones - boolean, filters for documents with any milestone
none_milestones - boolean, filters for documents with no milestone

Example with milestone_title:

json

{
  "bool": {
    "must": {
      "terms": {
        "_name": "filters:milestone_title",
        "milestone_title": ["18.1", "18.2"]
      }
    }
  }
}

Example with none_milestones:

json

{
  "bool": {
    "_name": "filters:none_milestones",
    "must_not": {
      "exists": {
        "field": "milestone_title"
      }
    }
  }
}

`by_milestone_state`

Filters by milestone state with temporal conditions.

Required fields:

milestone_state_filters - array containing one or more of: :upcoming, :started, :not_upcoming, :not_started

Example for :upcoming filter:

json

{
  "bool": {
    "_name": "filters:milestone_state_upcoming",
    "must": [
      {
        "term": {
          "milestone_state": "active"
        }
      },
      {
        "range": {
          "milestone_start_date": {
            "gt": "now/d"
          }
        }
      }
    ]
  }
}

Example for :started filter:

json

{
  "bool": {
    "_name": "filters:milestone_state_started",
    "must": [
      {
        "term": {
          "milestone_state": "active"
        }
      },
      {
        "bool": {
          "should": [
            {
              "range": {
                "milestone_start_date": {
                  "lte": "now/d"
                }
              }
            },
            {
              "bool": {
                "must_not": {
                  "exists": {
                    "field": "milestone_start_date"
                  }
                }
              }
            }
          ]
        }
      },
      {
        "bool": {
          "should": [
            {
              "range": {
                "milestone_due_date": {
                  "gte": "now/d"
                }
              }
            },
            {
              "bool": {
                "must_not": {
                  "exists": {
                    "field": "milestone_due_date"
                  }
                }
              }
            }
          ]
        }
      }
    ],
    "must_not": {
      "bool": {
        "must": [
          {
            "bool": {
              "must_not": {
                "exists": {
                  "field": "milestone_start_date"
                }
              }
            }
          },
          {
            "bool": {
              "must_not": {
                "exists": {
                  "field": "milestone_due_date"
                }
              }
            }
          }
        ]
      }
    }
  }
}

`by_assignees`

Filters by assignee IDs with support for various matching modes. At least one optional field must be provided.

Optional fields:

assignee_ids - array of assignee IDs that must ALL be present
not_assignee_ids - array of assignee IDs to exclude
or_assignee_ids - array of assignee IDs where ANY can match
none_assignees - boolean, filters for documents with no assignees
any_assignees - boolean, filters for documents with any assignee

Example with assignee_ids (ALL must match):

json

{
  "bool": {
    "_name": "filters:assignee_ids",
    "must": [
      {
        "term": {
          "assignee_id": 123
        }
      },
      {
        "term": {
          "assignee_id": 456
        }
      }
    ]
  }
}

Example with or_assignee_ids (ANY can match):

json

{
  "bool": {
    "must": {
      "terms": {
        "_name": "filters:or_assignee_ids",
        "assignee_id": [123, 456, 789]
      }
    }
  }
}

Example with none_assignees:

json

{
  "bool": {
    "_name": "filters:none_assignees",
    "must_not": {
      "exists": {
        "field": "assignee_id"
      }
    }
  }
}

`by_weight`

Filters by issue weight (integer value). At least one optional field must be provided.

Optional fields:

weight - exact weight value to match (integer)
not_weight - weight value to exclude (integer)
none_weight - boolean, filters for documents with no weight
any_weight - boolean, filters for documents with any weight

json

{
  "term": {
    "weight": {
      "_name": "filters:weight",
      "value": 3
    }
  }
}

`by_health_status`

Filters by health status field. At least one optional field must be provided.

Optional fields:

health_status - array of health status IDs to include
not_health_status - array of health status IDs to exclude
none_health_status - boolean, filters for documents with no health status
any_health_status - boolean, filters for documents with any health status

json

{
  "bool": {
    "must": {
      "terms": {
        "_name": "filters:health_status",
        "health_status": [1, 2]
      }
    }
  }
}

`by_label_names`

Filters by label names with support for various matching modes and scoped label wildcards. At least one optional field must be provided.

Optional fields:

label_names - array of label names that must ALL be present
not_label_names - array of label names to exclude
or_label_names - array of label names where ANY can match
none_label_names - boolean, filters for documents with no labels
any_label_names - boolean, filters for documents with any label

Supports scoped label wildcards like "workflow::*" to match all labels starting with "workflow::". The wildcard is converted to a prefix query in Elasticsearch.

Example with exact matches:

json

{
  "bool": {
    "_name": "filters:label_names",
    "must": [
      {
        "term": {
          "label_names": "advanced search"
        }
      },
      {
        "term": {
          "label_names": "GLQL"
        }
      }
    ]
  }
}

Example with scoped label wildcard:

json

{
  "bool": {
    "_name": "filters:or_label_names",
    "should": [
      {
        "prefix": {
          "label_names": "workflow::"
        }
      },
      {
        "term": {
          "label_names": "backend"
        }
      }
    ],
    "minimum_should_match": 1
  }
}

Testing scopes

Test any scope in the Rails console

ruby

search_service = ::SearchService.new(User.first, { search: 'foo', scope: 'SCOPE_NAME' })
search_service.search_objects

Permissions tests

Search code has a final security check in SearchService#redact_unauthorized_results. This prevents unauthorized results from being returned to users who don't have permission to view them. The check is done in Ruby to handle inconsistencies in Elasticsearch permissions data due to bugs or indexing delays.

New scopes must add visibility specs to ensure proper access control. To test that permissions are properly enforced, add tests using the 'search respects visibility' shared example in the EE specs:

ee/spec/services/ee/search/global_service_spec.rb
ee/spec/services/ee/search/group_service_spec.rb
ee/spec/services/ee/search/project_service_spec.rb

Zero-downtime reindexing with multiple indices

[!note] This is not applicable yet as multiple indices functionality is not fully implemented.

Currently, GitLab can only handle a single version of setting. Any setting/schema changes would require reindexing everything from scratch. Since reindexing can take a long time, this can cause search functionality downtime.

To avoid downtime, GitLab is working to support multiple indices that can function at the same time. Whenever the schema changes, the administrator will be able to create a new index and reindex to it, while searches continue to go to the older, stable index. Any data updates will be forwarded to both indices. Once the new index is ready, an administrator can mark it active, which will direct all searches to it, and remove the old index.

This is also helpful for migrating to new servers, for example, moving to/from AWS.

Currently, we are on the process of migrating to this new design. Everything is hardwired to work with one single version for now.

Deep dive resources

Elasticsearch configuration

Supported versions

Setting up your development environment

Helpful Rake tasks

Development workflow

Development tips

Debugging & troubleshooting

Debugging Elasticsearch queries

Getting flood stage disk watermark [95%] exceeded

Performance monitoring

Prometheus

Indexing queues

Logs

Performance Bar

Correlation ID and X-Opaque-Id

Architecture

Indexing Overview

Indexing Components

External Indexer

Rails Indexing Lifecycle

Search and Security

Migration framework

Framework Components

Configuration Options

Search DSL

Custom routing

Existing analyzers and tokenizers

Analyzers

path_analyzer

sha_analyzer

code_analyzer

Tokenizers

sha_tokenizer

path_tokenizer

Common gotchas

Implementation guide

Add a new document type to Elasticsearch

Recommended process for adding a new document type

Create the index

PostgreSQL to Elasticsearch mappings

Validate expected queries

Create a new Elastic Reference

Common gotchas

Data consistency

Backfilling data

Continuous updates

Transfers and deletes

Implementing search for a new document type

Recommended process for implementing search for a new document type

Search scopes

Results classes

Updating an existing scope

Adding a field

Add the field to the index

If the new field is an associated record

Expose the field to the search service

Changing mapping of an existing field

Changing field content

Cleaning up documents from an index

Removing a field

Updating authorization

Query builder framework

Creating a query

Creating a filter

Understanding Queries vs Filters

Filter Requirements and Usage

Sending queries to Elasticsearch

Model requirements

Available Queries

by_iid

by_full_text

by_multi_match_query

by_simple_query_string

Available Filters

by_type

by_group_level_confidentiality

by_project_confidentiality

by_combined_confidentiality

Getting `flood stage disk watermark [95%] exceeded`

Correlation ID and `X-Opaque-Id`

`path_analyzer`

`sha_analyzer`

`code_analyzer`

`sha_tokenizer`

`path_tokenizer`

`by_iid`

`by_full_text`

`by_multi_match_query`

`by_simple_query_string`

`by_type`

`by_group_level_confidentiality`

`by_project_confidentiality`

`by_combined_confidentiality`

`by_note_confidentiality`

`by_label_ids`

`by_archived`

`by_state`

`by_not_hidden`

`by_work_item_type_ids`

`by_author`

`by_target_branch`

`by_source_branch`

`by_search_level_and_group_membership`

`by_search_level_and_membership`

`by_combined_search_level_and_membership`

`by_user_accessible_namespaces`

`by_noteable_type`

`by_iids`

`by_closed_at`

`by_created_at`

`by_updated_at`

`by_due_date`

`by_milestone`

`by_milestone_state`

`by_assignees`

`by_weight`

`by_health_status`

`by_label_names`