Troubleshooting Elasticsearch indexing and searching

Tier: Premium, Ultimate
Offering: GitLab Self-Managed, GitLab Dedicated

When working with Elasticsearch indexing or searching, you might encounter the following issues.

Create an empty index

For indexing issues, try first to create an empty index. Check the Elasticsearch instance to see if the gitlab-production index exists. If it does, manually delete the index on the Elasticsearch instance and try to recreate it from the recreate_index Rake task.

If you still encounter issues, try to create an index manually on the Elasticsearch instance. If you:

Cannot create indices, contact your Elasticsearch administrator.
Can create indices, contact GitLab Support.

Check the status of indexed projects

You can check for errors during project indexing. Errors might occur on:

The GitLab instance: if you cannot fix them yourself, contact GitLab Support for guidance.
The Elasticsearch instance: if the error is not listed, contact your Elasticsearch administrator.

If indexing does not return errors, check the status of indexed projects with the following Rake tasks:

sudo gitlab-rake gitlab:elastic:index_projects_status for the overall status
sudo gitlab-rake gitlab:elastic:projects_not_indexed for specific projects that are not indexed

If indexing is:

Complete, contact GitLab Support.
Not complete, try to reindex that project by running sudo gitlab-rake gitlab:elastic:index_projects ID_FROM=<project ID> ID_TO=<project ID>.

If reindexing the project shows errors on:

The GitLab instance: contact GitLab Support.
The Elasticsearch instance or no errors at all: contact your Elasticsearch administrator to check the instance.

No search results after updating GitLab

We continuously make updates to our indexing strategies and aim to support newer versions of Elasticsearch. When indexing changes are made, you might have to reindex after updating GitLab.

No search results after indexing all repositories

[!note] Don't use these instructions for scenarios that only index a subset of namespaces.

Make sure you indexed all the database data.

If there aren't any results (hits) in the UI search, check if you are seeing the same results via the rails console (sudo gitlab-rails console):

ruby

u = User.find_by_username('your-username')
s = SearchService.new(u, {:search => 'search_term', :scope => 'blobs'})
pp s.search_objects.to_a

Beyond that, check via the Elasticsearch Search API to see if the data shows up on the Elasticsearch side:

shell

curl --request GET <elasticsearch_server_ip>:9200/gitlab-production/_search?q=<search_term>

More complex Elasticsearch API calls are also possible.

If the results:

Sync up, check that you are using supported syntax. Advanced search does not support exact substring matching.
Do not match up, this indicates a problem with the documents generated from the project. It is best to reindex that project.

See Elasticsearch Index Scopes for more information on searching for specific types of data.

No search results after enabling advanced search with low concurrency

After you enable advanced search, you might find that documents are not being indexed and code is not searchable. You might see a message in Sidekiq logs similar to the following:

json

"job_status":"concurrency_limit","message":"Search::Elastic::CommitIndexerWorker JID-352e0b9ee88af9f455c69b81: concurrency_limit: paused"

To resolve this issue:

Use the Rake task gitlab-rake gitlab:elastic:info to check the status of Indexing queues.
If Concurrency limit code queue is non-zero, check the Code indexing concurrency value. Values that are too low can prevent indexing from progressing. Consider increasing this value and checking progress with the Rake task.

No search results after switching Elasticsearch servers

To reindex the database, repositories, and wikis, index the instance.

Indexing fails with `error: elastic: Error 429 (Too Many Requests)`

If Search::Elastic::CommitIndexerWorker Sidekiq workers are failing with this error during indexing, it usually means that Elasticsearch is unable to keep up with the concurrency of indexing request. To address change the following settings:

To decrease the indexing throughput you can decrease Bulk request concurrency (see Advanced search settings). This is set to 10 by default, but you change it to as low as 1 to reduce the number of concurrent indexing operations.
If changing Bulk request concurrency didn't help, you can use the routing rules option to limit indexing jobs only to specific Sidekiq nodes, which should reduce the number of indexing requests.

Error: `Elasticsearch::Transport::Transport::Errors::RequestEntityTooLarge`

plaintext

[413] {"Message":"Request size exceeded 10485760 bytes"}

This exception is seen when your Elasticsearch cluster is configured to reject requests above a certain size (10 MiB in this case). This corresponds to the http.max_content_length setting in elasticsearch.yml. Increase it to a larger size and restart your Elasticsearch cluster.

AWS has network limits on the maximum size of HTTP request payloads based on the size of the underlying instance. Set the maximum bulk request size to a value lower than 10 MiB.

Indexing is very slow or fails with `rejected execution of coordinating operation`

Bulk requests getting rejected by the Elasticsearch nodes are likely due to load and lack of available memory. Ensure that your Elasticsearch cluster meets the system requirements and has enough resources to perform bulk operations. See also the error "429 (Too Many Requests)".

Indexing fails with `strict_dynamic_mapping_exception`

Indexing might fail if all advanced search migrations were not finished before doing a major upgrade. A large Sidekiq backlog might accompany this error. To fix the indexing failures, you must reindex the database, repositories, and wikis.

Pause indexing so Sidekiq can catch up:

shell

sudo gitlab-rake gitlab:elastic:pause_indexing

Recreate the index from scratch.

Resume indexing:

shell

sudo gitlab-rake gitlab:elastic:resume_indexing

Indexing keeps pausing with `elasticsearch_pause_indexing setting is enabled`

You might notice that new data is not being detected when you run a search.

This error occurs when that new data is not being indexed properly.

To resolve this error, reindex your data.

However, when reindexing, you might get an error where the indexing process keeps pausing, and the Elasticsearch logs show the following:

shell

"message":"elasticsearch_pause_indexing setting is enabled. Job was added to the waiting queue"

If reindexing does not resolve this issue, and you did not pause the indexing process manually, this error might be happening because two GitLab instances share one Elasticsearch cluster.

To resolve this error, disconnect one of the GitLab instances from using the Elasticsearch cluster.

For more information, see issue 3421.

Search fails with `too_many_clauses: maxClauseCount is set to 1024`

This error occurs when a query has more clauses than defined in the indices.query.bool.max_clause_count setting:

In Elasticsearch 7.17 and earlier, the default value is 1024.
In Elasticsearch 8.0, the default value is 4096.
In Elasticsearch 8.1 and later, the setting is deprecated and the value is dynamically determined.

To resolve this issue, increase the value or upgrade Elasticsearch 8.1 or later. Increasing the value may lead to performance degradation.

Error: `disk usage exceeded flood-stage watermark, index has read-only-allow-delete block`

This error occurs when your Elasticsearch cluster has at least one node that is critically low on disk space. A cluster that exceeds the default watermark threshold of 95% enforces a read-only block that prevents all further write operations. This block might cause new index operations to fail and result in outdated search results.

You can check if the cluster is in read-only mode with the following Rake task:

shell

sudo gitlab-rake gitlab:elastic:info

Look for output that indicates that blocks.write or blocks.read_only_allow_delete is true.

To check disk usage on your Elasticsearch cluster, run the following command:

shell

curl --request GET '<your_ES_cluster>:9200/_cat/allocation?v&pretty'

To resolve this issue, increase your disk volume on full nodes. You can estimate cluster size with the following Rake task:

shell

sudo gitlab-rake gitlab:elastic:estimate_cluster_size

Last resort to recreate an index

There may be cases where somehow data never got indexed and it's not in the queue, or the index is somehow in a state where migrations just cannot proceed. It is always best to try to troubleshoot the root cause of the problem by viewing the logs.

As a last resort, you can recreate the index from scratch. For small GitLab installations, recreating the index can be a quick way to resolve some issues. For large GitLab installations, however, this method might take a very long time. Your index does not show correct search results until the indexing is complete. You might want to clear the Search with advanced search checkbox while the indexing is running.

If you are sure you've read the previous caveats and want to proceed, then you should run the following Rake task to recreate the entire index from scratch.

shell

# WARNING: DO NOT RUN THIS UNTIL YOU READ THE DESCRIPTION ABOVE
sudo gitlab-rake gitlab:elastic:index

shell

# WARNING: DO NOT RUN THIS UNTIL YOU READ THE DESCRIPTION ABOVE
cd /home/git/gitlab
sudo -u git -H bundle exec rake gitlab:elastic:index

Dead queue

Items end up in the dead queue when they fail after being retried once. Dead queue items require manual investigation and are not automatically retried.

Check the status

To check the size and details of the dead queue:

Start the Rails console:
shell
```
sudo gitlab-rails console
```
Check the number of failed items:
ruby
```
Search::Elastic::DeadQueue.queue_size
```
Inspect the details of failed items:
ruby
```
Search::Elastic::DeadQueue.queued_items
```
This command returns a hash where each key is a shard number and each value is an array of [spec, score] pairs. The spec contains information about the failed item.

Retry items

Enqueue the items you want to retry. If these items fail again, they are moved back to the dead queue.

To retry items in the dead queue:

Start the Rails console:
shell
```
sudo gitlab-rails console
```

Move items from the dead queue to the retry queue:

ruby

specs = Search::Elastic::DeadQueue.queued_items.flat_map { |_, items| items.map { |spec, _| spec } }

Search::Elastic::DeadQueue.clear_tracking!
Search::Elastic::RetryQueue.track!(*specs)

Optional. Check indexing status.

To discard items in the dead queue without retrying them, run the following command:

ruby

Search::Elastic::DeadQueue.clear_tracking!

Contact GitLab Support

If you need help with dead queue items, share the following information with GitLab Support:

The output of Search::Elastic::DeadQueue.queue_size
Your Elasticsearch and GitLab versions
When the indexing failures started
Relevant application logs or error messages

Improve Elasticsearch performance

To improve performance, ensure:

The Elasticsearch server is not running on the same node as GitLab.
The Elasticsearch server have enough RAM and CPU cores.
That sharding is being used.

Going into some more detail here, if Elasticsearch is running on the same server as GitLab, resource contention is very likely to occur. Ideally, Elasticsearch, which requires ample resources, should be running on its own server (maybe coupled with Logstash and Kibana).

When it comes to Elasticsearch, RAM is the key resource. Elasticsearch themselves recommend:

At least 8 GB of RAM for a non-production instance.
At least 16 GB of RAM for a production instance.
Ideally, 64 GB of RAM.

For CPU, Elasticsearch recommends at least 2 CPU cores, but Elasticsearch states common setups use up to 8 cores. For more details on server specs, check out the Elasticsearch hardware guide.

Beyond the obvious, sharding comes into play. Sharding is a core part of Elasticsearch. It allows for horizontal scaling of indices, which is helpful when you are dealing with a large amount of data.

With the way GitLab does indexing, there is a huge amount of documents being indexed. By using sharding, you can speed up the ability of Elasticsearch to locate data because each shard is a Lucene index.

If you are not using sharding, you are likely to hit issues when you start using Elasticsearch in a production environment.

An index with only one shard has no scale factor and is likely to encounter issues when called upon with some frequency. See the Elasticsearch documentation on capacity planning.

The easiest way to determine if sharding is in use is to check the output of the Elasticsearch Health API:

Red means the cluster is down.
Yellow means it is up with no sharding/replication.
Green means it is healthy (up, sharding, replicating).

For production use, it should always be green.

Beyond these steps, you get into some of the more complicated things to check, such as merges and caching. These can get complicated and it takes some time to learn them, so it is best to escalate/pair with an Elasticsearch expert if you need to dig further into these.

Reach out to GitLab Support, but this is likely to be something a skilled Elasticsearch administrator has more experience with.

Slow initial indexing

The more data your GitLab instance has, the longer the indexing takes. You can estimate cluster size with the Rake task sudo gitlab-rake gitlab:elastic:estimate_cluster_size.

For code documents

Ensure you have enough Sidekiq nodes and processes to efficiently index code, commits, and wikis. If your initial indexing is slow, consider dedicated Sidekiq nodes or processes.

For non-code documents

If the initial indexing is slow but Sidekiq has enough nodes and processes, you can adjust advanced search worker settings in GitLab. For Requeue indexing workers, the default value is false. For Number of shards for non-code indexing, the default value is 2. These settings limit indexing to 2000 documents per minute.

Prerequisites:

Administrator access.

To adjust worker settings:

In the upper-right corner, select Admin.
Select Settings > Search.
Expand Advanced search.
Select the Requeue indexing workers checkbox.
In the Number of shards for non-code indexing text box, enter a value higher than 2.
Select Save changes.

Create an empty index

Check the status of indexed projects

No search results after updating GitLab

No search results after indexing all repositories

No search results after enabling advanced search with low concurrency

No search results after switching Elasticsearch servers

Indexing fails with error: elastic: Error 429 (Too Many Requests)

Error: Elasticsearch::Transport::Transport::Errors::RequestEntityTooLarge

Indexing is very slow or fails with rejected execution of coordinating operation

Indexing fails with strict_dynamic_mapping_exception

Indexing keeps pausing with elasticsearch_pause_indexing setting is enabled

Search fails with too_many_clauses: maxClauseCount is set to 1024

Error: disk usage exceeded flood-stage watermark, index has read-only-allow-delete block

Last resort to recreate an index

Dead queue

Check the status

Retry items

Contact GitLab Support

Improve Elasticsearch performance

Slow initial indexing

For code documents

For non-code documents

Indexing fails with `error: elastic: Error 429 (Too Many Requests)`

Error: `Elasticsearch::Transport::Transport::Errors::RequestEntityTooLarge`

Indexing is very slow or fails with `rejected execution of coordinating operation`

Indexing fails with `strict_dynamic_mapping_exception`

Indexing keeps pausing with `elasticsearch_pause_indexing setting is enabled`

Search fails with `too_many_clauses: maxClauseCount is set to 1024`

Error: `disk usage exceeded flood-stage watermark, index has read-only-allow-delete block`