Scalable Query Best Practices

{{< note >}} If you're using Redis Software or Redis Cloud, see the [best practices for scalable Redis Search]({{< relref "/operate/oss_and_stack/stack-with-enterprise/search/scalable-query-best-practices" >}}) page. {{< /note >}}

Checklist

The following basic steps help ensure good Redis Search performance.

Create a Redis data model with your query patterns in mind.
Ensure the Redis architecture has been sized for the expected load using the sizing calculator.
Provision Redis nodes with sufficient resources (RAM, CPU, network) to support the expected maximum load.
Review [FT.INFO]({{< relref "commands/ft.info" >}}) and [FT.PROFILE]({{< relref "commands/ft.profile" >}}) outputs for anomalies or errors.
Conduct load testing in a test environment with real-world queries and a load generated by either memtier_benchmark or a custom load application.

Indexing considerations

General

Favor [TAG]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#tag-fields" >}}) over [NUMERIC]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#numeric-fields" >}}) for use cases that only require matching.
Favor [TAG]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#tag-fields" >}}) over [TEXT]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#text-fields" >}}) for use cases that don’t require full-text capabilities (pure match).
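To make the difference concrete, the sketch below builds the query clause for an exact match on a TAG field and, for comparison, the degenerate range that a NUMERIC field needs for the same lookup. The helper names `tag_match` and `numeric_match` are hypothetical, and the escape set is a deliberately conservative assumption: the characters that actually require escaping depend on your Redis version and dialect, so check the tag-field documentation before relying on it.

```python
import re

# Assumption: a conservative set of characters to backslash-escape inside a
# TAG value. Escaping more than is strictly required is intended to be safe;
# the exact rules depend on the Redis version and query dialect.
_TAG_SPECIAL = re.compile(r"([,.<>{}\[\]\"':;!@#$%^&*()\-+=~|\\/\s])")


def tag_match(field: str, value: str) -> str:
    """Build an exact-match clause for a TAG field."""
    escaped = _TAG_SPECIAL.sub(r"\\\1", value)
    return f"@{field}:{{{escaped}}}"


def numeric_match(field: str, value: int) -> str:
    """Build the NUMERIC equivalent: a range whose bounds are equal."""
    return f"@{field}:[{value} {value}]"


print(tag_match("ver", "1.2-beta"))  # @ver:{1\.2\-beta}
print(numeric_match("t", 1299))      # @t:[1299 1299]
```

If the values are only ever compared for equality, the TAG form expresses that intent directly, whereas the NUMERIC form has to go through a range query.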

Non-threaded search

Put only those fields used in your queries in the index.
Only make fields [SORTABLE]({{< relref "/develop/ai/search-and-query/advanced-concepts/sorting" >}}) if they are used in [SORTBY]({{< relref "/develop/ai/search-and-query/advanced-concepts/sorting#specifying-sortby" >}}) queries.
Use [DIALECT 2]({{< relref "/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2" >}}).

Threaded (query performance factor or QPF) search

Put both query fields and any projected fields (RETURN or LOAD) in the index.
Set all fields to SORTABLE.
Set TAG fields to [UNF]({{< relref "/develop/ai/search-and-query/advanced-concepts/sorting#normalization-unf-option" >}}).
Optional: Set TEXT fields to NOSTEM if the use case will support it.
Use [DIALECT 2]({{< relref "/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2" >}}).
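The two checklists above differ mainly in which fields carry SORTABLE and UNF. The following sketch encodes that difference as a small helper that produces the SCHEMA portion of an FT.CREATE command as an argument list. The function name `schema_args`, the `Field` tuple layout, and the `nostem` switch are assumptions made for illustration, not part of any Redis client library.

```python
from typing import List, Sequence, Tuple

# (JSON path, alias, field type), for example ("$.id", "id", "TAG")
Field = Tuple[str, str, str]


def schema_args(fields: Sequence[Field], threaded: bool,
                sort_by: Sequence[str] = (), nostem: bool = False) -> List[str]:
    """Return the SCHEMA arguments of FT.CREATE for the given fields."""
    args: List[str] = ["SCHEMA"]
    for path, alias, ftype in fields:
        args += [path, "AS", alias, ftype]
        if threaded:
            # Threaded (QPF) search: every field is SORTABLE, TAG fields
            # are also UNF, and TEXT fields may optionally be NOSTEM.
            if ftype == "TEXT" and nostem:
                args.append("NOSTEM")
            args.append("SORTABLE")
            if ftype == "TAG":
                args.append("UNF")
        elif alias in sort_by:
            # Non-threaded search: SORTABLE only where SORTBY is used.
            args.append("SORTABLE")
    return args


fields = [("$.id", "id", "TAG"), ("$.firstName", "name", "TEXT")]
print(" ".join(schema_args(fields, threaded=True, nostem=True)))
print(" ".join(schema_args(fields, threaded=False, sort_by=["name"])))
```

The resulting list can be appended to `FT.CREATE <index> ON JSON PREFIX ...` and sent with whatever client you use. Keeping the rules in one function makes it harder for the two index styles to drift apart.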

Query optimization

Avoid returning large result sets. Page through results with LIMIT, or with a cursor (WITHCURSOR on FT.AGGREGATE).
Avoid wildcard searches.
Avoid projecting all fields (e.g., LOAD *). Project only those fields that are part of the index schema.
If queries are long-running, enable threading (query performance factor) to reduce contention for the main Redis thread.
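As a rough illustration of the first and third points, the sketch below builds argument lists for a LIMIT-bounded search and for a cursor-based aggregation that projects an explicit field list instead of `LOAD *`. The helper names are hypothetical. The lists are only constructed here, not sent; with a client such as redis-py they could be passed to `execute_command(*args)`, which is an assumption about your setup.

```python
from typing import List, Sequence


def limited_search(index: str, query: str, offset: int, num: int) -> List[str]:
    """FT.SEARCH that returns at most `num` documents starting at `offset`."""
    return ["FT.SEARCH", index, query,
            "LIMIT", str(offset), str(num), "DIALECT", "2"]


def aggregate_with_cursor(index: str, query: str, load: Sequence[str],
                          batch: int) -> List[str]:
    """FT.AGGREGATE that loads only the named fields and opens a cursor."""
    return ["FT.AGGREGATE", index, query,
            "LOAD", str(len(load)), *load,   # count derived from the list
            "WITHCURSOR", "COUNT", str(batch),
            "DIALECT", "2"]


def cursor_read(index: str, cursor_id: int, batch: int) -> List[str]:
    """FT.CURSOR READ for the next batch; stop when the cursor id is 0."""
    return ["FT.CURSOR", "READ", index, str(cursor_id), "COUNT", str(batch)]


print(limited_search("jsonidx:profiles", "@t:[1299 1299]", 0, 10))
print(aggregate_with_cursor("jsonidx:profiles", "@t:[1299 1299]",
                            ["id", "name"], 100))
```

Deriving the LOAD count from the field list avoids a mismatch between the declared count and the number of names that follow it.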

Validate performance (`FT.PROFILE`)

You can analyze [FT.PROFILE]({{< relref "commands/ft.profile" >}}) output to gain insights about query execution. The following informational items are available for analysis:

Total execution time
Execution time per shard
Coordination time (for multi-sharded environments)
Breakdown of the query into fundamental components, such as UNION and INTERSECT
Warnings, such as TIMEOUT
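One convenient way to collect this output is to profile exactly the command you already run. The sketch below wraps an existing FT.SEARCH or FT.AGGREGATE argument list in FT.PROFILE. The helper name `profile_args` is hypothetical, and the sketch only builds the arguments; it does not parse the reply, whose layout varies between versions.

```python
from typing import List, Sequence


def profile_args(command: Sequence[str], limited: bool = False) -> List[str]:
    """Wrap an FT.SEARCH or FT.AGGREGATE argument list in FT.PROFILE."""
    name, index, *rest = command
    kinds = {"FT.SEARCH": "SEARCH", "FT.AGGREGATE": "AGGREGATE"}
    kind = kinds.get(name.upper())
    if kind is None:
        raise ValueError(f"cannot profile {name!r}")
    args = ["FT.PROFILE", index, kind]
    if limited:
        # LIMITED trims the per-iterator detail from the reply.
        args.append("LIMITED")
    return args + ["QUERY", *rest]


query = ["FT.AGGREGATE", "jsonidx:profiles", "@t:[1299 1299]",
         "LOAD", "2", "id", "name", "LIMIT", "0", "10", "DIALECT", "2"]
print(" ".join(profile_args(query)))
```

Because the profiled arguments are derived from the real ones, the timings you inspect correspond to the query your application actually issues.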

Anti-patterns

When designing and querying indexes in Redis Search, certain practices can hinder performance, scalability, and maintainability. Below are some common anti-patterns to avoid:

Large documents: storing excessively large documents in Redis makes data retrieval slower and increases memory usage. Break data into smaller, focused records whenever possible.
Deeply-nested fields: retrieving or indexing deeply-nested JSON fields is computationally expensive. Use a flatter schema for better performance.
Large result sets: fetching unnecessarily large result sets puts a strain on memory and network resources. Limit results to only what is needed.
Wildcarding: using wildcard patterns indiscriminately in queries can lead to large and inefficient scans, especially if the index size is significant.
Large projections: including excessive fields in query results increases memory overhead and slows down query execution. Limit projections to essential fields.

The following examples depict an anti-pattern index schema and query, followed by corrected versions designed for scalability with Redis Search.

Anti-pattern index schema

The following schema introduces challenges for scalability and performance:

FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: 
          SCHEMA $.tags.* as t NUMERIC SORTABLE 
                 $.firstName as name TEXT 
                 $.location as loc GEO

Issues:

Minimal schema definition: the schema is sparse and lacks fields like lastName, id, and ver (version) that might be frequently queried. This results in additional operations to fetch these fields separately, reducing efficiency.
Missing SORTABLE flag for text fields: sorting operations on unsortable fields require full-text processing, which is slow.
Wildcard indexing: $.tags.* creates a broad index that can lead to excessive memory usage and reduced query performance.

Anti-pattern query

The following query is inefficient and not optimized for vertical scaling:

FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' LOAD * LIMIT 0 10

Issues:

Wildcard projection (LOAD *): retrieving all fields in the result set is inefficient and increases memory usage, especially if the documents are large.
Unnecessary fields: fields that aren't required for the current operation are still fetched, slowing down execution.
Lack of advanced query syntax: without specifying a query dialect or leveraging features like tagging, the query may perform unnecessary computations.

Improved index schema

Here’s an optimized schema that adheres to best practices for vertical scaling:

FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: 
          SCHEMA $.tags.* as t NUMERIC SORTABLE 
                 $.firstName as name TEXT NOSTEM SORTABLE 
                 $.lastName as lastname TEXT NOSTEM SORTABLE 
                 $.location as loc GEO SORTABLE 
                 $.id as id TAG SORTABLE UNF 
                 $.ver as ver TAG SORTABLE UNF

Improvements:

NOSTEM for text fields: prevents stemming on fields like firstName and lastName to allow for exact matches (e.g., "Smith" stays "Smith").
Expanded schema: adds commonly queried fields like lastName, id, and ver (version), making queries more efficient by reducing the need for post-query data retrieval.
TAG fields: id and ver are defined as TAG fields to support fast filtering with exact matches.
SORTABLE for all relevant fields: ensures that sorting operations are efficient without requiring full-text scanning.

You might wonder why $.tags.* as t NUMERIC SORTABLE is acceptable in the improved schema when it was flagged as an issue earlier. Including $.tags.* is acceptable when:

It has a clear purpose: it is actively used in queries, such as filtering on numeric ranges or matching specific values.
Other fields in the schema complement it: these fields reduce over-reliance on $.tags.* for all query operations, distributing the load more evenly.
Projections and limits are managed carefully: queries that use $.tags.* should avoid loading unnecessary fields or returning excessively large result sets.

Improved query

The following query is better suited for vertical scaling:

FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' 
                LOAD 6 id t name lastname loc ver 
                LIMIT 0 10
                DIALECT 2