Scalable Query Best Practices

{{< note >}} If you're using Redis Software or Redis Cloud, see the [best practices for scalable Redis Search]({{< relref "/operate/oss_and_stack/stack-with-enterprise/search/scalable-query-best-practices" >}}) page. {{< /note >}}

Checklist

Below are some basic steps to ensure good performance of Redis Search.

  • Create a Redis data model with your query patterns in mind.
  • Ensure the Redis architecture has been sized for the expected load using the sizing calculator.
  • Provision Redis nodes with sufficient resources (RAM, CPU, network) to support the expected maximum load.
  • Review [FT.INFO]({{< relref "commands/ft.info" >}}) and [FT.PROFILE]({{< relref "commands/ft.profile" >}}) outputs for anomalies and/or errors.
  • Conduct load testing in a test environment with real-world queries and a load generated by either memtier_benchmark or a custom load application.
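
As one way to automate the FT.INFO review step, the sketch below scans an FT.INFO reply that has already been converted to a Python dict. That dict shape is an assumption, because client libraries differ in how they return the reply. `num_docs`, `hash_indexing_failures`, `indexing`, and `percent_indexed` are FT.INFO fields; the function name `find_index_anomalies` and the choice of checks are illustrative only.

```python
def find_index_anomalies(info):
    """Return human-readable warnings for a dict-shaped FT.INFO reply."""
    warnings = []
    # Values may arrive as strings, so normalize through float() first.
    failures = int(float(info.get("hash_indexing_failures", 0)))
    if failures > 0:
        warnings.append(f"{failures} document(s) failed to index")
    # A non-zero "indexing" flag means a background scan is still running.
    if int(float(info.get("indexing", 0))) != 0:
        pct = float(info.get("percent_indexed", 0)) * 100
        warnings.append(f"index still building ({pct:.0f}% indexed)")
    # An empty index is often a sign that PREFIX matches no keys.
    if int(float(info.get("num_docs", 0))) == 0:
        warnings.append("index contains no documents")
    return warnings


# Hypothetical reply values, for illustration only.
sample = {"num_docs": "1200", "hash_indexing_failures": "3",
          "indexing": "0", "percent_indexed": "1"}
print(find_index_anomalies(sample))
```

A check like this could run after each deployment or data load, before any load test starts.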

Indexing considerations

General

  • Favor [TAG]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#tag-fields" >}}) over [NUMERIC]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#numeric-fields" >}}) for use cases that only require matching.
  • Favor [TAG]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#tag-fields" >}}) over [TEXT]({{< relref "/develop/ai/search-and-query/indexing/field-and-type-options#text-fields" >}}) for use cases that don’t require full-text capabilities (pure match).

Non-threaded search

  • Put only those fields used in your queries in the index.
  • Only make fields [SORTABLE]({{< relref "/develop/ai/search-and-query/advanced-concepts/sorting" >}}) if they are used in [SORTBY]({{< relref "/develop/ai/search-and-query/advanced-concepts/sorting#specifying-sortby" >}}) queries.
  • Use [DIALECT 2]({{< relref "/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2" >}}).

Threaded (query performance factor or QPF) search

  • Put both query fields and any projected fields (RETURN or LOAD) in the index.
  • Set all fields to SORTABLE.
  • Set TAG fields to [UNF]({{< relref "/develop/ai/search-and-query/advanced-concepts/sorting#normalization-unf-option" >}}).
  • Optional: Set TEXT fields to NOSTEM if the use case will support it.
  • Use [DIALECT 2]({{< relref "/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2" >}}).
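
The rules about SORTABLE, UNF, and NOSTEM can be applied mechanically when a schema is generated. The following is a minimal sketch, assuming the "set all fields to SORTABLE" variant of the guidance and a JSON index; `build_create_args` and its tuple format are hypothetical helpers, not part of any Redis client.

```python
def build_create_args(index, prefix, fields):
    """Build FT.CREATE arguments for a JSON index.

    fields: list of (json_path, alias, field_type) tuples.
    """
    args = ["FT.CREATE", index, "ON", "JSON", "PREFIX", 1, prefix, "SCHEMA"]
    for path, alias, field_type in fields:
        args += [path, "AS", alias, field_type]
        if field_type == "TEXT":
            args.append("NOSTEM")  # optional: only if stemming isn't needed
        args.append("SORTABLE")    # every field is made SORTABLE
        if field_type == "TAG":
            args.append("UNF")     # keep TAG values un-normalized
    return args


args = build_create_args(
    "jsonidx:profiles", "profiles:",
    [("$.id", "id", "TAG"), ("$.firstName", "name", "TEXT")],
)
print(" ".join(str(a) for a in args))
```

The resulting list could be passed to a client's generic command method, for example redis-py's `execute_command(*args)`.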

Query optimization

  • Avoid returning large result sets. Use CURSOR or LIMIT.
  • Avoid wildcard searches.
  • Avoid projecting all fields (e.g., LOAD *). Project only those fields that are part of the index schema.
  • If queries are long-running, enable threading (query performance factor) to reduce contention for the main Redis thread.

Validate performance (FT.PROFILE)

You can analyze [FT.PROFILE]({{< relref "commands/ft.profile" >}}) output to gain insights about query execution. The following informational items are available for analysis:

  • Total execution time
  • Execution time per shard
  • Coordination time (for multi-sharded environments)
  • Breakdown of the query into fundamental components, such as UNION and INTERSECT
  • Warnings, such as TIMEOUT
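
FT.PROFILE takes the same arguments as the query it profiles, in the form `FT.PROFILE index SEARCH|AGGREGATE [LIMITED] QUERY ...`. The sketch below converts an existing FT.SEARCH or FT.AGGREGATE argument list into its profiled form; `to_profile_args` is a hypothetical helper, not a client API.

```python
def to_profile_args(command_args, limited=False):
    """Wrap FT.SEARCH / FT.AGGREGATE arguments in an FT.PROFILE call."""
    name, index, *rest = command_args
    kinds = {"FT.SEARCH": "SEARCH", "FT.AGGREGATE": "AGGREGATE"}
    kind = kinds.get(str(name).upper())
    if kind is None:
        raise ValueError("only FT.SEARCH and FT.AGGREGATE can be profiled")
    prefix = ["FT.PROFILE", index, kind]
    if limited:
        prefix.append("LIMITED")  # request a less detailed profile
    # Everything after the index name (query string and options) follows QUERY.
    return prefix + ["QUERY", *rest]


query = ["FT.AGGREGATE", "jsonidx:profiles", "@t:[1299 1299]", "LIMIT", 0, 10]
print(to_profile_args(query))
```

Profiling the exact argument list that production code sends avoids drawing conclusions from a query that differs subtly from the real one.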

Anti-patterns

When designing and querying indexes in Redis Search, certain practices can hinder performance, scalability, and maintainability. Below are some common anti-patterns to avoid:

  • Large documents: storing excessively large documents in Redis makes data retrieval slower and increases memory usage. Break data into smaller, focused records whenever possible.
  • Deeply-nested fields: retrieving or indexing deeply-nested JSON fields is computationally expensive. Use a flatter schema for better performance.
  • Large result sets: fetching unnecessarily large result sets puts a strain on memory and network resources. Limit results to only what is needed.
  • Wildcarding: using wildcard patterns indiscriminately in queries can lead to large and inefficient scans, especially if the index size is significant.
  • Large projections: including excessive fields in query results increases memory overhead and slows down query execution. Limit projections to essential fields.
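
For the deeply-nested fields item, one possible approach is to flatten documents before writing them, so that every indexed value sits at a top-level JSON path. The sketch below is illustrative only: `flatten`, the underscore separator, and the sample document are assumptions, and the function does not guard against key collisions (for example, an existing `a_b` key alongside a nested `a.b`).

```python
def flatten(doc, parent="", sep="_"):
    """Flatten nested dicts into a single level of keys joined by `sep`."""
    flat = {}
    for key, value in doc.items():
        name = f"{parent}{sep}{key}" if parent else key
        if isinstance(value, dict):
            flat.update(flatten(value, name, sep))
        else:
            flat[name] = value  # scalars and lists are kept as-is
    return flat


# Hypothetical nested profile document.
nested = {"id": "u1", "profile": {"name": {"first": "Ada"}, "tags": [1, 2]}}
print(flatten(nested))
```

With the flattened form, a schema can reference a path such as `$.profile_name_first` instead of `$.profile.name.first`.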

The following examples depict an anti-pattern index schema and query, followed by corrected versions designed for scalability with Redis Search.

Anti-pattern index schema

The following schema introduces challenges for scalability and performance:

```sh
FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: 
          SCHEMA $.tags.* as t NUMERIC SORTABLE 
                 $.firstName as name TEXT 
                 $.location as loc GEO
```

Issues:

  • Minimal schema definition: the schema is sparse and lacks fields like lastName, id, and version that might be frequently queried. This results in additional operations to fetch these fields separately, reducing efficiency.
  • Missing SORTABLE flag for text fields: sorting on a field that isn't SORTABLE requires loading its value from each matching document at query time, which is slow.
  • Wildcard indexing: $.tags.* creates a broad index that can lead to excessive memory usage and reduced query performance.

Anti-pattern query

The following query is inefficient and not optimized for vertical scaling:

```sh
FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' LOAD * LIMIT 0 10
```

Issues:

  • Wildcard projection (LOAD *): retrieving all fields in the result set is inefficient and increases memory usage, especially if the documents are large.
  • Unnecessary fields: fields that aren't required for the current operation are still fetched, slowing down execution.
  • Lack of advanced query syntax: without specifying a query dialect or leveraging features like tagging, the query may perform unnecessary computations.
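
Checks like these can be automated in code review or CI. The following is a deliberately naive sketch that inspects a query's argument list for the issues listed above; `lint_query` and its messages are hypothetical, and it only looks at option keywords after the query string.

```python
def lint_query(args):
    """Flag common scalability issues in FT.SEARCH / FT.AGGREGATE arguments."""
    # Skip the command name, index, and query string; inspect options only.
    options = [str(a).upper() for a in args[3:]]
    issues = []
    for i, token in enumerate(options[:-1]):
        if token == "LOAD" and options[i + 1] == "*":
            issues.append("LOAD * projects every field")
    if "LIMIT" not in options and "WITHCURSOR" not in options:
        issues.append("no LIMIT or cursor bounds the result set")
    if "DIALECT" not in options:
        issues.append("no explicit DIALECT")
    return issues


anti_pattern = ["FT.AGGREGATE", "jsonidx:profiles", "@t:[1299 1299]",
                "LOAD", "*", "LIMIT", 0, 10]
print(lint_query(anti_pattern))
```

Run against the anti-pattern query, the sketch reports the wildcard projection and the missing dialect; the query passes the LIMIT check because it already caps results at 10.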

Improved index schema

Here’s an optimized schema that adheres to best practices for vertical scaling:

```sh
FT.CREATE jsonidx:profiles ON JSON PREFIX 1 profiles: 
          SCHEMA $.tags.* as t NUMERIC SORTABLE 
                 $.firstName as name TEXT NOSTEM SORTABLE 
                 $.lastName as lastname TEXT NOSTEM SORTABLE 
                 $.location as loc GEO SORTABLE 
                 $.id as id TAG SORTABLE UNF 
                 $.ver as ver TAG SORTABLE UNF
```

Improvements:

  • NOSTEM for text fields: prevents stemming on fields like firstName and lastName to allow for exact matches (e.g., "Smith" stays "Smith").
  • Expanded schema: adds commonly queried fields like lastName, id, and ver (the version), making queries more efficient by reducing the need for post-query data retrieval.
  • TAG fields: id and ver are defined as TAG fields to support fast filtering with exact matches.
  • SORTABLE for all relevant fields: ensures that sorting operations are efficient, because the values needed for sorting don't have to be loaded from each document at query time.

You might be wondering why $.tags.* as t NUMERIC SORTABLE is acceptable in the improved schema when it was flagged as an issue in the anti-pattern schema. The inclusion of $.tags.* is acceptable when:

  • It has a clear purpose: it is actively used in queries, such as filtering on numeric ranges or matching specific values.
  • Other fields in the schema complement it: these fields reduce over-reliance on $.tags.* for all query operations, distributing the load more evenly.
  • Projections and limits are managed carefully: queries that use $.tags.* should avoid loading unnecessary fields or returning excessively large result sets.

Improved query

The following query is better suited for vertical scaling:

```sh
FT.AGGREGATE jsonidx:profiles '@t:[1299 1299]' 
                LOAD 6 id t name lastname loc ver 
                LIMIT 0 10
                DIALECT 2
```

Improvements:

  • Targeted projection: the LOAD clause specifies only essential fields (id, t, name, lastname, loc, ver), reducing memory and network overhead.
  • Limited results: the LIMIT clause ensures the query retrieves only the first 10 results, avoiding large result sets.
  • [DIALECT 2]({{< relref "/develop/ai/search-and-query/advanced-concepts/dialects#dialect-2" >}}): explicitly selects the dialect 2 query syntax, so the query is parsed the same way regardless of the server's default dialect.