Schema in RedisVL provides a structured format to define index settings and field configurations using the following three components:
| Component | Description |
|---|---|
| version | The version of the schema spec. Current supported version is 0.1.0. |
| index | Index specific settings like name, key prefix, key separator, and storage type. |
| fields | Subset of fields within your data to include in the index and any custom settings. |
<a id="indexschema-api"></a>
`class IndexSchema(*, index, fields=<factory>, version='0.1.0')`

A schema definition for a search index in Redis, used in RedisVL for configuring index settings and organizing vector and metadata fields.
The class offers methods to create an index schema from a YAML file or a Python dictionary, supporting flexible schema definitions and easy integration into various workflows.
An example schema.yaml file might look like this:

```yaml
version: '0.1.0'

index:
  name: user-index
  prefix: user
  key_separator: ":"
  storage_type: json

fields:
  - name: user
    type: tag
  - name: credit_score
    type: tag
  - name: embedding
    type: vector
    attrs:
      algorithm: flat
      dims: 3
      distance_metric: cosine
      datatype: float32
```
Loading the schema for RedisVL from YAML is as simple as:

```python
from redisvl.schema import IndexSchema

schema = IndexSchema.from_yaml("schema.yaml")
```
Loading the schema for RedisVL from a dict is as simple as:

```python
from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {
        "name": "user-index",
        "prefix": "user",
        "key_separator": ":",
        "storage_type": "json",
    },
    "fields": [
        {"name": "user", "type": "tag"},
        {"name": "credit_score", "type": "tag"},
        {
            "name": "embedding",
            "type": "vector",
            "attrs": {
                "algorithm": "flat",
                "dims": 3,
                "distance_metric": "cosine",
                "datatype": "float32"
            }
        }
    ]
})
```
NOTE: The fields attribute in the schema must contain unique field names to ensure correct and unambiguous field references.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`add_field(field_inputs)`

Adds a single field to the index schema based on the specified field type and attributes.

This method allows for the addition of individual fields to the schema, providing flexibility in defining the structure of the index.

```python
# Add a tag field
schema.add_field({"name": "user", "type": "tag"})

# Add a vector field
schema.add_field({
    "name": "user-embedding",
    "type": "vector",
    "attrs": {
        "dims": 1024,
        "algorithm": "flat",
        "datatype": "float32"
    }
})
```
`add_fields(fields)`

Extends the schema with additional fields.

This method allows dynamically adding new fields to the index schema. It processes a list of field definitions.

```python
schema.add_fields([
    {"name": "user", "type": "tag"},
    {"name": "bio", "type": "text"},
    {
        "name": "user-embedding",
        "type": "vector",
        "attrs": {
            "dims": 1024,
            "algorithm": "flat",
            "datatype": "float32"
        }
    }
])
```
`classmethod from_dict(data)`

Create an IndexSchema from a dictionary.

```python
from redisvl.schema import IndexSchema

schema = IndexSchema.from_dict({
    "index": {
        "name": "docs-index",
        "prefix": "docs",
        "storage_type": "hash",
    },
    "fields": [
        {
            "name": "doc-id",
            "type": "tag"
        },
        {
            "name": "doc-embedding",
            "type": "vector",
            "attrs": {
                "algorithm": "flat",
                "dims": 1536
            }
        }
    ]
})
```
`classmethod from_yaml(file_path)`

Create an IndexSchema from a YAML file.

```python
from redisvl.schema import IndexSchema

schema = IndexSchema.from_yaml("schema.yaml")
```
`remove_field(field_name)`

Removes a field from the schema based on the specified name.

This method is useful for dynamically altering the schema by removing existing fields.

`to_dict()`

Serialize the index schema model to a dictionary, handling Enums and other special cases properly.

`to_yaml(file_path, overwrite=True)`

Write the index schema to a YAML file.
property field_names: List[str] - A list of field names associated with the index schema.

fields: Dict[str, BaseField] - Fields associated with the search index and their properties.

Note: When creating from dict/YAML, provide fields as a list of field definitions. The validator will convert them to a Dict[str, BaseField] internally.

index: IndexInfo - Details of the basic index configurations.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

version: Literal['0.1.0'] - Version of the underlying index schema.
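The list-to-dict conversion described in the note above (and the unique-name requirement) can be sketched in plain Python. This is a hypothetical re-implementation for illustration, not RedisVL's actual validator:

```python
# Sketch: a list of field definitions becomes a dict keyed by field
# name, and duplicate names are rejected because they would make
# field references ambiguous. Illustration only.

def fields_list_to_dict(field_defs):
    fields = {}
    for field_def in field_defs:
        name = field_def["name"]
        if name in fields:
            raise ValueError(f"Duplicate field name: {name!r}")
        fields[name] = field_def
    return fields

field_map = fields_list_to_dict([
    {"name": "user", "type": "tag"},
    {"name": "embedding", "type": "vector"},
])
print(sorted(field_map))  # the field names become the dict keys
```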
Fields in the schema can be defined in YAML format or as a Python dictionary, specifying a name, type, an optional path, and attributes for customization.
YAML Example:

```yaml
- name: title
  type: text
  path: $.document.title
  attrs:
    weight: 1.0
    no_stem: false
    withsuffixtrie: true
```

Python Dictionary Example:

```python
{
    "name": "location",
    "type": "geo",
    "attrs": {
        "sortable": True
    }
}
```
RedisVL supports several basic field types for indexing different kinds of data. Each field type has specific attributes that customize its indexing and search behavior.
## Text Fields

Text fields support full-text search with stemming, phonetic matching, and other text analysis features.
`class TextField(*, name, type=FieldTypes.TEXT, path=None, attrs=<factory>)`

Bases: BaseField

Text field supporting a full text search index.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: TextFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal[FieldTypes.TEXT] - Field type.
`class TextFieldAttributes(*, sortable=False, index_missing=False, no_index=False, weight=1, no_stem=False, withsuffixtrie=False, phonetic_matcher=None, index_empty=False, unf=False)`

Full text field attributes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
index_empty: bool - Allow indexing and searching for empty strings.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

no_stem: bool - Disable stemming on the text field during indexing.

phonetic_matcher: str | None - Used to perform phonetic matching during search.

unf: bool - Un-normalized form: disable normalization on sortable fields (only applies when sortable=True).

weight: float - Declares the importance of this field when calculating results.

withsuffixtrie: bool - Keep a suffix trie with all terms which match the suffix, to optimize certain queries.
## Tag Fields

Tag fields are optimized for exact-match filtering and faceted search on categorical data.
`class TagField(*, name, type=FieldTypes.TAG, path=None, attrs=<factory>)`

Bases: BaseField

Tag field for simple boolean-style filtering.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: TagFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal[FieldTypes.TAG] - Field type.
`class TagFieldAttributes(*, sortable=False, index_missing=False, no_index=False, separator=',', case_sensitive=False, withsuffixtrie=False, index_empty=False)`

Tag field attributes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
case_sensitive: bool - Treat text as case sensitive or not. By default, tag characters are converted to lowercase.

index_empty: bool - Allow indexing and searching for empty strings.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

separator: str - Indicates how the text in the original attribute is split into individual tags.

withsuffixtrie: bool - Keep a suffix trie with all terms which match the suffix, to optimize certain queries.
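The effect of `separator` and `case_sensitive` on tag tokenization can be illustrated with plain Python string handling. This is a sketch of the observable behavior, not RediSearch's internal implementation:

```python
# Sketch: how a tag field's source text is split into individual tags:
# split on the separator, trim whitespace, and lowercase unless
# case_sensitive is enabled. Illustration only.

def tokenize_tags(value, separator=",", case_sensitive=False):
    tags = [tag.strip() for tag in value.split(separator)]
    if not case_sensitive:
        tags = [tag.lower() for tag in tags]
    return tags

print(tokenize_tags("Redis, Search, VECTOR"))                      # lowercased by default
print(tokenize_tags("Redis|VECTOR", separator="|", case_sensitive=True))  # case preserved
```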
## Numeric Fields

Numeric fields support range queries and sorting on numeric data.
`class NumericField(*, name, type=FieldTypes.NUMERIC, path=None, attrs=<factory>)`

Bases: BaseField

Numeric field for numeric range filtering.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: NumericFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal[FieldTypes.NUMERIC] - Field type.
`class NumericFieldAttributes(*, sortable=False, index_missing=False, no_index=False, unf=False)`

Numeric field attributes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

unf: bool - Un-normalized form: disable normalization on sortable fields (only applies when sortable=True).
## Geo Fields

Geo fields enable location-based search with geographic coordinates.
`class GeoField(*, name, type=FieldTypes.GEO, path=None, attrs=<factory>)`

Bases: BaseField

Geo field with a geo-spatial index for location based search.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: GeoFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal[FieldTypes.GEO] - Field type.
`class GeoFieldAttributes(*, sortable=False, index_missing=False, no_index=False)`

Geo field attributes.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
## Vector Fields

Vector fields enable semantic similarity search using various algorithms. All vector fields share common attributes but have algorithm-specific configurations.
### Common Vector Attributes

All vector field types share these base attributes:

`class BaseVectorFieldAttributes(*, dims, algorithm, datatype=VectorDataType.FLOAT32, distance_metric=VectorDistanceMetric.COSINE, initial_cap=None, index_missing=False)`

Base vector field attributes shared by FLAT, HNSW, and SVS-VAMANA fields.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`classmethod uppercase_strings(v)` - Validate that provided values are cast to uppercase.

algorithm: VectorIndexAlgorithm - FLAT, HNSW, or SVS-VAMANA.

datatype: VectorDataType - The float datatype for the vector embeddings.

dims: int - Dimensionality of the vector embeddings field.

distance_metric: VectorDistanceMetric - The distance metric used to measure query relevance.

property field_data: Dict[str, Any] - Select attributes required by the Redis API.

index_missing: bool - Allow indexing and searching for missing values (documents without the field).

initial_cap: int | None - Initial vector capacity in the index, affecting the memory allocation size of the index.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
Key Attributes:

- dims: Dimensionality of the vector (e.g., 768, 1536).
- algorithm: Indexing algorithm for vector search (flat, hnsw, or svs-vamana). For detailed algorithm comparison and selection guidance, see Vector Algorithm Comparison.
- datatype: Float precision (bfloat16, float16, float32, float64). Note: SVS-VAMANA only supports float16 and float32.
- distance_metric: Similarity metric (COSINE, L2, IP).
- initial_cap: Initial capacity hint for memory allocation (optional).
- index_missing: When True, allows searching for documents missing this field (optional).
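The three distance metrics can be computed directly on small vectors to see how they differ. The following is a plain-Python illustration of the underlying math, not the Redis implementation:

```python
import math

def l2(a, b):
    # Euclidean (L2) distance: straight-line distance between points
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def ip_distance(a, b):
    # Inner-product distance: 1 - dot(a, b); smaller means more similar
    return 1 - sum(x * y for x, y in zip(a, b))

def cosine_distance(a, b):
    # Cosine distance: 1 - cos(angle); 0 for vectors pointing the same way
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

a = [1.0, 0.0, 0.0]
b = [0.0, 1.0, 0.0]
print(cosine_distance(a, a))  # 0.0: identical direction
print(cosine_distance(a, b))  # 1.0: orthogonal vectors
print(l2(a, b))               # sqrt(2)
```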
### HNSW Vector Fields

HNSW (Hierarchical Navigable Small World) - Graph-based approximate search with excellent recall. Best for general-purpose vector search (10K-1M+ vectors).

Use HNSW when you need fast approximate search over medium-to-large datasets and can accept approximate rather than exact results; see the Vector Algorithm Comparison section below for performance characteristics.
`class HNSWVectorField(*, name, type='vector', path=None, attrs)`

Bases: BaseField

Vector field with HNSW (Hierarchical Navigable Small World) indexing for approximate nearest neighbor search.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: HNSWVectorFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal['vector'] - Field type.
`class HNSWVectorFieldAttributes(*, dims, algorithm=VectorIndexAlgorithm.HNSW, datatype=VectorDataType.FLOAT32, distance_metric=VectorDistanceMetric.COSINE, initial_cap=None, index_missing=False, m=16, ef_construction=200, ef_runtime=10, epsilon=0.01)`

HNSW vector field attributes for approximate nearest neighbor search.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
algorithm: Literal[VectorIndexAlgorithm.HNSW] - The indexing algorithm (fixed as 'hnsw').

ef_construction: int - Number of neighbor candidates considered while building the graph (typical range 100-800; default 200).

ef_runtime: int - Number of top candidates held during the KNN search (default 10; higher values improve recall at the cost of query latency).

epsilon: float - Relative factor setting the boundaries for range queries (default 0.01).

m: int - Maximum number of outgoing edges per node in the graph (typical range 8-64; default 16).

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
HNSW Examples:

Balanced configuration (recommended starting point):

```yaml
- name: embedding
  type: vector
  attrs:
    algorithm: hnsw
    dims: 768
    distance_metric: cosine
    datatype: float32
    # Balanced settings for good recall and performance
    m: 16
    ef_construction: 200
    ef_runtime: 10
```

High-recall configuration:

```yaml
- name: embedding
  type: vector
  attrs:
    algorithm: hnsw
    dims: 768
    distance_metric: cosine
    datatype: float32
    # Tuned for maximum accuracy
    m: 32
    ef_construction: 400
    ef_runtime: 50
```
### SVS-VAMANA Vector Fields

SVS-VAMANA (Scalable Vector Search with VAMANA graph algorithm) provides fast approximate nearest neighbor search with optional compression support. This algorithm is optimized for Intel hardware and offers reduced memory usage through vector compression. Best for large datasets (>100K vectors) on Intel hardware with memory constraints.
Requirements:
- Redis >= 8.2.0 with RediSearch >= 2.8.10

Use SVS-VAMANA when:
- Large datasets where memory is expensive

Performance vs other algorithms:
- vs FLAT: Much faster search, significantly lower memory usage with compression, but approximate results

Compression selection guide: see the CompressionAdvisor utility below for recommendations based on vector dimensions and priorities.

Memory Savings Examples (1M vectors, 768 dims):
- No compression (float32): 3.1 GB
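The 3.1 GB baseline above is easy to verify from first principles: number of vectors times dimensions times 4 bytes per float32 component (raw vector data only, index overhead excluded):

```python
# Baseline memory for raw vector data: vectors * dims * bytes/component.
num_vectors = 1_000_000
dims = 768
bytes_per_float32 = 4

total_bytes = num_vectors * dims * bytes_per_float32
total_gb = total_bytes / 1e9  # decimal gigabytes

print(f"{total_gb:.1f} GB")  # ~3.1 GB, matching the figure above
```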
`class SVSVectorField(*, name, type=FieldTypes.VECTOR, path=None, attrs)`

Bases: BaseField

Vector field with SVS-VAMANA indexing and compression for memory-efficient approximate nearest neighbor search.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: SVSVectorFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal[FieldTypes.VECTOR] - Field type.
`class SVSVectorFieldAttributes(*, dims, algorithm=VectorIndexAlgorithm.SVS_VAMANA, datatype=VectorDataType.FLOAT32, distance_metric=VectorDistanceMetric.COSINE, initial_cap=None, index_missing=False, graph_max_degree=40, construction_window_size=250, search_window_size=20, epsilon=0.01, compression=None, reduce=None, training_threshold=None)`

SVS-VAMANA vector field attributes with compression support.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`validate_svs_params()` - Validate SVS-VAMANA specific constraints.

algorithm: Literal[VectorIndexAlgorithm.SVS_VAMANA] - The indexing algorithm for the vector field.

compression: CompressionType | None - Optional compression type: LVQ4, LVQ8, LeanVec4x8, or LeanVec8x8.

construction_window_size: int - Size of the candidate window used while building the graph (default 250).

epsilon: float - Relative factor setting the boundaries for range queries (default 0.01).

graph_max_degree: int - Maximum number of edges per node in the graph (default 40).

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

reduce: int | None - Dimensionality reduction for LeanVec types (must be < dims).

search_window_size: int - Size of the candidate window used at query time (default 20).

training_threshold: int | None - Number of vectors required before compression training runs (default 10,240).
SVS-VAMANA Examples:

Basic configuration (no compression):

```yaml
- name: embedding
  type: vector
  attrs:
    algorithm: svs-vamana
    dims: 768
    distance_metric: cosine
    datatype: float32
    # Standard settings for balanced performance
    graph_max_degree: 40
    construction_window_size: 250
    search_window_size: 20
```

High-performance configuration with compression:

```yaml
- name: embedding
  type: vector
  attrs:
    algorithm: svs-vamana
    dims: 768
    distance_metric: cosine
    datatype: float32
    # Tuned for better recall
    graph_max_degree: 64
    construction_window_size: 500
    search_window_size: 40
    # Maximum compression with dimensionality reduction
    compression: LeanVec4x8
    reduce: 384  # 50% dimensionality reduction
    training_threshold: 1000
```
Important Notes:
- SVS-VAMANA supports only float16 and float32 datatypes.
- reduce applies only to LeanVec compression types and must be less than dims.
### FLAT Vector Fields

FLAT - Brute-force exact search. Best for small datasets (<10K vectors) requiring 100% accuracy.
Use FLAT when:
- Small datasets (<100K vectors) where exact results are required

Performance characteristics:
- Search accuracy: 100% exact results (no approximation)

Trade-offs vs other algorithms:
- vs HNSW: Much slower search but exact results, faster index building
`class FlatVectorField(*, name, type=FieldTypes.VECTOR, path=None, attrs)`

Bases: BaseField

Vector field with FLAT (exact search) indexing for exact nearest neighbor search.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
`as_redis_field()` - Convert schema field to Redis Field object.

attrs: FlatVectorFieldAttributes - Specified field attributes.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].

type: Literal[FieldTypes.VECTOR] - Field type.
`class FlatVectorFieldAttributes(*, dims, algorithm=VectorIndexAlgorithm.FLAT, datatype=VectorDataType.FLOAT32, distance_metric=VectorDistanceMetric.COSINE, initial_cap=None, index_missing=False, block_size=None)`

FLAT vector field attributes for exact nearest neighbor search.
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
algorithm: Literal[VectorIndexAlgorithm.FLAT] - The indexing algorithm (fixed as 'flat').

block_size: int | None - Block size for processing (optional); improves batch operation throughput.

model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
FLAT Example:

```yaml
- name: embedding
  type: vector
  attrs:
    algorithm: flat
    dims: 768
    distance_metric: cosine
    datatype: float32
    # Optional: tune for batch processing
    block_size: 1024
```
Note: FLAT is recommended for small datasets or when exact results are mandatory. For larger datasets, consider HNSW or SVS-VAMANA for better performance.
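The brute-force behavior behind FLAT can be sketched in a few lines of plain Python: score every stored vector against the query, then sort. This is why search is exact but takes O(n) time (an illustration of the idea only, not Redis's implementation):

```python
import math

def cosine_distance(a, b):
    # Cosine distance: 1 - cos(angle between a and b)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1 - dot / (norm_a * norm_b)

def flat_knn(query, vectors, k=2):
    # Exact k-nearest-neighbor: score all vectors, sort, take top k.
    scored = sorted(vectors.items(), key=lambda kv: cosine_distance(query, kv[1]))
    return [doc_id for doc_id, _ in scored[:k]]

vectors = {
    "doc:1": [1.0, 0.0],
    "doc:2": [0.9, 0.1],
    "doc:3": [0.0, 1.0],
}
print(flat_knn([1.0, 0.0], vectors, k=2))  # the two vectors closest to the query
```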
For SVS-VAMANA indices, RedisVL provides utilities to help configure compression settings and estimate memory savings.
## CompressionAdvisor

`class CompressionAdvisor`

Bases: object

Helper to recommend compression settings based on vector characteristics.

This class provides utilities to recommend a compression configuration for a given vector setup and to estimate the resulting memory savings.
Examples:

```python
>>> # Get recommendations for high-dimensional vectors
>>> config = CompressionAdvisor.recommend(dims=1536, priority="balanced")
>>> config.compression
'LeanVec4x8'
>>> config.reduce
768
>>> # Estimate memory savings
>>> savings = CompressionAdvisor.estimate_memory_savings(
...     compression="LeanVec4x8",
...     dims=1536,
...     reduce=768
... )
>>> savings
81.2
```
`static estimate_memory_savings(compression, dims, reduce=None)`

Estimate memory savings percentage from compression.

Calculates the percentage of memory saved compared to uncompressed float32 vectors.

Examples:

```python
>>> # LeanVec with dimensionality reduction
>>> CompressionAdvisor.estimate_memory_savings(
...     compression="LeanVec4x8",
...     dims=1536,
...     reduce=768
... )
81.2
>>> # LVQ without dimensionality reduction
>>> CompressionAdvisor.estimate_memory_savings(
...     compression="LVQ4",
...     dims=384
... )
87.5
```
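The example values above are consistent with a simple model: savings = 1 - compressed bytes / (dims × 4 bytes), where LVQ4 stores about 4 bits per dimension and LeanVec4x8 stores about 12 bits per reduced dimension. The bits-per-dimension figures here are inferred from the example outputs, not RedisVL's documented formula:

```python
# Hypothetical bits stored per dimension for each compression type,
# inferred from the example outputs above (not an official reference).
BITS_PER_DIM = {"LVQ4": 4, "LVQ8": 8, "LeanVec4x8": 12, "LeanVec8x8": 16}

def estimate_savings(compression, dims, reduce=None):
    stored_dims = reduce if reduce is not None else dims
    compressed_bytes = stored_dims * BITS_PER_DIM[compression] / 8
    baseline_bytes = dims * 4  # uncompressed float32
    return round(100 * (1 - compressed_bytes / baseline_bytes), 1)

print(estimate_savings("LeanVec4x8", dims=1536, reduce=768))  # 81.2, as above
print(estimate_savings("LVQ4", dims=384))                     # 87.5, as above
```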
`static recommend(dims, priority='balanced', datatype=None)`

Recommend compression settings based on dimensions and priorities.

Examples:

```python
>>> # High-dimensional embeddings (e.g., OpenAI ada-002)
>>> config = CompressionAdvisor.recommend(dims=1536, priority="memory")
>>> config.compression
'LeanVec4x8'
>>> config.reduce
768
>>> # Lower-dimensional embeddings
>>> config = CompressionAdvisor.recommend(dims=384, priority="speed")
>>> config.compression
'LVQ4x8'
```
## SVSConfig

`class SVSConfig(*, algorithm='svs-vamana', datatype=None, compression=None, reduce=None, graph_max_degree=None, construction_window_size=None, search_window_size=None)`

Bases: BaseModel

SVS-VAMANA configuration model.

algorithm - Always "svs-vamana"

datatype - Vector datatype (float16, float32)

compression - Compression type (LVQ4, LeanVec4x8, etc.)

reduce - Reduced dimensionality (only for LeanVec)

graph_max_degree - Max edges per node

construction_window_size - Build-time candidates

search_window_size - Query-time candidates
Create a new model by parsing and validating input data from keyword arguments.
Raises [ValidationError][pydantic_core.ValidationError] if the input data cannot be validated to form a valid model.
self is explicitly positional-only to allow self as a field name.
model_config: ClassVar[ConfigDict] = {} - Configuration for the model, should be a dictionary conforming to [ConfigDict][pydantic.config.ConfigDict].
<a id="vector-algorithm-comparison"></a>
This section provides detailed guidance for choosing between vector search algorithms.
## Algorithm Selection Guide

Vector Algorithm Comparison:

| Algorithm | Best For | Performance | Memory Usage | Trade-offs |
|---|---|---|---|---|
| FLAT | Small datasets (<100K vectors) | 100% recall, O(n) search | Minimal overhead | Exact but slow for large data |
| HNSW | General purpose (100K-1M+ vectors) | 95-99% recall, O(log n) search | Moderate (graph overhead) | Fast approximate search |
| SVS-VAMANA | Large datasets with memory constraints | 90-95% recall, O(log n) search | Low (with compression) | Intel-optimized, requires newer Redis |
When to Use Each Algorithm

Choose FLAT when:
- Dataset size < 100,000 vectors

Choose HNSW when:
- Dataset size 100K - 1M+ vectors

Choose SVS-VAMANA when:
- Dataset size > 100K vectors
Performance Characteristics

Search Speed:
- FLAT: Linear time O(n) - gets slower as data grows
- HNSW: Logarithmic time O(log n)
- SVS-VAMANA: Logarithmic time O(log n)

Memory Usage (1M vectors, 768 dims, float32):
- FLAT: ~3.1 GB (baseline)

Recall Quality:
- FLAT: 100% (exact search)
- HNSW: 95-99%
- SVS-VAMANA: 90-95%
Migration Considerations

From FLAT to HNSW:
- Straightforward migration

From HNSW to SVS-VAMANA:
- Requires Redis >= 8.2 with RediSearch >= 2.8.10

From SVS-VAMANA to others:
- May need to change datatype back if using float64/bfloat16
For complete Redis field documentation, see: https://redis.io/commands/ft.create/