
# Semantic text field type reference [semantic-text]


This page provides reference content for the semantic_text field type, including parameter descriptions, {{infer}} endpoint configuration options, chunking behavior, update operations, querying options, and limitations.

## Parameters for semantic_text [semantic-text-params]

The semantic_text field type uses default indexing settings based on the {{infer}} endpoint specified, enabling you to get started without providing additional configuration details. You can override these defaults by customizing the parameters described below.

`inference_id`
:   (Optional, string) {{infer-cap}} endpoint that will be used to generate embeddings for the field. For default values, refer to default endpoints. If `search_inference_id` is specified, the {{infer}} endpoint will only be used at index time. Learn more about configuring this parameter.

Updating the inference_id parameter

:::::{applies-switch}

::::{applies-item} stack: ga 9.3+

You can update this parameter by using the Update mapping API. You can update the {{infer}} endpoint if no values have been indexed or if the new endpoint is compatible with the current one.

:::{important}
When updating an `inference_id`, it is important to ensure the new {{infer}} endpoint produces embeddings compatible with those already indexed. This typically means using the same underlying model.
:::

::::

::::{applies-item} stack: ga 9.0-9.2
This parameter cannot be updated.
::::

:::::
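For illustration, on {{stack}} 9.3+ a compatible endpoint could be swapped in with the Update mapping API. The index and endpoint names below are placeholders; the new endpoint must produce embeddings compatible with those already indexed:

```console
PUT my-index-000004/_mapping
{
  "properties": {
    "inference_field": {
      "type": "semantic_text",
      "inference_id": "my-compatible-endpoint"
    }
  }
}
```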

`search_inference_id`
:   (Optional, string) The {{infer}} endpoint that will be used to generate embeddings at query time. Use the Create {{infer}} API to create the endpoint. If not specified, the {{infer}} endpoint defined by `inference_id` will be used at both index and query time.

You can update this parameter by using the Update mapping API.

Learn how to [use dedicated endpoints for ingestion and search](./semantic-text-setup-configuration.md#dedicated-endpoints-for-ingestion-and-search).
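As a sketch, a mapping that uses separate endpoints for ingestion and search might look like the following. Both endpoint names are illustrative and must refer to endpoints you have already created:

```console
PUT my-index-000005
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-ingest-endpoint",
        "search_inference_id": "my-search-endpoint"
      }
    }
  }
}
```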

`index_options` {applies_to}stack: ga 9.1
:   (Optional, object) Specifies the index options to override default values for the field. `dense_vector` and `sparse_vector` index options are supported.

:::{note}
This parameter configures vector indexing structures. It is distinct from the [`index_options`](/reference/elasticsearch/mapping-reference/index-options.md) parameter used by term-based fields to control whether term frequencies, positions, and offsets are stored in the inverted index.
:::

`chunking_settings` {applies_to}stack: ga 9.1
:   (Optional, object) Settings for chunking text into smaller passages. If specified, these will override the chunking settings set in the {{infer}} endpoint associated with `inference_id`. If chunking settings are updated, they will not be applied to existing documents until they are reindexed. Defaults to the optimal chunking settings for Elastic Rerank.

To completely disable chunking, use the `none` chunking strategy.

::::{important}
When using the `none` chunking strategy, if the input exceeds the maximum token limit of the underlying model, some services (such as OpenAI) may return an error. In contrast, the `elastic` and `elasticsearch` services will automatically truncate the input to fit within the model's limit.
::::

### Customizing semantic_text indexing

The following example shows how to configure `inference_id`, `index_options`, and `chunking_settings` for a `semantic_text` field:

```console
PUT my-index-000004
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint", <1>
        "index_options": { <2>
          "dense_vector": {
            "type": "int4_flat"
          }
        },
        "chunking_settings": { <3>
          "type": "none"
        }
      }
    }
  }
}
```


  1. The `inference_id` of the {{infer}} endpoint to use for generating embeddings.
  2. Overrides default index options by specifying `int4_flat` quantization for dense vector embeddings.
  3. Disables automatic chunking by setting the chunking strategy to `none`.

::::{note}
{applies_to}serverless: ga {applies_to}stack: ga 9.1
Newly created indices with semantic_text fields using dense embeddings will be quantized to `bbq_hnsw` automatically as long as they have a minimum of 64 dimensions.

{applies_to}serverless: ga {applies_to}stack: ga 9.4
Newly created indices with semantic_text fields that use dense embeddings with the `float` element type will automatically use the `bfloat16` element type. This halves the storage required per vector dimension (2 bytes instead of 4) with a negligible impact on search relevance for most use cases. You can override this default by explicitly setting `element_type` in `index_options`.
::::

## {{infer-cap}} endpoints [configuring-inference-endpoints]

The semantic_text field type specifies an {{infer}} endpoint identifier (inference_id) that is used to generate embeddings.

The following {{infer}} endpoint configurations are available:

If you use a custom {{infer}} endpoint through your ML node and not through Elastic {{infer-cap}} Service (EIS), the recommended method is to use dedicated endpoints for ingestion and search.

{applies_to}stack: ga 9.1.0 If you use EIS, you don't have to set up dedicated endpoints.
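For illustration, a custom text-embedding endpoint running on your own ML nodes could be created with the Create {{infer}} API. The endpoint name is a placeholder; the model and settings shown are one possible configuration of the `elasticsearch` service:

```console
PUT _inference/text_embedding/my-text-embedding-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "model_id": ".multilingual-e5-small",
    "num_allocations": 1,
    "num_threads": 1
  }
}
```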

::::{warning}
Removing an {{infer}} endpoint will cause ingestion of documents and semantic queries to fail on indices that define semantic_text fields with that {{infer}} endpoint as their `inference_id`. Trying to delete an {{infer}} endpoint that is used on a semantic_text field will result in an error.
::::

## Chunking [chunking-behavior]

{{infer-cap}} endpoints have a limit on the amount of text they can process. To allow large amounts of text to be used in semantic search, semantic_text automatically splits the input into smaller passages when needed, called chunks.

Each chunk refers to a passage of the text and the corresponding embedding generated from it. When querying, the individual passages will be automatically searched for each document, and the most relevant passage will be used to compute a score.

Chunks are stored as start and end character offsets rather than as separate text strings. These offsets point to the exact location of each chunk within the original input text.

You can pre-chunk content by providing text as arrays before indexing.
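For example, assuming the field is mapped with the `none` chunking strategy, each element of the array below would be stored as its own chunk with its own embedding. The index, field name, and text are illustrative:

```console
PUT my-index-000004/_doc/1
{
  "inference_field": [
    "Elasticsearch is a distributed search and analytics engine.",
    "It is built on top of Apache Lucene."
  ]
}
```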

Refer to the {{infer-cap}} API documentation for values for chunking_settings and to Configuring chunking to learn about different chunking strategies.
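As a sketch of overriding the endpoint's chunking behavior in the mapping, a word-based strategy might be configured as follows. The parameter values are illustrative; refer to Configuring chunking for the exact parameters each strategy accepts:

```console
PUT my-index-000006
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "chunking_settings": {
          "type": "word",
          "max_chunk_size": 250,
          "overlap": 100
        }
      }
    }
  }
}
```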

## Pre-filtering for dense vector queries

{applies_to}
stack: ga 9.3
serverless: ga

When you query semantic_text fields with dense vector embeddings, {{es}} automatically applies filters from Query DSL or ES|QL queries as pre-filters to the vector search. The vector search then finds the most semantically relevant results within the filtered set of documents, ensuring that the number of requested documents is returned.

The following examples in Query DSL and ES|QL syntax demonstrate finding the 10 most relevant documents matching "quick drying t-shirts" while filtering to only green items.

### Query DSL example

In Query DSL, must, filter, and must_not queries within the parent bool query are used as pre-filters for semantic_text queries. The term query below will be applied as a pre-filter to the knn search on dense_semantic_text_field.

```console
POST my-index/_search
{
  "size" : 10,
  "query" : {
    "bool" : {
      "must" : {
        "match": { <1>
          "dense_semantic_text_field": {
            "query": "quick drying t-shirts"
          }
        }
      },
      "filter" : {
        "term" : {
          "color": {
            "value": "green"
          }
        }
      }
    }
  }
}
```


  1. The match query automatically performs a kNN search on semantic_text fields with dense vector embeddings.

::::{important}
When you query a semantic_text field directly with a kNN query in Query DSL, automatic pre-filtering does not apply. The kNN query provides a direct parameter for defining pre-filters as explained in Pre-filters and post-filters.
::::

### ES|QL example

The WHERE color == "green" clause will be applied as a pre-filter to the kNN search on dense_semantic_text_field.

```console
POST /_query
{
  "query": """
          FROM my-index METADATA _score
          | WHERE MATCH(dense_semantic_text_field, "quick drying t-shirts") <1>
          | WHERE color == "green"
          | SORT _score DESC
          | LIMIT 10
   """
}
```


  1. The {{esql}} MATCH function automatically performs a kNN search on semantic_text fields with dense vector embeddings.

## Limitations [limitations]

semantic_text field types have the following limitations:

### Document count discrepancy in _cat/indices [document-count-discrepancy]

When an index contains a semantic_text field, the docs.count value returned by the _cat/indices API may be higher than the number of documents you indexed. This occurs because semantic_text stores embeddings in nested documents, one per chunk. The _cat/indices API counts all documents in the Lucene index, including these hidden nested documents.

To count only top-level documents, excluding the nested documents that store embeddings, use one of the following APIs:

  • `GET /<index>/_count`
  • `GET _cat/count/<index>`
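For example, for an index named my-index (illustrative), the following request returns the top-level document count in the `count` field of the response, excluding the hidden nested documents that store embeddings:

```console
GET my-index/_count
```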