Completion

docs/reference/query-languages/esql/_snippets/commands/layout/completion.md

yaml
serverless: ga
stack: preview 9.1.0, ga 9.3.0

The COMPLETION command allows you to send prompts and context to a Large Language Model (LLM) directly within your ES|QL queries, to perform text generation tasks.

:::::{important}
Every row processed by the COMPLETION command generates a separate API call to the LLM endpoint.

::::{applies-switch}

:::{applies-item} stack: ga 9.3+

By default, COMPLETION limits processing to 100 rows to prevent accidental high consumption and costs. This limit is applied before the COMPLETION command executes.

If you need to process more rows, you can adjust the limit using the cluster setting:

PUT _cluster/settings
{
  "persistent": {
    "esql.command.completion.limit": 500
  }
}

You can also disable the command entirely if needed:

PUT _cluster/settings
{
  "persistent": {
    "esql.command.completion.enabled": false
  }
}

:::

:::{applies-item} stack: ga 9.1-9.2

Test with small datasets before running on production data or in automated workflows, to avoid unexpected costs.

Best practices:

  1. Start with dry runs: Validate your query logic and row counts by running without COMPLETION initially. Use | STATS count = COUNT(*) to check result size.
  2. Filter first: Use WHERE clauses to limit rows before applying COMPLETION.
  3. Test with LIMIT: Always start with a low LIMIT and gradually increase.
  4. Monitor usage: Track your LLM API consumption and costs.

:::

::::

:::::
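
The best practices above can be combined into a two-step workflow. The following is a minimal sketch, not a prescribed procedure: it assumes the movies index used in the examples on this page, an illustrative rating threshold, and the 9.2+ WITH syntax (on 9.1, pass the endpoint ID directly after WITH). The two queries are run separately.

```esql
// Step 1 (dry run): count the rows that would reach COMPLETION.
// No LLM calls are made here.
FROM movies
| WHERE rating >= 8.5
| STATS count = COUNT(*)

// Step 2 (separate query): same filter, capped with a low LIMIT
// so that at most 5 LLM calls are made.
FROM movies
| WHERE rating >= 8.5
| LIMIT 5
| EVAL prompt = CONCAT("Summarize this synopsis in one sentence: ", synopsis)
| COMPLETION summary = prompt WITH { "inference_id" : "my_inference_endpoint" }
| KEEP title, summary
```

Once the output looks right, the LIMIT can be raised gradually.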

Syntax

::::{applies-switch}

:::{applies-item} stack: ga 9.2+

esql
COMPLETION [column =] prompt WITH { "inference_id" : "my_inference_endpoint" }

:::

:::{applies-item} stack: ga =9.1

esql
COMPLETION [column =] prompt WITH my_inference_endpoint

:::

::::

Parameters

column : (Optional) The name of the output column containing the LLM's response. If not specified, the results will be stored in a column named completion. If the specified column already exists, it will be overwritten with the new results.

prompt : The input text or expression used to prompt the LLM. This can be a string literal or a reference to a column containing text.

my_inference_endpoint : The ID of the inference endpoint to use for the task. The inference endpoint must be configured with the completion task type.
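
To illustrate the overwrite behavior of column, here is a small sketch in which the output column already exists before COMPLETION runs. The placeholder value and the column names are assumptions made for illustration.

```esql
// "answer" exists with a placeholder value before COMPLETION runs.
ROW question = "What is Elasticsearch?", answer = "placeholder"
// COMPLETION writes the LLM response to "answer", replacing the placeholder.
| COMPLETION answer = question WITH { "inference_id" : "my_inference_endpoint" }
| KEEP question, answer
```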

Description

The COMPLETION command provides a general-purpose interface for text generation tasks using a Large Language Model (LLM) in ES|QL.

Depending on your prompt and the model you use, COMPLETION can handle tasks such as:

  • Question answering
  • Summarization
  • Translation
  • Content rewriting
  • Creative generation
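
The task is determined by the prompt, not by the command. As a sketch of one task from the list above (translation), assuming the movies index and its synopsis field from the examples on this page, and a model that can translate into the target language:

```esql
FROM movies
| LIMIT 5
// The instruction at the start of the prompt selects the task.
| EVAL prompt = CONCAT("Translate the following text to French: ", synopsis)
| COMPLETION translation = prompt WITH { "inference_id" : "my_inference_endpoint" }
| KEEP title, translation
```

Changing only the instruction string (for example, to a summarization or rewriting instruction) switches the task without changing the rest of the query.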

Requirements

To use this command, you must deploy your LLM in Elasticsearch as an inference endpoint with the task type completion.

Handling timeouts

COMPLETION commands may time out when processing large datasets or complex prompts. The default timeout is 10 minutes, but you can increase this limit if necessary.

How you increase the timeout depends on your deployment type:

::::{applies-switch}

:::{applies-item} ess:

:::{applies-item} self:

  • Configure the timeout at the cluster level by setting search.default_search_timeout in elasticsearch.yml or by updating it through the Cluster Settings API.
  • Adjust the search:timeout setting in Kibana's Advanced settings.
  • Alternatively, add timeout parameters to individual queries.

:::

:::{applies-item} serverless:

  • Requires a manual override from Elastic Support, because you cannot modify timeout settings directly.

:::

::::

If you don't want to increase the timeout limit, try the following:

  • Reduce data volume with LIMIT or more selective filters before the COMPLETION command
  • Split complex operations into multiple simpler queries
  • Configure your HTTP client's response timeout (refer to HTTP client configuration)
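
One way to split the work is to run the same pipeline over disjoint slices of the data. The sketch below assumes the movies index from the examples on this page; the rating bands and LIMIT values are illustrative, and each query is submitted on its own.

```esql
// Batch 1: highest-rated movies only.
FROM movies
| WHERE rating >= 9.0
| LIMIT 20
| EVAL prompt = CONCAT("Summarize this synopsis in one sentence: ", synopsis)
| COMPLETION summary = prompt WITH { "inference_id" : "my_inference_endpoint" }
| KEEP title, summary

// Batch 2 (separate query): the next rating band, no overlap with batch 1.
FROM movies
| WHERE rating >= 8.5 AND rating < 9.0
| LIMIT 20
| EVAL prompt = CONCAT("Summarize this synopsis in one sentence: ", synopsis)
| COMPLETION summary = prompt WITH { "inference_id" : "my_inference_endpoint" }
| KEEP title, summary
```

Each batch makes fewer LLM calls, so it is less likely to reach the timeout than a single query over the whole range.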

Examples

The following examples show common COMPLETION patterns.

Use the default output column name

If no column name is specified, the response is stored in completion:

esql
ROW question = "What is Elasticsearch?"
| COMPLETION question WITH { "inference_id" : "my_inference_endpoint" }
| KEEP question, completion

| question:keyword | completion:keyword |
| --- | --- |
| What is Elasticsearch? | A distributed search and analytics engine |

Specify the output column name

Use column = to assign the response to a named column:

esql
ROW question = "What is Elasticsearch?"
| COMPLETION answer = question WITH { "inference_id" : "my_inference_endpoint" }
| KEEP question, answer

| question:keyword | answer:keyword |
| --- | --- |
| What is Elasticsearch? | A distributed search and analytics engine |

Summarize documents with a prompt

Use CONCAT to build a prompt from field values before calling COMPLETION:

esql
FROM movies
| SORT rating DESC
| LIMIT 10
| EVAL prompt = CONCAT(
   "Summarize this movie using the following information: \n",
   "Title: ", title, "\n",
   "Synopsis: ", synopsis, "\n",
   "Actors: ", MV_CONCAT(actors, ", "), "\n"
  )
| COMPLETION summary = prompt WITH { "inference_id" : "my_inference_endpoint" }
| KEEP title, summary, rating

| title:keyword | summary:keyword | rating:double |
| --- | --- | --- |
| The Shawshank Redemption | A tale of hope and redemption in prison. | 9.3 |
| The Godfather | A mafia family's rise and fall. | 9.2 |
| The Dark Knight | Batman battles the Joker in Gotham. | 9.0 |
| Pulp Fiction | Interconnected crime stories with dark humor. | 8.9 |
| Fight Club | A man starts an underground fight club. | 8.8 |
| Inception | A thief steals secrets through dreams. | 8.8 |
| The Matrix | A hacker discovers reality is a simulation. | 8.7 |
| Parasite | Class conflict between two families. | 8.6 |
| Interstellar | A team explores space to save humanity. | 8.6 |
| The Prestige | Rival magicians engage in dangerous competition. | 8.5 |