Asymmetric Embedding Configuration

LightRAG keeps embedding behavior symmetric by default. Query/document asymmetric embedding is enabled only when EMBEDDING_ASYMMETRIC=true is explicitly set.

This avoids accidental retrieval changes when prefix variables are present in an environment but asymmetric embeddings were never intentionally enabled.

Before enabling asymmetric embeddings for any model, check the model's current model card or provider documentation. Do not infer the right behavior from the API binding alone: an openai-compatible endpoint can serve instruction-free models, prefix-based models, or provider-specific models behind the same API shape.

Reindexing Requirement

Changing asymmetric embedding settings changes the vectors produced for stored documents and for future queries. After enabling, disabling, or changing any of these settings, clear the existing LightRAG data for the workspace and re-index the source files:

EMBEDDING_ASYMMETRIC
EMBEDDING_QUERY_PREFIX
EMBEDDING_DOCUMENT_PREFIX
Provider task behavior such as Jina task, Gemini task_type, or VoyageAI input_type

Do not reuse an existing vector store across asymmetric embedding configuration changes. Mixing vectors generated with different query/document behavior can make retrieval quality unpredictable.
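
One way to catch an accidental reuse of a vector store is to record a fingerprint of the embedding settings at index time and compare it before serving queries. The following is a minimal sketch, not part of LightRAG: `embedding_fingerprint` and `needs_reindex` are hypothetical helpers, and the choice of which variables to hash is an assumption based on the settings listed above.

```python
import hashlib
import json

# Assumption: these variables together determine the vectors produced.
# Provider task behavior follows from the binding and model, so both are included.
FINGERPRINT_KEYS = (
    "EMBEDDING_BINDING",
    "EMBEDDING_MODEL",
    "EMBEDDING_ASYMMETRIC",
    "EMBEDDING_QUERY_PREFIX",
    "EMBEDDING_DOCUMENT_PREFIX",
)


def embedding_fingerprint(env):
    """Return a stable SHA-256 hash of the embedding-related settings in `env`."""
    relevant = {key: env.get(key) for key in FINGERPRINT_KEYS}
    payload = json.dumps(relevant, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_reindex(stored_fingerprint, env):
    """Return True when the current settings differ from those used at index time."""
    return stored_fingerprint != embedding_fingerprint(env)
```

Any mapping with a `get` method works as `env`, including `os.environ`. The check is deliberately conservative: it also reports a difference for changes that may not affect vectors, such as switching EMBEDDING_ASYMMETRIC from unset to false.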

Binding Types

LightRAG distinguishes two asymmetric embedding styles:

Style	Bindings	How asymmetric behavior is applied
Provider task parameters	`jina`, `gemini`, `voyageai`	LightRAG passes query/document context to the provider-specific `task`, `task_type`, or `input_type` parameter.
Text task prefixes	`openai`, `azure_openai`, `ollama`	LightRAG prepends configured text prefixes before calling the embedding API. Use this only when the model card explicitly requires separate query/document prefixes.

Other server embedding bindings do not currently support EMBEDDING_ASYMMETRIC=true.
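
The split between the two styles can be pictured as a simple lookup on the binding name. This is an illustrative sketch only: `asymmetric_style` and the two set names are hypothetical and do not mirror LightRAG's internal code; only the binding names come from the table above.

```python
# Binding names taken from the table above.
PROVIDER_TASK_BINDINGS = {"jina", "gemini", "voyageai"}
TEXT_PREFIX_BINDINGS = {"openai", "azure_openai", "ollama"}


def asymmetric_style(binding):
    """Return which asymmetric style applies to a binding, or raise if unsupported."""
    if binding in PROVIDER_TASK_BINDINGS:
        return "provider_task"
    if binding in TEXT_PREFIX_BINDINGS:
        return "text_prefix"
    raise ValueError(f"binding {binding!r} does not support asymmetric embeddings")
```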

Default: Symmetric Embeddings

When EMBEDDING_ASYMMETRIC is unset, LightRAG does not enable asymmetric embedding behavior, even if prefix variables exist:

env

# EMBEDDING_ASYMMETRIC is unset
EMBEDDING_QUERY_PREFIX="search_query: "
EMBEDDING_DOCUMENT_PREFIX="search_document: "

The prefixes are ignored and a warning is logged.

The same is true when the flag is explicitly false:

env

EMBEDDING_ASYMMETRIC=false

Instruction-Free Models: Keep Symmetric

Some embedding models are instruction-free, sometimes described as using implicit intent. They are trained to handle query/document matching from the raw text itself and do not require query/document prefixes or provider task parameters. For these models, do not set EMBEDDING_ASYMMETRIC=true; leave it unset or set it to false, and do not configure EMBEDDING_QUERY_PREFIX or EMBEDDING_DOCUMENT_PREFIX.

Common examples that should normally stay in symmetric mode:

Model family	Example model IDs	Notes
BGE-M3	`BAAI/bge-m3`	Use plain text input. Do not add `search_query:` / `search_document:` unless the specific serving wrapper's model card says otherwise.
OpenAI Text Embedding 3	`text-embedding-3-small`, `text-embedding-3-large`	The OpenAI embeddings API uses text input plus the model name; it does not expose a query/document task parameter.
Mistral Embed	`mistral-embed`	Use the provider's plain embedding input. Do not invent task prefixes.
Alibaba GTE base models	`gte-large`, `gte-large-zh`	Base GTE models use plain text for normal retrieval. This does not apply to newer `instruct` variants such as `gte-Qwen2-1.5B-instruct`; check that model card.
Jina Embeddings v2	`jina-embeddings-v2-base-en`, `jina-embeddings-v2-base-zh`	Jina v2 takes plain-text input. Jina v3/v4 are different and use the `task` parameter for retrieval tasks.

If a model is instruction-free, enabling LightRAG's asymmetric mode can make the input different from what the model was trained or documented to expect. That can reduce retrieval quality even though the server starts successfully.

Provider Task Parameter Bindings

Use this mode for providers that expose separate query/document embedding tasks. Do not configure prefix variables for these bindings.

Jina example:

env

EMBEDDING_BINDING=jina
EMBEDDING_ASYMMETRIC=true
EMBEDDING_MODEL=jina-embeddings-v4

Gemini example:

env

EMBEDDING_BINDING=gemini
EMBEDDING_ASYMMETRIC=true
EMBEDDING_MODEL=gemini-embedding-001

VoyageAI example:

env

EMBEDDING_BINDING=voyageai
EMBEDDING_ASYMMETRIC=true
EMBEDDING_MODEL=voyage-3

If EMBEDDING_QUERY_PREFIX or EMBEDDING_DOCUMENT_PREFIX is also configured for these bindings, LightRAG logs a warning and ignores the prefixes.
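
To make the idea of query/document context concrete, the sketch below maps each binding to the provider parameter it would set. The parameter names come from this document; the values are the retrieval values the providers document publicly at the time of writing and should be checked against current provider documentation. `provider_task_kwargs` is a hypothetical helper, and the values LightRAG actually sends may differ.

```python
# binding -> (parameter name, value per context)
# Assumption: values follow each provider's public API documentation.
TASK_PARAMETERS = {
    "jina": ("task", {"query": "retrieval.query", "document": "retrieval.passage"}),
    "gemini": ("task_type", {"query": "RETRIEVAL_QUERY", "document": "RETRIEVAL_DOCUMENT"}),
    "voyageai": ("input_type", {"query": "query", "document": "document"}),
}


def provider_task_kwargs(binding, context):
    """Return the extra keyword argument for a binding; context is 'query' or 'document'."""
    name, values = TASK_PARAMETERS[binding]
    return {name: values[context]}
```

For example, `provider_task_kwargs("voyageai", "query")` yields `{"input_type": "query"}`, which a caller would merge into the embedding request for a search query.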

Text Task Prefix Bindings

Use this mode for embedding models that expect task instructions in the input text, such as models whose card documents prefixes like search_query:, search_document:, query:, or passage:. Do not enable this mode just because the model is served through openai, azure_openai, or ollama.

Both prefix variables must be explicitly configured:

env

EMBEDDING_ASYMMETRIC=true
EMBEDDING_QUERY_PREFIX="search_query: "
EMBEDDING_DOCUMENT_PREFIX="search_document: "

If one side should intentionally have no prefix, use the sentinel NO_PREFIX:

env

EMBEDDING_ASYMMETRIC=true
EMBEDDING_QUERY_PREFIX="search_query: "
EMBEDDING_DOCUMENT_PREFIX=NO_PREFIX

NO_PREFIX is converted to an empty string internally. It is different from an unset variable: it means the side was reviewed and intentionally left without a prefix.
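
The effect of a configured prefix, including the sentinel, can be sketched in a few lines. `resolve_prefix` and `apply_prefix` are hypothetical names used for illustration; the actual LightRAG implementation may be structured differently.

```python
NO_PREFIX = "NO_PREFIX"


def resolve_prefix(raw):
    """Map the NO_PREFIX sentinel to an empty string; keep any other value as-is."""
    return "" if raw == NO_PREFIX else raw


def apply_prefix(texts, raw_prefix):
    """Prepend the resolved prefix to every text before it is sent for embedding."""
    prefix = resolve_prefix(raw_prefix)
    return [prefix + text for text in texts]
```

With the configuration above, a query such as "what is RAG?" would be embedded as "search_query: what is RAG?", while document chunks would be embedded unchanged.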

At least one side must have a non-empty prefix. This is invalid:

env

EMBEDDING_ASYMMETRIC=true
EMBEDDING_QUERY_PREFIX=NO_PREFIX
EMBEDDING_DOCUMENT_PREFIX=NO_PREFIX

Invalid Empty Prefixes

Do not use an empty environment value for an intentional empty prefix:

env

EMBEDDING_DOCUMENT_PREFIX=

Use NO_PREFIX instead. Empty values are rejected because shell, .env, and Docker Compose handling can make an empty string indistinguishable from accidentally missing configuration.

Validation Summary

Configuration	Result
`EMBEDDING_ASYMMETRIC` unset	Symmetric mode; prefixes ignored with a warning.
`EMBEDDING_ASYMMETRIC=false`	Symmetric mode; prefixes ignored with a warning.
Instruction-free model such as `BAAI/bge-m3`, `text-embedding-3-small`, `mistral-embed`, base GTE, or Jina v2	Keep symmetric mode; do not configure prefixes or provider tasks unless the model card says to.
`EMBEDDING_ASYMMETRIC=true` with `jina`/`gemini`/`voyageai`	Provider task mode; prefixes ignored with a warning.
`EMBEDDING_ASYMMETRIC=true` with `openai`/`azure_openai`/`ollama` and both prefix variables configured	Prefix mode.
Prefix mode with a missing prefix variable	Startup error; use a real prefix or `NO_PREFIX`.
Prefix mode with both sides `NO_PREFIX`	Startup error; no asymmetric behavior would occur.
Prefix variable set to an empty value	Startup error; use `NO_PREFIX`.

Any valid change from one asymmetric embedding configuration to another still requires clearing the workspace data and re-indexing the source files.
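
The rules in the validation summary can be condensed into a single function. This is a sketch under stated assumptions, not LightRAG's startup code: `embedding_mode` is a hypothetical name, the flag is assumed to be enabled only by the literal value true (case-insensitive), empty values are checked only once prefix mode is reached, and the warnings mentioned in the table are not modeled.

```python
PROVIDER_TASK_BINDINGS = {"jina", "gemini", "voyageai"}
TEXT_PREFIX_BINDINGS = {"openai", "azure_openai", "ollama"}
PREFIX_KEYS = ("EMBEDDING_QUERY_PREFIX", "EMBEDDING_DOCUMENT_PREFIX")


def embedding_mode(env, binding):
    """Return 'symmetric', 'provider_task', or 'prefix'; raise ValueError if invalid."""
    # Assumption: only the literal value "true" enables asymmetric mode.
    flag = (env.get("EMBEDDING_ASYMMETRIC") or "").strip().lower()
    if flag != "true":
        return "symmetric"  # any prefix variables are ignored
    if binding in PROVIDER_TASK_BINDINGS:
        return "provider_task"  # any prefix variables are ignored
    if binding not in TEXT_PREFIX_BINDINGS:
        raise ValueError(f"binding {binding!r} does not support asymmetric embeddings")

    resolved = []
    for key in PREFIX_KEYS:
        raw = env.get(key)
        if raw is None:
            raise ValueError(f"{key} is missing; set a real prefix or NO_PREFIX")
        if raw == "":
            raise ValueError(f"{key} is empty; use NO_PREFIX instead")
        resolved.append("" if raw == "NO_PREFIX" else raw)

    if not any(resolved):
        raise ValueError("both prefixes are NO_PREFIX; no asymmetric behavior would occur")
    return "prefix"
```

Calling `embedding_mode(os.environ, "openai")` before starting the server would surface the same classes of error the table lists, which can be useful in a deployment check.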