manual/english/Searching/Conversational_search.md
Conversational search lets Manticore Buddy answer questions over an existing vectorized table. Buddy retrieves the most relevant rows with KNN search, turns those rows into context, and sends the context plus the conversation history to an LLM.
It is managed from SQL with:

- CREATE CHAT MODEL
- SHOW CHAT MODELS
- DESCRIBE CHAT MODEL
- DROP CHAT MODEL
- CALL CHAT

You need a vectorized table and an LLM provider. The table requirements are covered below. Provider credentials can be set in CREATE CHAT MODEL with api_key, or supplied through the matching environment variable, such as OPENAI_API_KEY.
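When CALL CHAT runs, Buddy builds a retrieval-augmented answer in this order:

1. Resolves the vector field: the fifth argument if it is given, otherwise the first FLOAT_VECTOR field.
2. Runs KNN search on that field to retrieve the most relevant rows.
3. Builds the LLM context from the rows' from='...' source fields.
4. Sends the context plus the conversation history to the LLM.

The fifth argument of CALL CHAT is called fields internally, but for conversational search it names the vector field used by knn(...). It is not a list of fields to return. Buddy selects rows with SELECT *, then removes vector columns from the sources payload so the response does not include large embedding values.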
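The removal of vector columns from the sources payload can be pictured with a short Python sketch. This is only an illustration of the idea, not Buddy's actual code: the function name build_sources and the sample row are hypothetical.

```python
from typing import Any


def build_sources(rows: list[dict[str, Any]], vector_fields: set[str]) -> list[dict[str, Any]]:
    """Return copies of the rows with the vector columns left out.

    Hypothetical helper: it mirrors the documented behavior (SELECT * first,
    then drop embeddings) but is not taken from Buddy's source.
    """
    return [
        {name: value for name, value in row.items() if name not in vector_fields}
        for row in rows
    ]


# A row as it might come back from a KNN query, embedding included.
rows = [{"id": 1, "title": "Vector search", "embedding": [0.1, 0.2], "knn_dist": 0.12}]
sources = build_sources(rows, {"embedding"})
print(sources)
```

The original rows are left untouched, so the embedding stays available to any code that still needs it.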
The table must have at least one FLOAT_VECTOR field configured for auto embeddings. The vector field must include from='...', because Buddy uses those source fields as LLM context.
The examples below use onnx-models/all-MiniLM-L12-v2-onnx, which runs through the recommended ONNX path and does not require an embedding API key.
CREATE TABLE docs (
id BIGINT,
title TEXT,
content TEXT,
embedding FLOAT_VECTOR
knn_type='hnsw'
hnsw_similarity='cosine'
model_name='onnx-models/all-MiniLM-L12-v2-onnx'
from='title,content'
) TYPE='rt';
INSERT INTO docs(id, title, content) VALUES
(1, 'Vector search', 'Vector search compares embeddings to find semantically similar documents.'),
(2, 'Full-text search', 'Full-text search matches terms and phrases in indexed text.');
If CALL CHAT does not specify a vector field, Buddy uses the first FLOAT_VECTOR field found in the table definition.
Use CREATE CHAT MODEL to store the LLM provider, model id, and retrieval settings.
CREATE CHAT MODEL assistant (
model='openai:gpt-4o-mini'
);
You can also set provider options and retrieval limits:
<!-- example conversational_search_create_model_extended -->
<!-- intro -->
CREATE CHAT MODEL support_assistant (
model='openai:gpt-4o-mini',
api_key='your-provider-api-key',
base_url='http://host.docker.internal:8787/v1',
timeout=60,
retrieval_limit=5,
max_document_length=3000
);
Common options:
| Option | Required | Description |
|---|---|---|
| model | Yes | LLM model id in provider:model format. |
| description | No | Stored description. |
| api_key | No | Provider API key passed to the llm extension. |
| base_url | No | Provider or proxy base URL. |
| timeout | No | LLM request timeout, 1..65536. |
| retrieval_limit | No | Number of documents requested from KNN, 1..50; default is 5. |
| max_document_length | No | Per-document context limit. 0 disables truncation; 100..65536 truncates; default is 2000. |
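The ranges and defaults in the table can be checked on the client before a CREATE CHAT MODEL statement is sent. The Python sketch below is an assumption-laden illustration: validate_chat_model_options is a hypothetical helper, and the server remains the authority on what it accepts.

```python
def validate_chat_model_options(options: dict) -> dict:
    """Apply the documented defaults and range checks to a dict of options.

    Hypothetical client-side helper; only the ranges and defaults come from
    the options table, everything else is illustrative.
    """
    if "model" not in options:
        raise ValueError("model is required")

    # Defaults stated in the table: retrieval_limit=5, max_document_length=2000.
    resolved = {"retrieval_limit": 5, "max_document_length": 2000, **options}

    if not 1 <= resolved["retrieval_limit"] <= 50:
        raise ValueError("retrieval_limit must be in 1..50")

    length = resolved["max_document_length"]
    if length != 0 and not 100 <= length <= 65536:
        raise ValueError("max_document_length must be 0 or in 100..65536")

    if "timeout" in resolved and not 1 <= resolved["timeout"] <= 65536:
        raise ValueError("timeout must be in 1..65536")

    return resolved


print(validate_chat_model_options({"model": "openai:gpt-4o-mini", "timeout": 60}))
```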
Chat model names may contain only letters, numbers, and underscores.
The model option must use provider:model format:
model='openai:gpt-4o-mini'
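Both naming rules can be expressed in a few lines of Python. This is a sketch under stated assumptions: "letters" is taken to mean ASCII letters, the split happens at the first colon so the model id may itself contain colons, and both helper names are hypothetical.

```python
import re

# Assumption: "letters, numbers, and underscores" means ASCII only.
_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def is_valid_chat_model_name(name: str) -> bool:
    """Check a chat model name against the documented character set."""
    return _NAME_RE.fullmatch(name) is not None


def split_model_id(model: str) -> tuple[str, str]:
    """Split 'provider:model' at the first colon and reject empty parts."""
    provider, separator, model_id = model.partition(":")
    if not separator or not provider or not model_id:
        raise ValueError("model must use provider:model format")
    return provider, model_id


print(is_valid_chat_model_name("support_assistant"))
print(split_model_id("openai:gpt-4o-mini"))
```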
Provider api_key is optional if the provider key is already available in Buddy's environment. For example, a Docker Compose service can pass provider keys like this:
environment:
- OPENAI_API_KEY=${OPENAI_API_KEY}
- OPENROUTER_API_KEY=${OPENROUTER_API_KEY}
When api_key is omitted from CREATE CHAT MODEL, the llm extension can fall back to the matching provider environment variable. Set api_key in the chat model only when this model needs a different key from the one in the environment.
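The fallback between an explicit key and the environment can be sketched as below. The variable naming pattern (provider name in upper case followed by _API_KEY) is an assumption generalized from the two examples OPENAI_API_KEY and OPENROUTER_API_KEY; the llm extension may resolve keys differently, and resolve_api_key is a hypothetical name.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    model: str,
    api_key: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Prefer the key stored on the chat model, else look in the environment.

    Assumption: the environment variable is <PROVIDER>_API_KEY, where the
    provider is the part of the model id before the first colon.
    """
    if api_key:
        return api_key
    env = os.environ if env is None else env
    provider = model.partition(":")[0]
    return env.get(f"{provider.upper()}_API_KEY")


# An explicit key wins over the environment value.
fake_env = {"OPENAI_API_KEY": "env-key"}
print(resolve_api_key("openai:gpt-4o-mini", env=fake_env))
print(resolve_api_key("openai:gpt-4o-mini", api_key="model-key", env=fake_env))
```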
CALL CHAT(
'query',
'table',
'model_name',
'conversation_uuid',
'vector_field'
);
Arguments are positional only:
| Position | Argument | Required | Description |
|---|---|---|---|
| 1 | query | Yes | User question. |
| 2 | table | Yes | Table to search. |
| 3 | model_name | Yes | Chat model name. |
| 4 | conversation_uuid | No | Existing conversation id, or an empty string. |
| 5 | fields / vector field | No | FLOAT_VECTOR field used in knn(...). |
The table argument must be a plain table identifier, optionally qualified as database.table. The vector field argument must be a plain field identifier.
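A client that assembles CALL CHAT statements can enforce the positional order and the identifier rules up front. The Python sketch below makes two assumptions that are not stated in this page: a "plain identifier" is taken to be ASCII letters, digits, and underscores not starting with a digit, and single quotes in string values are escaped with a backslash. build_call_chat is a hypothetical helper.

```python
import re
from typing import Optional

# Assumed shape of a plain identifier; table may be qualified as database.table.
_IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
_TABLE_RE = re.compile(rf"{_IDENT}(\.{_IDENT})?")
_FIELD_RE = re.compile(_IDENT)


def quote(value: str) -> str:
    """Wrap a value in single quotes (backslash escaping is an assumption)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def build_call_chat(
    query: str,
    table: str,
    model_name: str,
    conversation_uuid: str = "",
    vector_field: Optional[str] = None,
) -> str:
    """Build a CALL CHAT statement with arguments in positional order."""
    if not _TABLE_RE.fullmatch(table):
        raise ValueError("table must be a plain identifier or database.table")
    if vector_field is not None and not _FIELD_RE.fullmatch(vector_field):
        raise ValueError("vector field must be a plain identifier")

    args = [query, table, model_name]
    # Position 4 must be present (possibly empty) whenever position 5 is used.
    if conversation_uuid or vector_field is not None:
        args.append(conversation_uuid)
    if vector_field is not None:
        args.append(vector_field)
    return "CALL CHAT(" + ", ".join(quote(arg) for arg in args) + ");"


print(build_call_chat("What is vector search?", "docs", "assistant"))
print(build_call_chat("Find title matches", "docs", "assistant", vector_field="title_embedding"))
```

Because the arguments are positional, the helper fills the fourth position with an empty string whenever only the vector field is supplied.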
Use CALL CHAT with a query, a table, and a chat model.
CALL CHAT(
'What is vector search?',
'docs',
'assistant'
);
To continue a conversation, pass the same conversation UUID:
<!-- example conversational_search_continue_chat -->
<!-- intro -->
CALL CHAT(
'Can you explain it with an example?',
'docs',
'assistant',
'docs-chat-001'
);
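To search a specific vector field, pass it as the fifth argument. The example below assumes that docs has an additional FLOAT_VECTOR field named title_embedding; the table created earlier defines only embedding.
<!-- example conversational_search_vector_field -->
<!-- intro -->
CALL CHAT(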
'Find documents where the title is about vector search',
'docs',
'assistant',
'',
'title_embedding'
);
When the fifth argument is present, Buddy checks that the field exists and is a FLOAT_VECTOR. If the argument is omitted, Buddy detects the first FLOAT_VECTOR field from SHOW CREATE TABLE.
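Detecting the first FLOAT_VECTOR field from a table definition can be approximated with a regular expression. The Python sketch below is a rough illustration and not Buddy's parser: it assumes each field appears as a name followed by its type, optionally wrapped in backticks, and first_float_vector_field is a hypothetical name.

```python
import re
from typing import Optional

# Matches "<name> float_vector", with optional backticks around the name.
_VECTOR_FIELD_RE = re.compile(r"`?(\w+)`?\s+float_vector\b", re.IGNORECASE)


def first_float_vector_field(create_table_sql: str) -> Optional[str]:
    """Return the first field declared as FLOAT_VECTOR, or None if there is none."""
    match = _VECTOR_FIELD_RE.search(create_table_sql)
    return match.group(1) if match else None


ddl = """CREATE TABLE docs (
id bigint,
title text,
content text,
embedding float_vector knn_type='hnsw' hnsw_similarity='cosine'
)"""
print(first_float_vector_field(ddl))
```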
When Buddy needs retrieval, it runs KNN search on the selected vector field and returns up to retrieval_limit rows. The default distance threshold is 0.8.
Buddy uses the retrieved rows as LLM context. The same rows are returned in sources, with knn_dist included and FLOAT_VECTOR columns removed.
max_document_length limits how much text from each source row can be sent to the LLM. Use 0 to disable truncation; otherwise use a value from 100 to 65536.
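The effect of max_document_length can be sketched in Python as follows. The unit of the limit is an assumption here: the sketch counts characters, while the real implementation may count bytes or tokens. truncate_document is a hypothetical name.

```python
def truncate_document(text: str, max_document_length: int = 2000) -> str:
    """Cut a source document to the configured limit; 0 means no truncation.

    Assumption: the limit is measured in characters.
    """
    if max_document_length == 0:
        return text
    return text[:max_document_length]


document = "a" * 3000
print(len(truncate_document(document)))
print(len(truncate_document(document, 0)))
```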
CALL CHAT returns one row:
| Column | Description |
|---|---|
| conversation_uuid | Existing or generated conversation id. |
| user_query | Original user query. |
| search_query | Standalone search query used for retrieval. |
| response | LLM answer. |
| sources | JSON string containing retrieved source rows. |
Example response shape:
{
"conversation_uuid": "docs-chat-001",
"user_query": "What is vector search?",
"search_query": "vector search, embeddings, similarity search",
"response": "Vector search finds similar items by comparing embeddings...",
"sources": "[{\"id\":1,\"title\":\"Vector search\",\"content\":\"...\",\"knn_dist\":0.12}]"
}
Vector fields are not included in sources.
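Because sources arrives as a JSON string rather than a nested structure, a client has to decode it separately. The Python sketch below uses a hand-written row shaped like the example response; how the row is fetched from the server is left out and depends on the client library in use.

```python
import json

# A result row shaped like the example response (values are illustrative).
row = {
    "conversation_uuid": "docs-chat-001",
    "user_query": "What is vector search?",
    "response": "Vector search finds similar items by comparing embeddings...",
    "sources": '[{"id":1,"title":"Vector search","content":"...","knn_dist":0.12}]',
}

# sources is a string, so it needs its own json.loads call.
sources = json.loads(row["sources"])
for source in sources:
    print(source["id"], source["title"], source["knn_dist"])
```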
List models:
<!-- example conversational_search_show_models -->
<!-- intro -->
SHOW CHAT MODELS;
Describe a model:
<!-- example conversational_search_describe_model -->
<!-- intro -->
DESCRIBE CHAT MODEL assistant;
Drop a model:
<!-- example conversational_search_drop_model -->
<!-- intro -->
DROP CHAT MODEL assistant;
Drop safely:
<!-- example conversational_search_drop_model_if_exists -->
<!-- intro -->
DROP CHAT MODEL IF EXISTS assistant;
SHOW CHAT MODELS returns name, model, and created_at. DESCRIBE CHAT MODEL returns property and value; stored API keys are shown as HIDDEN.
Dropping a chat model also drops that model's conversation history table. Conversation history is stored per model and written with a 30-day TTL.