doc/MCP/FTS_Implementation_Plan.md
This document describes the current implementation of Full Text Search (FTS) capabilities in ProxySQL MCP. The FTS system enables AI agents to quickly search indexed database metadata and LLM-generated artifacts using SQLite's FTS5 extension.
Status: IMPLEMENTED ✅
MCP Query Endpoint
↓
Query_Tool_Handler (routes tool calls)
↓
Discovery_Schema (manages FTS database)
↓
SQLite FTS5 (mcp_catalog.db)
Integrated with Discovery Schema: FTS functionality is built into the existing mcp_catalog.db database.
FTS Tables:
fts_objects - FTS5 index over database objects (contentless)fts_llm - FTS5 index over LLM-generated artifacts (with content)Search indexed data using FTS5 across both database objects and LLM artifacts.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | FTS5 search query |
| include_objects | boolean | No | Include detailed object information (default: false) |
| object_limit | integer | No | Max objects to return when include_objects=true (default: 50) |
Response:
{
"success": true,
"query": "customer order",
"results": [
{
"kind": "table",
"key": "sales.orders",
"schema_name": "sales",
"object_name": "orders",
"content": "orders table with columns: order_id, customer_id, order_date, total_amount",
"rank": 0.5
}
]
}
Implementation Logic:
fts_objects and fts_llm tables using FTS5Search LLM-generated content and insights using FTS5.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | FTS5 search query |
| type | string | No | Content type to search ("summary", "relationship", "domain", "metric", "note") |
| schema | string | No | Filter by schema |
| limit | integer | No | Maximum results (default: 10) |
Response:
{
"success": true,
"query": "customer segmentation",
"results": [
{
"kind": "domain",
"key": "customer_segmentation",
"content": "Customer segmentation based on purchase behavior and demographics",
"rank": 0.8
}
]
}
Implementation Logic:
fts_llm table using FTS5Search indexed data using FTS5 across both database objects and LLM artifacts with detailed object information.
Parameters:
| Name | Type | Required | Description |
|---|---|---|---|
| query | string | Yes | FTS5 search query |
| include_objects | boolean | No | Include detailed object information (default: false) |
| object_limit | integer | No | Max objects to return when include_objects=true (default: 50) |
Response:
{
"success": true,
"query": "customer order",
"results": [
{
"kind": "table",
"key": "sales.orders",
"schema_name": "sales",
"object_name": "orders",
"content": "orders table with columns: order_id, customer_id, order_date, total_amount",
"rank": 0.5,
"details": {
"object_id": 123,
"object_type": "table",
"schema_name": "sales",
"object_name": "orders",
"row_count_estimate": 15000,
"has_primary_key": true,
"has_foreign_keys": true,
"has_time_column": true,
"columns": [
{
"column_name": "order_id",
"data_type": "int",
"is_nullable": false,
"is_primary_key": true
}
]
}
}
]
}
Implementation Logic:
fts_objects and fts_llm tables using FTS5objects, columns, indexes, foreign_keys tablesCREATE VIRTUAL TABLE fts_objects USING fts5(
schema_name,
object_name,
object_type,
content,
content='',
content_rowid='object_id'
);
CREATE VIRTUAL TABLE fts_llm USING fts5(
kind,
key,
content
);
Step 1: Integrate FTS into Discovery_Schema
lib/Discovery_Schema.cppmcp_catalog.db databaseStep 2: Create FTS tables
fts_objects for database objects (contentless)fts_llm for LLM artifacts (with content)Step 3: Implement automatic indexing
Step 4: Implement search tools
catalog_search tool in Query_Tool_Handlerllm.search tool in Query_Tool_HandlerStep 5: Register tools
include/Discovery_Schema.h - Added FTS methodslib/Discovery_Schema.cpp - Implemented FTS functionalitylib/Query_Tool_Handler.cpp - Added FTS tool routinginclude/Query_Tool_Handler.h - Added FTS tool declarationsclass Discovery_Schema {
private:
// FTS methods
int create_fts_tables();
int rebuild_fts_index(int run_id);
json search_fts(const std::string& query, bool include_objects = false, int object_limit = 50);
json search_llm_fts(const std::string& query, const std::string& type = "",
const std::string& schema = "", int limit = 10);
public:
// FTS is automatically maintained during:
// - Object insertion (static harvest)
// - LLM artifact upsertion
// - Catalog rebuild operations
};
json result;
result["success"] = false;
result["error"] = "Descriptive error message";
return result;
// Logging
proxy_error("FTS error: %s\n", error_msg);
proxy_info("FTS search completed: %zu results\n", result_count);
db->wrlock();
// Write operations (indexing)
db->wrunlock();
db->rdlock();
// Read operations (search)
db->rdunlock();
// Prepared statements
sqlite3_stmt* stmt = NULL;
db->prepare_v2(sql, &stmt);
(*proxy_sqlite3_bind_text)(stmt, 1, value.c_str(), -1, SQLITE_TRANSIENT);
SAFE_SQLITE3_STEP2(stmt);
(*proxy_sqlite3_finalize)(stmt);
# Agent searches for relevant objects
search_results = call_tool("catalog_search", {
"query": "customer orders with high value",
"include_objects": True,
"object_limit": 20
})
# Agent searches for LLM insights
llm_results = call_tool("llm.search", {
"query": "customer segmentation",
"type": "domain"
})
# Agent uses results to build understanding
for result in search_results["results"]:
if result["kind"] == "table":
# Get detailed table information
table_details = call_tool("catalog_get_object", {
"schema": result["schema_name"],
"object": result["object_name"]
})
fts_objects uses contentless indexing for performance