doc/LLM_Bridge/README.md
LLM Bridge is a ProxySQL feature that provides generic access to Large Language Models (LLMs) through the MySQL protocol. It allows you to send any prompt to an LLM and receive the response as a MySQL resultset.
Note: This feature was previously called "NL2SQL" (Natural Language to SQL) but has been converted to a generic LLM bridge. Future NL2SQL functionality will be implemented as a Web UI using external agents (Claude Code + MCP server).
Supported Endpoints:
/v1/chat/completions endpoint)-- Via admin interface
SET genai-llm_enabled='true';
LOAD GENAI VARIABLES TO RUNTIME;
ProxySQL uses a generic provider configuration that supports any OpenAI-compatible or Anthropic-compatible endpoint.
Using Ollama (default):
Ollama is used via its OpenAI-compatible endpoint:
SET genai-llm_provider='openai';
SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
SET genai-llm_provider_model='llama3.2';
SET genai-llm_provider_key=''; -- Empty for local Ollama
LOAD GENAI VARIABLES TO RUNTIME;
Using OpenAI:
SET genai-llm_provider='openai';
SET genai-llm_provider_url='https://api.openai.com/v1/chat/completions';
SET genai-llm_provider_model='gpt-4';
SET genai-llm_provider_key='sk-...'; -- Your OpenAI API key
LOAD GENAI VARIABLES TO RUNTIME;
Using Anthropic:
SET genai-llm_provider='anthropic';
SET genai-llm_provider_url='https://api.anthropic.com/v1/messages';
SET genai-llm_provider_model='claude-3-opus-20240229';
SET genai-llm_provider_key='sk-ant-...'; -- Your Anthropic API key
LOAD GENAI VARIABLES TO RUNTIME;
Using any OpenAI-compatible endpoint:
This works with any OpenAI-compatible API (vLLM, LM Studio, Z.ai, etc.):
SET genai-llm_provider='openai';
SET genai-llm_provider_url='https://your-endpoint.com/v1/chat/completions';
SET genai-llm_provider_model='your-model-name';
SET genai-llm_provider_key='your-api-key'; -- Empty for local endpoints
LOAD GENAI VARIABLES TO RUNTIME;
Once configured, you can send prompts using the /* LLM: */ prefix:
-- Summarize text
mysql> /* LLM: */ Summarize the customer feedback from last week
-- Explain SQL queries
mysql> /* LLM: */ Explain this query: SELECT COUNT(*) FROM users WHERE active = 1
-- Generate code
mysql> /* LLM: */ Generate a Python function to validate email addresses
-- Translate text
mysql> /* LLM: */ Translate "Hello world" to Spanish
-- Analyze data
mysql> /* LLM: */ Analyze the following sales data and provide insights
Important: LLM queries are executed in the MySQL module (your regular SQL client), not in the ProxySQL Admin interface. The Admin interface is only for configuration.
The LLM Bridge returns a resultset with the following columns:
| Column | Description |
|---|---|
text_response | The LLM's text response |
explanation | Which model/provider generated the response |
cached | Whether the response was from cache (true/false) |
provider | The provider used (openai/anthropic) |
| Variable | Default | Description |
|---|---|---|
genai-llm_enabled | false | Master enable for LLM bridge |
genai-llm_provider | openai | Provider type (openai/anthropic) |
genai-llm_provider_url | http://localhost:11434/v1/chat/completions | LLM endpoint URL |
genai-llm_provider_model | llama3.2 | Model name |
genai-llm_provider_key | (empty) | API key (optional for local) |
genai-llm_cache_enabled | true | Enable semantic cache |
genai-llm_cache_similarity_threshold | 85 | Cache similarity threshold (0-100) |
genai-llm_timeout_ms | 30000 | Request timeout in milliseconds |
When using LLM bridge programmatically, you can configure retry behavior:
| Parameter | Default | Description |
|---|---|---|
max_retries | 3 | Maximum retry attempts for transient failures |
retry_backoff_ms | 1000 | Initial backoff in milliseconds |
retry_multiplier | 2.0 | Backoff multiplier for exponential backoff |
retry_max_backoff_ms | 30000 | Maximum backoff in milliseconds |
allow_cache | true | Enable semantic cache lookup |
LLM Bridge provides structured error information to help diagnose issues:
| Error Code | Description | HTTP Status |
|---|---|---|
ERR_API_KEY_MISSING | API key not configured | N/A |
ERR_API_KEY_INVALID | API key format is invalid | N/A |
ERR_TIMEOUT | Request timed out | N/A |
ERR_CONNECTION_FAILED | Network connection failed | 0 |
ERR_RATE_LIMITED | Rate limited by provider | 429 |
ERR_SERVER_ERROR | Server error | 500-599 |
ERR_EMPTY_RESPONSE | Empty response from LLM | N/A |
ERR_INVALID_RESPONSE | Malformed response from LLM | N/A |
ERR_VALIDATION_FAILED | Input validation failed | N/A |
ERR_UNKNOWN_PROVIDER | Invalid provider name | N/A |
ERR_REQUEST_TOO_LARGE | Request exceeds size limit | 413 |
Result Fields:
error_code: Structured error code (e.g., "ERR_API_KEY_MISSING")error_details: Detailed error context with query, provider, URLhttp_status_code: HTTP status code if applicableprovider_used: Which provider was attemptedEach LLM request generates a unique request ID for log correlation:
LLM [a1b2c3d4-e5f6-7890-abcd-ef1234567890]: REQUEST url=http://... model=llama3.2
LLM [a1b2c3d4-e5f6-7890-abcd-ef1234567890]: RESPONSE status=200 duration_ms=1234
This allows tracing a single request through all log lines for debugging.
/* LLM: */ Summarize this text: [long text...]
/* LLM: */ Write a Python function to check if a number is prime
/* LLM: */ Generate a SQL query to find duplicate users
/* LLM: */ Explain what this query does: SELECT * FROM orders WHERE status = 'pending'
/* LLM: */ Why is this query slow: SELECT * FROM users JOIN orders ON...
/* LLM: */ Analyze this CSV data and identify trends: [data...]
/* LLM: */ What insights can you derive from these sales figures?
/* LLM: */ Translate "Good morning" to French, German, and Spanish
/* LLM: */ Convert this SQL query to PostgreSQL dialect
/* LLM: */ Write documentation for this function: [code...]
/* LLM: */ Generate API documentation for the users endpoint
/* LLM: */ Review this code for security issues: [code...]
/* LLM: */ Suggest optimizations for this query
-- Get a summary
mysql> /* LLM: */ What is machine learning?
-- Generate code
mysql> /* LLM: */ Write a function to calculate fibonacci numbers in JavaScript
-- Explain concepts
mysql> /* LLM: */ Explain the difference between INNER JOIN and LEFT JOIN
-- Multi-step reasoning
mysql> /* LLM: */ Analyze the performance implications of using VARCHAR(255) vs TEXT in MySQL
-- Code with specific requirements
mysql> /* LLM: */ Write a Python script that reads a CSV file, filters rows where amount > 100, and outputs to JSON
-- Technical documentation
mysql> /* LLM: */ Create API documentation for a user registration endpoint with validation rules
LLM Bridge returns a resultset with:
| Column | Type | Description |
|---|---|---|
text_response | TEXT | LLM's text response |
explanation | TEXT | Which model was used |
cached | BOOLEAN | Whether from semantic cache |
error_code | TEXT | Structured error code (if error) |
error_details | TEXT | Detailed error context (if error) |
http_status_code | INT | HTTP status code (if applicable) |
provider | TEXT | Which provider was used |
Example successful response:
+-------------------------------------------------------------+----------------------+------+----------+
| text_response | explanation | cached | provider |
+-------------------------------------------------------------+----------------------+------+----------+
| Machine learning is a subset of artificial intelligence | Generated by llama3.2 | 0 | openai |
| that enables systems to learn from data... | | | |
+-------------------------------------------------------------+----------------------+------+----------+
Example error response:
+-----------------------------------------------------------------------+
| text_response |
+-----------------------------------------------------------------------+
| -- LLM processing failed |
| |
| error_code: ERR_API_KEY_MISSING |
| error_details: LLM processing failed: |
| Query: What is machine learning? |
| Provider: openai |
| URL: https://api.openai.com/v1/chat/completions |
| Error: API key not configured |
| |
| http_status_code: 0 |
| provider_used: openai |
+-----------------------------------------------------------------------+
Check AI module is initialized:
SELECT * FROM runtime_mysql_servers WHERE variable_name LIKE 'ai_%';
Verify LLM is accessible:
# For Ollama
curl http://localhost:11434/api/tags
# For cloud APIs, check your API keys
Check logs with request ID:
# Find all log lines for a specific request
tail -f proxysql.log | grep "LLM \[a1b2c3d4"
Check error details:
error_code for structured error typeerror_details for full context including query, provider, URLhttp_status_code for HTTP-level errors (429 = rate limit, 500+ = server error)LLM Bridge automatically retries on transient failures:
Default retry behavior:
Log output during retry:
LLM [request-id]: ERROR phase=llm error=Empty response status=0
LLM [request-id]: Retryable error (status=0), retrying in 1000ms (attempt 1/4)
LLM [request-id]: Request succeeded after 1 retries
Try a different model:
SET genai-llm_provider_model='llama3.2'; -- Faster than GPT-4
LOAD GENAI VARIABLES TO RUNTIME;
Use local Ollama for faster responses:
SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
LOAD GENAI VARIABLES TO RUNTIME;
Increase timeout for complex prompts:
SET genai-llm_timeout_ms=60000;
LOAD GENAI VARIABLES TO RUNTIME;
-- Check cache stats
SHOW STATUS LIKE 'llm_%';
-- Cache is automatically managed based on semantic similarity
-- Adjust similarity threshold if needed
SET genai-llm_cache_similarity_threshold=80; -- Lower = more matches
LOAD GENAI VARIABLES TO RUNTIME;
Monitor LLM bridge usage:
SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'llm_%';
Available status variables:
llm_total_requests - Total number of LLM requestsllm_cache_hits - Number of cache hitsllm_cache_misses - Number of cache missesllm_local_model_calls - Calls to local modelsllm_cloud_model_calls - Calls to cloud APIsllm_total_response_time_ms - Total response timellm_cache_total_lookup_time_ms - Total cache lookup timellm_cache_total_store_time_ms - Total cache store time| Operation | Typical Latency |
|---|---|
| Local Ollama | ~1-2 seconds |
| Cloud API | ~2-5 seconds |
| Cache hit | < 50ms |
Tips for better performance:
genai-llm_timeout_ms to limit wait timeIf you were using the old /* NL2SQL: */ prefix:
/* NL2SQL: */ to /* LLM: */genai-nl2sql_* to genai-llm_*sql_query, confidence columnstext_response, provider columnsai_nl2sql_convert MCP tool is deprecated and will return an error/* NL2SQL: */ Show top 10 customers by revenue
-- Returns: sql_query, confidence, explanation, cached
/* LLM: */ Show top 10 customers by revenue
-- Returns: text_response, explanation, cached, provider
For true NL2SQL functionality (schema-aware SQL generation with iteration), consider using external agents that can:
For complete API documentation, see API.md.
For system architecture details, see ARCHITECTURE.md.
For testing information, see TESTING.md.
This feature is part of ProxySQL and follows the same license.