# LLM Bridge - Generic LLM Access for ProxySQL

## Overview

LLM Bridge is a ProxySQL feature that provides generic access to Large Language Models (LLMs) through the MySQL protocol. It allows you to send any prompt to an LLM and receive the response as a MySQL resultset.

**Note:** This feature was previously called "NL2SQL" (Natural Language to SQL) but has been converted to a generic LLM bridge. Future NL2SQL functionality will be implemented as a Web UI using external agents (Claude Code + MCP server).

## Features

- **Generic Provider Support**: Works with any OpenAI-compatible or Anthropic-compatible endpoint
- **Semantic Caching**: Vector-based cache for similar prompts using sqlite-vec
- **Multi-Provider**: Switch between LLM providers seamlessly
- **Versatile**: Use LLMs for summarization, code generation, translation, analysis, etc.

**Supported Endpoints:**

- Ollama (via its OpenAI-compatible /v1/chat/completions endpoint)
- OpenAI
- Anthropic
- vLLM
- LM Studio
- Z.ai
- Any other OpenAI-compatible or Anthropic-compatible endpoint

## Quick Start

### 1. Enable LLM Bridge

```sql
-- Via admin interface
SET genai-llm_enabled='true';
LOAD GENAI VARIABLES TO RUNTIME;
```
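
To confirm the change took effect, you can inspect the runtime variables from the Admin interface. A minimal check, assuming the `genai-*` variables are exposed through `runtime_global_variables` like ProxySQL's other variable classes:

```sql
-- Assumption: genai-* variables appear in runtime_global_variables
SELECT variable_name, variable_value
FROM runtime_global_variables
WHERE variable_name = 'genai-llm_enabled';
```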

### 2. Configure LLM Provider

ProxySQL uses a generic provider configuration that supports any OpenAI-compatible or Anthropic-compatible endpoint.

**Using Ollama (default):**

Ollama is used via its OpenAI-compatible endpoint:

```sql
SET genai-llm_provider='openai';
SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
SET genai-llm_provider_model='llama3.2';
SET genai-llm_provider_key='';  -- Empty for local Ollama
LOAD GENAI VARIABLES TO RUNTIME;
```

**Using OpenAI:**

```sql
SET genai-llm_provider='openai';
SET genai-llm_provider_url='https://api.openai.com/v1/chat/completions';
SET genai-llm_provider_model='gpt-4';
SET genai-llm_provider_key='sk-...';  -- Your OpenAI API key
LOAD GENAI VARIABLES TO RUNTIME;
```

**Using Anthropic:**

```sql
SET genai-llm_provider='anthropic';
SET genai-llm_provider_url='https://api.anthropic.com/v1/messages';
SET genai-llm_provider_model='claude-3-opus-20240229';
SET genai-llm_provider_key='sk-ant-...';  -- Your Anthropic API key
LOAD GENAI VARIABLES TO RUNTIME;
```

**Using any OpenAI-compatible endpoint:**

This works with any OpenAI-compatible API (vLLM, LM Studio, Z.ai, etc.):

```sql
SET genai-llm_provider='openai';
SET genai-llm_provider_url='https://your-endpoint.com/v1/chat/completions';
SET genai-llm_provider_model='your-model-name';
SET genai-llm_provider_key='your-api-key';  -- May be left empty for local endpoints
LOAD GENAI VARIABLES TO RUNTIME;
```

### 3. Use the LLM Bridge

Once configured, you can send prompts using the `/* LLM: */` prefix:

```sql
-- Summarize text
mysql> /* LLM: */ Summarize the customer feedback from last week

-- Explain SQL queries
mysql> /* LLM: */ Explain this query: SELECT COUNT(*) FROM users WHERE active = 1

-- Generate code
mysql> /* LLM: */ Generate a Python function to validate email addresses

-- Translate text
mysql> /* LLM: */ Translate "Hello world" to Spanish

-- Analyze data
mysql> /* LLM: */ Analyze the following sales data and provide insights
```

**Important:** LLM queries are executed in the MySQL module (your regular SQL client connection, port 6033 by default), not in the ProxySQL Admin interface (port 6032). The Admin interface is only for configuration.

## Response Format

The LLM Bridge returns a resultset with the following columns:

| Column | Description |
|--------|-------------|
| text_response | The LLM's text response |
| explanation | Which model/provider generated the response |
| cached | Whether the response was from cache (true/false) |
| provider | The provider used (openai/anthropic) |

## Configuration Variables

| Variable | Default | Description |
|----------|---------|-------------|
| genai-llm_enabled | false | Master enable for LLM bridge |
| genai-llm_provider | openai | Provider type (openai/anthropic) |
| genai-llm_provider_url | http://localhost:11434/v1/chat/completions | LLM endpoint URL |
| genai-llm_provider_model | llama3.2 | Model name |
| genai-llm_provider_key | (empty) | API key (optional for local) |
| genai-llm_cache_enabled | true | Enable semantic cache |
| genai-llm_cache_similarity_threshold | 85 | Cache similarity threshold (0-100) |
| genai-llm_timeout_ms | 30000 | Request timeout in milliseconds |
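
As one consolidated example, all of these can be set in a single Admin session. A hedged sketch; the final `SAVE GENAI VARIABLES TO DISK` statement is an assumption that GenAI variables follow ProxySQL's usual persistence pattern and may differ in your build:

```sql
SET genai-llm_enabled='true';
SET genai-llm_provider='openai';
SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
SET genai-llm_provider_model='llama3.2';
SET genai-llm_provider_key='';
SET genai-llm_cache_enabled='true';
SET genai-llm_cache_similarity_threshold=85;
SET genai-llm_timeout_ms=30000;
LOAD GENAI VARIABLES TO RUNTIME;
SAVE GENAI VARIABLES TO DISK;  -- assumed to persist across restarts, as with other variable classes
```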

## Request Configuration (Advanced)

When using the LLM bridge programmatically, you can configure retry behavior:

| Parameter | Default | Description |
|-----------|---------|-------------|
| max_retries | 3 | Maximum retry attempts for transient failures |
| retry_backoff_ms | 1000 | Initial backoff in milliseconds |
| retry_multiplier | 2.0 | Backoff multiplier for exponential backoff |
| retry_max_backoff_ms | 30000 | Maximum backoff in milliseconds |
| allow_cache | true | Enable semantic cache lookup |
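
With these parameters, the delay before retry *n* is `retry_backoff_ms * retry_multiplier^(n-1)`, capped at `retry_max_backoff_ms`; with the defaults that gives 1000 ms, 2000 ms, and 4000 ms for the three retries. A small illustration of the arithmetic, runnable against any MySQL server (this is plain math, not a ProxySQL feature):

```sql
-- Illustration only: exponential backoff schedule for the default parameters
SELECT n AS attempt,
       LEAST(1000 * POW(2.0, n - 1), 30000) AS backoff_ms
FROM (SELECT 1 AS n UNION ALL SELECT 2 UNION ALL SELECT 3) AS attempts;
-- attempt 1 -> 1000 ms, attempt 2 -> 2000 ms, attempt 3 -> 4000 ms
```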

## Error Handling

LLM Bridge provides structured error information to help diagnose issues:

| Error Code | Description | HTTP Status |
|------------|-------------|-------------|
| ERR_API_KEY_MISSING | API key not configured | N/A |
| ERR_API_KEY_INVALID | API key format is invalid | N/A |
| ERR_TIMEOUT | Request timed out | N/A |
| ERR_CONNECTION_FAILED | Network connection failed | 0 |
| ERR_RATE_LIMITED | Rate limited by provider | 429 |
| ERR_SERVER_ERROR | Server error | 500-599 |
| ERR_EMPTY_RESPONSE | Empty response from LLM | N/A |
| ERR_INVALID_RESPONSE | Malformed response from LLM | N/A |
| ERR_VALIDATION_FAILED | Input validation failed | N/A |
| ERR_UNKNOWN_PROVIDER | Invalid provider name | N/A |
| ERR_REQUEST_TOO_LARGE | Request exceeds size limit | 413 |

**Result Fields:**

- `error_code`: Structured error code (e.g., "ERR_API_KEY_MISSING")
- `error_details`: Detailed error context with query, provider, URL
- `http_status_code`: HTTP status code if applicable
- `provider_used`: Which provider was attempted

## Request Correlation

Each LLM request generates a unique request ID for log correlation:

```
LLM [a1b2c3d4-e5f6-7890-abcd-ef1234567890]: REQUEST url=http://... model=llama3.2
LLM [a1b2c3d4-e5f6-7890-abcd-ef1234567890]: RESPONSE status=200 duration_ms=1234
```

This allows tracing a single request through all log lines for debugging.

## Use Cases

### 1. Text Summarization

```sql
/* LLM: */ Summarize this text: [long text...]
```

### 2. Code Generation

```sql
/* LLM: */ Write a Python function to check if a number is prime
/* LLM: */ Generate a SQL query to find duplicate users
```

### 3. Query Explanation

```sql
/* LLM: */ Explain what this query does: SELECT * FROM orders WHERE status = 'pending'
/* LLM: */ Why is this query slow: SELECT * FROM users JOIN orders ON...
```

### 4. Data Analysis

```sql
/* LLM: */ Analyze this CSV data and identify trends: [data...]
/* LLM: */ What insights can you derive from these sales figures?
```

### 5. Translation

```sql
/* LLM: */ Translate "Good morning" to French, German, and Spanish
/* LLM: */ Convert this SQL query to PostgreSQL dialect
```

### 6. Documentation

```sql
/* LLM: */ Write documentation for this function: [code...]
/* LLM: */ Generate API documentation for the users endpoint
```

### 7. Code Review

```sql
/* LLM: */ Review this code for security issues: [code...]
/* LLM: */ Suggest optimizations for this query
```

## Examples

### Basic Usage

```sql
-- Get a summary
mysql> /* LLM: */ What is machine learning?

-- Generate code
mysql> /* LLM: */ Write a function to calculate fibonacci numbers in JavaScript

-- Explain concepts
mysql> /* LLM: */ Explain the difference between INNER JOIN and LEFT JOIN
```

### Complex Prompts

```sql
-- Multi-step reasoning
mysql> /* LLM: */ Analyze the performance implications of using VARCHAR(255) vs TEXT in MySQL

-- Code with specific requirements
mysql> /* LLM: */ Write a Python script that reads a CSV file, filters rows where amount > 100, and outputs to JSON

-- Technical documentation
mysql> /* LLM: */ Create API documentation for a user registration endpoint with validation rules
```

### Results

LLM Bridge returns a resultset with:

| Column | Type | Description |
|--------|------|-------------|
| text_response | TEXT | LLM's text response |
| explanation | TEXT | Which model was used |
| cached | BOOLEAN | Whether from semantic cache |
| error_code | TEXT | Structured error code (if error) |
| error_details | TEXT | Detailed error context (if error) |
| http_status_code | INT | HTTP status code (if applicable) |
| provider | TEXT | Which provider was used |

Example successful response:

```
+---------------------------------------------------------+-----------------------+--------+----------+
| text_response                                           | explanation           | cached | provider |
+---------------------------------------------------------+-----------------------+--------+----------+
| Machine learning is a subset of artificial intelligence | Generated by llama3.2 |      0 | openai   |
| that enables systems to learn from data...              |                       |        |          |
+---------------------------------------------------------+-----------------------+--------+----------+
```

Example error response:

```
+---------------------------------------------------+
| text_response                                     |
+---------------------------------------------------+
| -- LLM processing failed                          |
|                                                   |
| error_code: ERR_API_KEY_MISSING                   |
| error_details: LLM processing failed:             |
|   Query: What is machine learning?                |
|   Provider: openai                                |
|   URL: https://api.openai.com/v1/chat/completions |
|   Error: API key not configured                   |
|                                                   |
| http_status_code: 0                               |
| provider_used: openai                             |
+---------------------------------------------------+
```

## Troubleshooting

### LLM Bridge returns an empty result

1. Check that the GenAI variables are loaded at runtime:

   ```sql
   SELECT * FROM runtime_global_variables WHERE variable_name LIKE 'genai-%';
   ```

2. Verify the LLM is accessible:

   ```bash
   # For Ollama
   curl http://localhost:11434/api/tags

   # For cloud APIs, check your API keys
   ```

3. Check logs with the request ID:

   ```bash
   # Find all log lines for a specific request
   tail -f proxysql.log | grep "LLM \[a1b2c3d4"
   ```

4. Check error details:

   - Review `error_code` for the structured error type
   - Review `error_details` for full context including query, provider, URL
   - Review `http_status_code` for HTTP-level errors (429 = rate limit, 500+ = server error)

### Retry Behavior

LLM Bridge automatically retries on transient failures:

- Rate limiting (HTTP 429): retries with exponential backoff
- Server errors (500-504): retries with exponential backoff
- Network errors: retries with exponential backoff

Default retry behavior:

- Maximum retries: 3
- Initial backoff: 1000ms
- Multiplier: 2.0x
- Maximum backoff: 30000ms

Log output during retry:

```
LLM [request-id]: ERROR phase=llm error=Empty response status=0
LLM [request-id]: Retryable error (status=0), retrying in 1000ms (attempt 1/4)
LLM [request-id]: Request succeeded after 1 retries
```

### Slow Responses

1. Try a different model:

   ```sql
   SET genai-llm_provider_model='llama3.2';  -- Faster than GPT-4
   LOAD GENAI VARIABLES TO RUNTIME;
   ```

2. Use local Ollama for faster responses:

   ```sql
   SET genai-llm_provider_url='http://localhost:11434/v1/chat/completions';
   LOAD GENAI VARIABLES TO RUNTIME;
   ```

3. Increase the timeout for complex prompts:

   ```sql
   SET genai-llm_timeout_ms=60000;
   LOAD GENAI VARIABLES TO RUNTIME;
   ```

### Cache Issues

```sql
-- Check cache stats
SHOW STATUS LIKE 'llm_%';

-- Cache is automatically managed based on semantic similarity
-- Adjust the similarity threshold if needed
SET genai-llm_cache_similarity_threshold=80;  -- Lower = more matches
LOAD GENAI VARIABLES TO RUNTIME;
```

## Status Variables

Monitor LLM bridge usage:

```sql
SELECT * FROM stats_mysql_global WHERE variable_name LIKE 'llm_%';
```

Available status variables:

- `llm_total_requests` - Total number of LLM requests
- `llm_cache_hits` - Number of cache hits
- `llm_cache_misses` - Number of cache misses
- `llm_local_model_calls` - Calls to local models
- `llm_cloud_model_calls` - Calls to cloud APIs
- `llm_total_response_time_ms` - Total response time
- `llm_cache_total_lookup_time_ms` - Total cache lookup time
- `llm_cache_total_store_time_ms` - Total cache store time
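
Since these are raw counters, derived metrics have to be computed client-side. A hedged sketch that derives a cache hit ratio and average response time, assuming the counters are exposed in `stats_mysql_global` as listed above:

```sql
-- Sketch: derive cache hit ratio and average response time from the raw counters
SELECT
  SUM(CASE WHEN variable_name = 'llm_cache_hits' THEN variable_value END) * 1.0 /
  NULLIF(SUM(CASE WHEN variable_name = 'llm_total_requests' THEN variable_value END), 0)
    AS cache_hit_ratio,
  SUM(CASE WHEN variable_name = 'llm_total_response_time_ms' THEN variable_value END) * 1.0 /
  NULLIF(SUM(CASE WHEN variable_name = 'llm_total_requests' THEN variable_value END), 0)
    AS avg_response_ms
FROM stats_mysql_global
WHERE variable_name IN ('llm_cache_hits', 'llm_total_requests', 'llm_total_response_time_ms');
```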

## Performance

| Operation | Typical Latency |
|-----------|-----------------|
| Local Ollama | ~1-2 seconds |
| Cloud API | ~2-5 seconds |
| Cache hit | < 50 ms |

Tips for better performance:

- Use local Ollama for faster responses
- Enable caching for repeated prompts
- Use `genai-llm_timeout_ms` to limit wait time
- Consider pre-warming the cache with common prompts (see the sketch below)
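
One way to pre-warm is simply to issue your most common prompts once from a regular client session, so that later, semantically similar prompts are served from the cache. The prompts below are illustrative, not special:

```sql
-- Run once from the MySQL module (not the Admin interface) to seed the semantic cache
/* LLM: */ Explain the difference between INNER JOIN and LEFT JOIN
/* LLM: */ Explain what a covering index is in MySQL
```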

## Migration from NL2SQL

If you were using the old `/* NL2SQL: */` prefix:

1. Update your queries from `/* NL2SQL: */` to `/* LLM: */`
2. Update configuration variables from `genai-nl2sql_*` to `genai-llm_*`
3. Note that the response format has changed:
   - Removed: `sql_query`, `confidence` columns
   - Added: `text_response`, `provider` columns
4. The `ai_nl2sql_convert` MCP tool is deprecated and will return an error

Old NL2SQL usage:

```sql
/* NL2SQL: */ Show top 10 customers by revenue
-- Returns: sql_query, confidence, explanation, cached
```

New LLM Bridge usage:

```sql
/* LLM: */ Show top 10 customers by revenue
-- Returns: text_response, explanation, cached, provider
```

For true NL2SQL functionality (schema-aware SQL generation with iteration), consider using external agents that can:

  1. Analyze your database schema
  2. Iterate on query refinement
  3. Validate generated queries
  4. Execute and review results

## Security

### Important Notes

- LLM responses are NOT executed automatically
- Text responses are returned for review
- Always validate generated code before execution
- Keep API keys secure (use environment variables)

### Best Practices

1. **Review generated code**: Always check output before running
2. **Use read-only accounts**: Test with limited permissions first
3. **Keep API keys secure**: Don't commit them to version control
4. **Use caching wisely**: Balance speed vs. data freshness
5. **Monitor usage**: Check status variables regularly

## API Reference

For complete API documentation, see API.md.

## Architecture

For system architecture details, see ARCHITECTURE.md.

## Testing

For testing information, see TESTING.md.

## License

This feature is part of ProxySQL and follows the same license.