# Braintrust Prompt Wrapper Server for LiteLLM
This directory contains a wrapper server that enables LiteLLM to use prompts from Braintrust through the generic prompt management API.
## Architecture

```
┌─────────────┐          ┌──────────────────────┐          ┌─────────────┐
│   LiteLLM   │ ──────>  │   Wrapper Server     │ ──────>  │  Braintrust │
│   Client    │          │   (This Server)      │          │     API     │
└─────────────┘          └──────────────────────┘          └─────────────┘
 Uses generic              Transforms Braintrust             Stores actual
 prompt manager            format to LiteLLM format          prompt templates
```
## Components

### 1. Generic Prompt Manager (`litellm/integrations/generic_prompt_management/`)

A generic client that can work with any API implementing the `/beta/litellm_prompt_management` endpoint.
Expected API Response Format:
```json
{
  "prompt_id": "string",
  "prompt_template": [
    {"role": "system", "content": "You are a helpful assistant"},
    {"role": "user", "content": "Hello {name}"}
  ],
  "prompt_template_model": "gpt-4",
  "prompt_template_optional_params": {
    "temperature": 0.7,
    "max_tokens": 100
  }
}
```
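For reference, the same shape could be expressed as Pydantic models. This is a sketch: the field names mirror the JSON above, while the class names are hypothetical.

```python
from typing import Any, Optional

from pydantic import BaseModel


class ChatMessage(BaseModel):
    role: str
    content: str


class LiteLLMPromptResponse(BaseModel):
    """Response body expected from /beta/litellm_prompt_management."""

    prompt_id: str
    prompt_template: list[ChatMessage]
    prompt_template_model: Optional[str] = None
    prompt_template_optional_params: dict[str, Any] = {}
```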
### 2. Braintrust Wrapper Server (`braintrust_prompt_wrapper_server.py`)

A FastAPI server that:

- Exposes the `/beta/litellm_prompt_management` endpoint
- Fetches prompt templates from the Braintrust API
- Transforms Braintrust's response format into the LiteLLM format shown above
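In outline, the endpoint could look like the sketch below. The Braintrust route shown and the `braintrust_to_litellm` helper (sketched under Response Transformation below) are assumptions, so treat this as an illustration rather than the actual file contents:

```python
import os

import httpx
from fastapi import FastAPI, Header, HTTPException

app = FastAPI()


@app.get("/beta/litellm_prompt_management")
async def get_prompt(prompt_id: str, authorization: str | None = Header(default=None)):
    # Prefer the caller's bearer token; fall back to the environment variable
    token = (authorization or "").removeprefix("Bearer ").strip() or os.getenv(
        "BRAINTRUST_API_KEY", ""
    )
    async with httpx.AsyncClient() as client:
        resp = await client.get(
            f"https://api.braintrust.dev/v1/prompt/{prompt_id}",  # assumed Braintrust route
            headers={"Authorization": f"Bearer {token}"},
        )
    if resp.status_code != 200:
        raise HTTPException(status_code=resp.status_code, detail=resp.text)
    # braintrust_to_litellm is sketched in the Response Transformation section below
    return braintrust_to_litellm(resp.json())
```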
## Setup

### 1. Install dependencies

```bash
pip install fastapi uvicorn httpx litellm
```
### 2. Set your Braintrust API key

```bash
export BRAINTRUST_API_KEY="your-braintrust-api-key"
```

### 3. Run the server

```bash
python braintrust_prompt_wrapper_server.py
```
The server starts on `http://localhost:8080` by default. You can customize the port and host:
```bash
export PORT=8000
export HOST=0.0.0.0
python braintrust_prompt_wrapper_server.py
```
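Presumably the entrypoint reads these variables along the following lines (an assumed sketch, not the actual file contents):

```python
import os

import uvicorn

if __name__ == "__main__":
    # HOST and PORT fall back to the documented defaults
    uvicorn.run(
        "braintrust_prompt_wrapper_server:app",
        host=os.getenv("HOST", "0.0.0.0"),
        port=int(os.getenv("PORT", "8080")),
    )
```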
## Usage

```python
import litellm
from litellm.integrations.generic_prompt_management import GenericPromptManager

# Configure the generic prompt manager to use your wrapper server
generic_config = {
    "api_base": "http://localhost:8080",
    "api_key": "your-braintrust-api-key",  # Will be passed to Braintrust
    "timeout": 30,
}

# Create the prompt manager
prompt_manager = GenericPromptManager(**generic_config)

# Use with completion
response = litellm.completion(
    model="generic_prompt/gpt-4",
    prompt_id="your-braintrust-prompt-id",
    prompt_variables={"name": "World"},  # Variables to substitute
    messages=[{"role": "user", "content": "Additional message"}],
)
print(response)
```
## Testing the API Directly

You can also test the wrapper API directly:
```bash
# Test with curl
curl -H "Authorization: Bearer YOUR_BRAINTRUST_TOKEN" \
  "http://localhost:8080/beta/litellm_prompt_management?prompt_id=YOUR_PROMPT_ID"

# Health check
curl http://localhost:8080/health

# Service info
curl http://localhost:8080/
```
## API Documentation

Once the server is running, visit:

- Swagger UI: `http://localhost:8080/docs`
- ReDoc: `http://localhost:8080/redoc`

## Response Transformation

The wrapper automatically transforms Braintrust's response format:
Braintrust API Response:
```json
{
  "id": "prompt-123",
  "prompt_data": {
    "prompt": {
      "type": "chat",
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant"
        }
      ]
    },
    "options": {
      "model": "gpt-4",
      "params": {
        "temperature": 0.7,
        "max_tokens": 100
      }
    }
  }
}
```
Transformed to LiteLLM Format:
```json
{
  "prompt_id": "prompt-123",
  "prompt_template": [
    {
      "role": "system",
      "content": "You are a helpful assistant"
    }
  ],
  "prompt_template_model": "gpt-4",
  "prompt_template_optional_params": {
    "temperature": 0.7,
    "max_tokens": 100
  }
}
```
## Parameter Mapping

The wrapper automatically maps these Braintrust parameters to LiteLLM:

- `temperature`
- `max_tokens` / `max_completion_tokens`
- `top_p`
- `frequency_penalty`
- `presence_penalty`
- `n`
- `stop`
- `response_format`
- `tool_choice`
- `function_call`
- `tools`
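Putting the two examples together, the transformation amounts to something like the following sketch (the function name is illustrative; consult `braintrust_prompt_wrapper_server.py` for the real logic):

```python
SUPPORTED_PARAMS = {
    "temperature", "max_tokens", "max_completion_tokens", "top_p",
    "frequency_penalty", "presence_penalty", "n", "stop",
    "response_format", "tool_choice", "function_call", "tools",
}


def braintrust_to_litellm(braintrust_response: dict) -> dict:
    """Map a Braintrust prompt object to the LiteLLM prompt management format."""
    prompt_data = braintrust_response.get("prompt_data", {})
    options = prompt_data.get("options", {})
    params = options.get("params", {})
    return {
        "prompt_id": braintrust_response["id"],
        "prompt_template": prompt_data.get("prompt", {}).get("messages", []),
        "prompt_template_model": options.get("model"),
        # Forward only the parameters listed above
        "prompt_template_optional_params": {
            k: v for k, v in params.items() if k in SUPPORTED_PARAMS
        },
    }
```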
## Variable Substitution

The generic prompt manager supports simple variable substitution:

```python
# In your Braintrust prompt:
#   "Hello {name}, welcome to {place}!"

# In your code:
prompt_variables = {
    "name": "Alice",
    "place": "Wonderland"
}

# Result:
#   "Hello Alice, welcome to Wonderland!"
```
Both `{variable}` and `{{variable}}` syntaxes are supported.
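A minimal sketch of how such substitution could be implemented (illustrative only; the actual `GenericPromptManager` logic may differ):

```python
import re


def substitute_variables(template: str, variables: dict) -> str:
    """Replace {var} and {{var}} placeholders with values from `variables`."""

    def replace(match: re.Match) -> str:
        name = match.group(1) or match.group(2)
        # Leave unknown placeholders untouched
        return str(variables.get(name, match.group(0)))

    # Match {{name}} first so it is not consumed as {name} plus stray braces
    return re.sub(r"\{\{(\w+)\}\}|\{(\w+)\}", replace, template)


print(substitute_variables(
    "Hello {name}, welcome to {{place}}!",
    {"name": "Alice", "place": "Wonderland"},
))
# -> "Hello Alice, welcome to Wonderland!"
```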
## Error Handling

The wrapper returns detailed error messages when a request fails.
## Production Deployment

For production use, consider containerizing the server. Example with Docker:
```dockerfile
FROM python:3.11-slim

WORKDIR /app

RUN pip install fastapi uvicorn httpx

COPY braintrust_prompt_wrapper_server.py .

ENV PORT=8080
ENV HOST=0.0.0.0

EXPOSE 8080

CMD ["python", "braintrust_prompt_wrapper_server.py"]
```
## Adapting to Other Providers

This pattern can be used with any prompt management provider: implement a wrapper server that exposes the `/beta/litellm_prompt_management` endpoint and transforms that provider's responses into the LiteLLM format shown above.

## Authentication

The wrapper accepts the Braintrust API key from either source:

- the `BRAINTRUST_API_KEY` environment variable
- the `Authorization: Bearer TOKEN` header of the incoming request

## License

This wrapper is part of the LiteLLM project and follows the same license.