litellm/llms/openai/speech/guardrail_translation/README.md
Handler for processing OpenAI's text-to-speech endpoint (/v1/audio/speech) with guardrails.
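This handler processes text-to-speech requests by applying guardrails to the request's input text before it is converted to audio. A typical request body looks like this: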
{
  "model": "tts-1",
  "input": "The quick brown fox jumped over the lazy dog.",
  "voice": "alloy",
  "response_format": "mp3",
  "speed": 1.0
}
The output is binary audio data (MP3, WAV, etc.), not text, so it cannot be guardrailed.
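The input-only flow described above can be sketched as follows. This is a minimal illustration rather than the handler's actual code: `apply_guardrail_to_speech_request`, `passthrough_audio`, and the toy `redact_fox` guardrail are hypothetical names, and a guardrail is assumed here to be any callable that takes text and returns (possibly modified) text.

```python
from typing import Callable

# Hypothetical type: a guardrail is modeled as a plain text -> text callable.
Guardrail = Callable[[str], str]


def apply_guardrail_to_speech_request(request: dict, guardrail: Guardrail) -> dict:
    """Return a copy of the request with the guardrail applied to `input` only."""
    updated = dict(request)  # shallow copy so the caller's dict is untouched
    text = updated.get("input")
    if isinstance(text, str):
        updated["input"] = guardrail(text)
    return updated


def passthrough_audio(audio: bytes) -> bytes:
    """Audio output is binary, so it is returned unchanged."""
    return audio


def redact_fox(text: str) -> str:
    # Toy guardrail used only for this illustration.
    return text.replace("fox", "[REDACTED]")


request = {
    "model": "tts-1",
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
}
guarded = apply_guardrail_to_speech_request(request, redact_fox)
print(guarded["input"])
```

Only the `input` field changes; `model`, `voice`, and the other request fields pass through untouched, and the audio bytes are never inspected.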
The handler is automatically discovered and applied when guardrails are used with the text-to-speech endpoint.
curl -X POST 'http://localhost:4000/v1/audio/speech' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "tts-1",
    "input": "The quick brown fox jumped over the lazy dog.",
    "voice": "alloy",
    "guardrails": ["content_moderation"]
  }' \
  --output speech.mp3
The guardrail will be applied to the input text before the text-to-speech conversion.
curl -X POST 'http://localhost:4000/v1/audio/speech' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "tts-1",
    "input": "Please call John Doe at john.doe@example.com",
    "voice": "nova",
    "guardrails": ["mask_pii"]
  }' \
  --output speech.mp3
The audio will say: "Please call [NAME_REDACTED] at [EMAIL_REDACTED]"
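To make the masking step concrete, here is a rough sketch of the kind of text transformation a PII guardrail might perform before the text reaches the speech model. It is not how any particular guardrail provider works: the `mask_pii` function, the simplified email pattern, and the `known_names` list are all assumptions for illustration (real detectors typically use entity recognition rather than a fixed name list).

```python
import re

# Simplified email pattern for illustration; real PII detectors are more thorough.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def mask_pii(text: str, known_names: list[str]) -> str:
    """Replace email addresses and any listed names with redaction placeholders."""
    masked = EMAIL_PATTERN.sub("[EMAIL_REDACTED]", text)
    for name in known_names:
        masked = masked.replace(name, "[NAME_REDACTED]")
    return masked


masked_text = mask_pii(
    "Please call John Doe at john.doe@example.com",
    known_names=["John Doe"],
)
print(masked_text)
```

The masked string, not the original, is what would be sent on for speech synthesis.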
curl -X POST 'http://localhost:4000/v1/audio/speech' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-key' \
  -d '{
    "model": "tts-1-hd",
    "input": "This is the text that will be spoken",
    "voice": "shimmer",
    "guardrails": ["content_filter"]
  }' \
  --output speech.mp3
Field processed by guardrails:
- input (string)

Override these methods to customize behavior:
- process_input_messages(): Customize how input text is processed
- process_output_response(): Currently a no-op, but can be overridden if needed

Supported call types:
- CallTypes.speech - Synchronous text-to-speech
- CallTypes.aspeech - Asynchronous text-to-speech

import litellm
from pathlib import Path

# Write the generated audio next to this script
speech_file_path = Path(__file__).parent / "speech.mp3"
response = litellm.speech(
    model="tts-1",
    voice="alloy",
    input="Hi, this is John Doe calling from john.doe@example.com",
    guardrails=["mask_pii"],
)
response.stream_to_file(speech_file_path)
# The generated audio speaks the masked text, not the original PII
import litellm
from pathlib import Path
speech_file_path = Path(__file__).parent / "speech.mp3"
response = litellm.speech(
    model="tts-1-hd",
    voice="nova",
    input="Your text here",
    guardrails=["content_moderation"],
)
response.stream_to_file(speech_file_path)
import litellm
import asyncio
from pathlib import Path

async def generate_speech():
    # Write the generated audio next to this script
    speech_file_path = Path(__file__).parent / "speech.mp3"
    response = await litellm.aspeech(
        model="tts-1",
        voice="echo",
        input="Text to convert to speech",
        guardrails=["pii_mask"],
    )
    response.stream_to_file(speech_file_path)


asyncio.run(generate_speech())
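
As an illustration of the override points named earlier (`process_input_messages()` and `process_output_response()`), the sketch below shows how a subclass might customize input processing. Everything here is an assumption made for demonstration: `SpeechHandlerSketch` is a self-contained stand-in rather than the real LiteLLM handler class, the method signatures are guesses, and the guardrail is modeled as a simple async text-to-text callable. Check the actual base class before adapting this.

```python
import asyncio


class SpeechHandlerSketch:
    """Hypothetical stand-in for the handler; not the real LiteLLM class."""

    async def process_input_messages(self, data: dict, guardrail) -> dict:
        # Apply the guardrail (assumed: an async text -> text callable) to `input`.
        text = data.get("input")
        if isinstance(text, str):
            data["input"] = await guardrail(text)
        return data

    async def process_output_response(self, response, guardrail):
        # No-op: the response is binary audio, so it is returned unchanged.
        return response


class TrimmingSpeechHandler(SpeechHandlerSketch):
    """Example override: normalize whitespace before the guardrail runs."""

    async def process_input_messages(self, data: dict, guardrail) -> dict:
        if isinstance(data.get("input"), str):
            data["input"] = " ".join(data["input"].split())
        return await super().process_input_messages(data, guardrail)


async def uppercase_guardrail(text: str) -> str:
    # Toy guardrail for demonstration only.
    return text.upper()


async def main() -> dict:
    handler = TrimmingSpeechHandler()
    data = {"model": "tts-1", "input": "  hello   world ", "voice": "echo"}
    return await handler.process_input_messages(data, uppercase_guardrail)


result = asyncio.run(main())
print(result["input"])
```

The subclass does its own preprocessing and then defers to the parent, so the guardrail still sees every request; `process_output_response()` stays a pass-through because there is no text to inspect in the audio.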