examples/provider-elevenlabs/tts-advanced/README.md
This example demonstrates advanced TTS capabilities: custom pronunciation rules, voice design, and voice remixing. Run it with:
npx promptfoo@latest init --example provider-elevenlabs/tts-advanced
cd provider-elevenlabs/tts-advanced
export ELEVENLABS_API_KEY=your_api_key_here
npx promptfoo@latest eval
Control how technical terms, acronyms, and brand names are pronounced.
Use Case: Technical documentation, product demos, brand-specific content
providers:
  - id: elevenlabs:tts
    config:
      pronunciationRules:
        # Spell out acronyms
        - word: API
          pronunciation: A-P-I
        # Custom pronunciation
        - word: SQL
          pronunciation: sequel
        # Multi-word terms
        - word: PostgreSQL
          pronunciation: post-gres-Q-L
        # Brand names
        - word: OpenAI
          pronunciation: open-A-I
Common Use Cases:
Technical Content
pronunciationRules:
  - word: JavaScript
    pronunciation: java-script
  - word: TypeScript
    pronunciation: type-script
  - word: Python
    pronunciation: pie-thon
  - word: Node.js
    pronunciation: node-jay-ess
  - word: GraphQL
    pronunciation: graph-Q-L
Medical/Scientific Terms
pronunciationRules:
  - word: COVID-19
    pronunciation: covid-nineteen
  - word: mRNA
    pronunciation: messenger-R-N-A
  - word: DNA
    pronunciation: D-N-A
Brand Names & Products
pronunciationRules:
  - word: Anthropic
    pronunciation: an-throw-pick
  - word: Llama
    pronunciation: lama
  - word: ChatGPT
    pronunciation: chat-G-P-T
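
To make the effect of a rule list concrete, the TypeScript sketch below previews how input text would read if each `word` were swapped for its `pronunciation`. It is an illustration only, assuming rules behave like whole-word, case-sensitive substitutions. `previewPronunciation` and `escapeRegExp` are hypothetical helpers, not part of promptfoo or ElevenLabs, and the provider may apply rules differently (for example, through a pronunciation dictionary).

```typescript
// Mirrors the two required fields of a pronunciation rule in this README.
interface PronunciationRule {
  word: string;
  pronunciation: string;
}

// Escape regex metacharacters so terms like "Node.js" match literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: replaces every whole-word occurrence of a rule's
// `word` with its `pronunciation` in a single pass over the text.
function previewPronunciation(text: string, rules: PronunciationRule[]): string {
  if (rules.length === 0) {
    return text;
  }
  const lookup = new Map<string, string>();
  for (const rule of rules) {
    lookup.set(rule.word, rule.pronunciation);
  }
  // Longest terms first, so a longer term wins over a shorter one at the same position.
  const alternatives = Array.from(lookup.keys())
    .sort((a, b) => b.length - a.length)
    .map(escapeRegExp)
    .join("|");
  // Assumption: a "word" is delimited by anything other than ASCII letters or digits.
  const pattern = new RegExp(`(?<![A-Za-z0-9])(?:${alternatives})(?![A-Za-z0-9])`, "g");
  return text.replace(pattern, (match) => lookup.get(match) ?? match);
}
```

Because all rules are applied in one pass, replaced text is never matched again, and a term embedded in a longer word (such as `SQL` inside `PostgreSQL`) is left for the longer rule to handle.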
Generate custom voices from natural language descriptions.
Use Case: Create unique voices for specific content types or brand identities
providers:
  - id: elevenlabs:tts
    config:
      voiceDesign:
        description: A warm, professional voice with excellent clarity and a slight smile in the tone, perfect for technical documentation
        gender: female
        age: middle_aged
        accent: american
        accentStrength: 0.5 # 0-2, subtle to strong
Voice Design Templates:
# Corporate Presenter
voiceDesign:
  description: A confident, authoritative voice with clear articulation, perfect for business presentations
  gender: male
  age: middle_aged
  accent: american

# Educational Instructor
voiceDesign:
  description: A warm, patient voice with excellent clarity, ideal for educational content
  gender: female
  age: middle_aged
  accent: british

# Customer Service
voiceDesign:
  description: A friendly, approachable voice with a smile in the tone, great for customer interactions
  gender: female
  age: young
  accent: american

# Podcast Host
voiceDesign:
  description: A casual, engaging voice with natural conversational flow, perfect for podcasts
  gender: male
  age: young
  accent: australian

# Audiobook Narrator
voiceDesign:
  description: A deep, resonant voice with storytelling quality and emotional range
  gender: male
  age: middle_aged
  accent: british

# Meditation Guide
voiceDesign:
  description: A soothing, tranquil voice with calming tones and gentle pacing
  gender: female
  age: middle_aged
  accent: american
  accentStrength: 0.3
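
Since `accentStrength` is documented with a 0-2 range, it can help to check a `voiceDesign` block before spending characters on a request. The following TypeScript sketch shows one possible pre-flight check. `validateVoiceDesign` is a hypothetical helper, and the rule that `description` must be non-empty is an assumption rather than documented provider behavior.

```typescript
// Shape mirrors the voiceDesign fields used in this README.
interface VoiceDesign {
  description: string;
  gender?: "male" | "female";
  age?: "young" | "middle_aged" | "old";
  accent?: string;
  accentStrength?: number; // documented range: 0-2
}

// Hypothetical helper: returns human-readable problems.
// An empty array means no problems were found by these checks.
function validateVoiceDesign(design: VoiceDesign): string[] {
  const problems: string[] = [];

  // Assumption: a blank description gives the service nothing to work with.
  if (design.description.trim().length === 0) {
    problems.push("description must not be empty");
  }

  const strength = design.accentStrength;
  if (strength !== undefined) {
    if (!Number.isFinite(strength) || strength < 0 || strength > 2) {
      problems.push(`accentStrength must be between 0 and 2, got ${strength}`);
    }
  }

  return problems;
}
```

For the Meditation Guide template above, `validateVoiceDesign` returns an empty array, while a config with `accentStrength: 2.5` yields a single range problem.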
Modify existing voices to change their characteristics.
Use Case: Adapt pre-made voices for different contexts or emotions
providers:
  # Make a voice more energetic
  - id: elevenlabs:tts:energetic
    config:
      voiceId: 21m00Tcm4TlvDq8ikWAM # Rachel
      voiceRemix:
        style: energetic
        pacing: fast
        promptStrength: medium # low, medium, high, max
  # Make a voice calmer and slower
  - id: elevenlabs:tts:calm
    config:
      voiceId: 21m00Tcm4TlvDq8ikWAM
      voiceRemix:
        style: calm
        pacing: slow
        promptStrength: high
Remix Parameters:
| Parameter | Options | Use Case |
|---|---|---|
| style | energetic, calm, professional, casual, dramatic | Match voice to content mood |
| pacing | slow, normal, fast | Adjust speech speed |
| gender | male, female | Change voice gender |
| age | young, middle_aged, old | Adjust perceived age |
| accent | american, british, australian, etc. | Change accent |
| promptStrength | low, medium, high, max | How strongly to apply changes |
Common Remix Scenarios:
# Sports Commentary (Energetic & Fast)
voiceRemix:
  style: energetic
  pacing: fast
  promptStrength: max

# ASMR Content (Calm & Slow)
voiceRemix:
  style: calm
  pacing: slow
  promptStrength: high

# News Anchor (Professional & Measured)
voiceRemix:
  style: professional
  pacing: normal
  promptStrength: medium

# Storytelling (Dramatic & Expressive)
voiceRemix:
  style: dramatic
  pacing: normal
  promptStrength: high
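
When several remix variants of one voice are compared side by side, generating the provider entries from named presets avoids copy-paste drift. The TypeScript sketch below is one way to do that. `remixPresets` and `buildRemixProviders` are hypothetical names, and the preset keys (`sports`, `asmr`, `news`, `storytelling`) are labels chosen for illustration. The returned objects follow the same `id` / `config` shape as the `providers` entries in this README.

```typescript
interface VoiceRemix {
  style?: string;
  pacing?: "slow" | "normal" | "fast";
  promptStrength?: "low" | "medium" | "high" | "max";
}

interface ProviderEntry {
  id: string;
  config: { voiceId: string; voiceRemix: VoiceRemix };
}

// Presets copied from the remix scenarios listed above.
const remixPresets = new Map<string, VoiceRemix>([
  ["sports", { style: "energetic", pacing: "fast", promptStrength: "max" }],
  ["asmr", { style: "calm", pacing: "slow", promptStrength: "high" }],
  ["news", { style: "professional", pacing: "normal", promptStrength: "medium" }],
  ["storytelling", { style: "dramatic", pacing: "normal", promptStrength: "high" }],
]);

// Hypothetical helper: builds one provider entry per preset name,
// labelling each id as elevenlabs:tts:<preset>.
function buildRemixProviders(voiceId: string, presetNames: string[]): ProviderEntry[] {
  return presetNames.map((name) => {
    const remix = remixPresets.get(name);
    if (remix === undefined) {
      throw new Error(`Unknown remix preset: ${name}`);
    }
    // Copy the preset so callers can modify an entry without changing the table.
    return { id: `elevenlabs:tts:${name}`, config: { voiceId, voiceRemix: { ...remix } } };
  });
}
```

Calling `buildRemixProviders("21m00Tcm4TlvDq8ikWAM", ["sports", "asmr"])` produces two entries with ids `elevenlabs:tts:sports` and `elevenlabs:tts:asmr`, and an unknown preset name fails fast instead of silently producing an unremixed voice.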
Combine real-time streaming with custom pronunciation:
providers:
  - id: elevenlabs:tts
    config:
      streaming: true
      pronunciationRules:
        - word: API
          pronunciation: A-P-I
        - word: WebSocket
          pronunciation: web-socket
Benefits:
Create a custom voice with domain-specific pronunciation:
providers:
  - id: elevenlabs:tts
    config:
      voiceDesign:
        description: A friendly tech educator with clear pronunciation
        gender: female
        age: middle_aged
      pronunciationRules:
        - word: Python
          pronunciation: pie-thon
        - word: JavaScript
          pronunciation: java-script
All advanced features use the same character-based pricing as basic TTS:
Cost Tracking:
tests:
  - assert:
      - type: cost
        threshold: 0.05 # Max $0.05 per test
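
Because pricing is character-based, a rough estimate can be made before running an eval. The TypeScript sketch below shows the arithmetic only. `estimateCost` and `withinBudget` are hypothetical helpers, and the per-1,000-character rate is a placeholder you must supply from your own plan, since this README does not state one.

```typescript
// Hypothetical helper: estimated cost in USD for synthesizing `text`.
// Assumption: billing is proportional to the number of Unicode code points;
// the provider's actual character count may differ.
function estimateCost(text: string, usdPerThousandChars: number): number {
  if (!Number.isFinite(usdPerThousandChars) || usdPerThousandChars < 0) {
    throw new RangeError("usdPerThousandChars must be a non-negative number");
  }
  const characters = Array.from(text).length;
  return (characters / 1000) * usdPerThousandChars;
}

// True when the estimate stays at or below a budget such as the
// 0.05 threshold used in the cost assertion above.
function withinBudget(text: string, usdPerThousandChars: number, thresholdUsd: number): boolean {
  return estimateCost(text, usdPerThousandChars) <= thresholdUsd;
}
```

With a placeholder rate of 0.30 USD per 1,000 characters, a 100-character prompt is estimated at 0.03 USD and fits under a 0.05 threshold, while a 1,000-character prompt does not.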
tests:
  - description: Verify tech terms are included
    vars:
      expectedTerms:
        - API
        - SQL
        - JavaScript
    assert:
      - type: javascript
        value: |
          // Multi-line assertions need an explicit return
          const terms = context.vars.expectedTerms;
          return terms.every(term => output.includes(term));
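
A plain pass/fail result does not say which term was missing. The TypeScript sketch below performs the same substring check as the assertion above but also reports the missing terms, which can make failures easier to debug. `checkExpectedTerms` and `TermCheck` are hypothetical names, and the check assumes the provider output is available as a string.

```typescript
interface TermCheck {
  pass: boolean;
  missing: string[];
}

// Hypothetical helper: case-sensitive substring check for each expected term.
// Returns which terms were not found so a failure message can name them.
function checkExpectedTerms(output: string, expectedTerms: string[]): TermCheck {
  const missing = expectedTerms.filter((term) => !output.includes(term));
  return { pass: missing.length === 0, missing };
}
```

For example, checking the output `"The API uses SQL"` against `["API", "SQL", "JavaScript"]` fails and lists `JavaScript` as the only missing term.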
tests:
  - description: Compare baseline vs custom pronunciation
    vars:
      baseline: '{{providers[0].output}}'
      custom: '{{providers[1].output}}'
    assert:
      - type: javascript
        value: |
          // Both should succeed
          return !context.vars.baseline.includes('error') &&
            !context.vars.custom.includes('error');
tests:
  - description: Ensure advanced features don't slow generation
    assert:
      - type: latency
        threshold: 8000 # 8 seconds max
config:
  voiceDesign:
    description: Clear, professional voice for technical content
    gender: female
    age: middle_aged
  pronunciationRules:
    - word: API
      pronunciation: A-P-I
    - word: REST
      pronunciation: rest
    - word: GraphQL
      pronunciation: graph-Q-L
    - word: WebSocket
      pronunciation: web-socket
    - word: JSON
      pronunciation: jay-sawn
    - word: YAML
      pronunciation: yam-mel
config:
  voiceId: your-brand-voice-id
  voiceRemix:
    style: professional
    pacing: normal
  pronunciationRules:
    - word: YourProduct
      pronunciation: your-product
    - word: YourCompany
      pronunciation: your-company
# English with British accent
providers:
  - id: elevenlabs:tts:en-gb
    config:
      voiceDesign:
        description: British English speaker
        accent: british
        accentStrength: 1.5
  # English with American accent
  - id: elevenlabs:tts:en-us
    config:
      voiceDesign:
        description: American English speaker
        accent: american
        accentStrength: 1.0
# Morning news (Energetic)
providers:
  - id: elevenlabs:tts:morning
    config:
      voiceId: news-anchor-voice
      voiceRemix:
        style: energetic
        pacing: fast
  # Evening news (Calm)
  - id: elevenlabs:tts:evening
    config:
      voiceId: news-anchor-voice
      voiceRemix:
        style: calm
        pacing: normal
Error: Voice design failed
Solutions:
Warning: Pronunciation dictionary not found
Solutions:
pronunciationDictionaryId and pronunciationRules

Issue: Voice sounds the same after remix
Solutions:
Increase promptStrength from medium to high or max

| Option | Type | Description |
|---|---|---|
| pronunciationRules | PronunciationRule[] | Array of pronunciation rules |
| pronunciationDictionaryId | string | Use existing dictionary by ID |
PronunciationRule:
{
  word: string; // Word to customize
  pronunciation: string; // Phonetic pronunciation
  phoneme?: string; // IPA/CMU phoneme (advanced)
  alphabet?: 'ipa' | 'cmu'; // Phonetic alphabet
}
voiceDesign:
{
  description: string; // Natural language description
  gender?: 'male' | 'female';
  age?: 'young' | 'middle_aged' | 'old';
  accent?: string; // e.g., 'british', 'american'
  accentStrength?: number; // 0-2, default 1.0
  sampleText?: string; // Optional sample for preview
}
voiceRemix:
{
  style?: string; // e.g., 'energetic', 'calm'
  pacing?: 'slow' | 'normal' | 'fast';
  gender?: 'male' | 'female';
  age?: 'young' | 'middle_aged' | 'old';
  accent?: string;
  promptStrength?: 'low' | 'medium' | 'high' | 'max';
}