examples/provider-elevenlabs/alignment/README.md
Generate time-aligned subtitles (SRT/VTT) from audio and transcripts using ElevenLabs forced alignment.
npx promptfoo@latest init --example provider-elevenlabs/alignment
cd provider-elevenlabs/alignment
export ELEVENLABS_API_KEY=your_api_key_here
npx promptfoo@latest eval
Forced alignment takes two inputs: an audio file and its transcript.
It returns precise timestamps showing when each word was spoken, formatted as subtitles.
{
  "alignment": [
    { "char": "T", "start": 0.0, "end": 0.1 },
    { "char": "h", "start": 0.1, "end": 0.15 }
  ],
  "characters": "That's one small step..."
}
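The character-level timings in the JSON output can be merged into word-level timings before building subtitles. The following is a minimal sketch, assuming the response has the `alignment` shape shown above (a list of `char`/`start`/`end` entries); the helper name `chars_to_words` is hypothetical and not part of the provider.

```python
from typing import Dict, List


def chars_to_words(alignment: List[Dict]) -> List[Dict]:
    """Merge per-character timings into per-word timings, splitting on whitespace.

    Assumes each entry looks like {"char": "T", "start": 0.0, "end": 0.1}.
    """
    words: List[Dict] = []
    current = None  # the word currently being built, or None between words
    for entry in alignment:
        ch = entry["char"]
        if ch.isspace():
            # Whitespace closes the current word, if there is one.
            if current is not None:
                words.append(current)
                current = None
            continue
        if current is None:
            current = {"text": ch, "start": entry["start"], "end": entry["end"]}
        else:
            current["text"] += ch
            current["end"] = entry["end"]
    if current is not None:
        words.append(current)
    return words


# Illustrative data only; the last two entries are made up for the example.
sample = [
    {"char": "T", "start": 0.0, "end": 0.1},
    {"char": "h", "start": 0.1, "end": 0.15},
    {"char": " ", "start": 0.15, "end": 0.2},
    {"char": "a", "start": 0.2, "end": 0.3},
]
print(chars_to_words(sample))
```

Each word takes the start time of its first character and the end time of its last, so silence between words is left out of the word spans.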
1
00:00:00,000 --> 00:00:02,500
That's one small step for man

2
00:00:02,500 --> 00:00:05,000
one giant leap for mankind
WEBVTT

1
00:00:00.000 --> 00:00:02.500
That's one small step for man

2
00:00:02.500 --> 00:00:05.000
one giant leap for mankind
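The SRT and VTT samples differ mainly in the millisecond separator (`,` for SRT, `.` for VTT) and the `WEBVTT` header. As an illustration of how such cues can be produced from timings in seconds, here is a small sketch; `format_timestamp` and `make_cue` are hypothetical helpers and do not reflect how the provider formats its output internally.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Format seconds as HH:MM:SS<sep>mmm (',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def make_cue(index: int, start: float, end: float, text: str, separator: str = ",") -> str:
    """Build one numbered cue: index line, timing line, then the caption text."""
    timing = f"{format_timestamp(start, separator)} --> {format_timestamp(end, separator)}"
    return f"{index}\n{timing}\n{text}\n"


# SRT-style cue (comma separator)
print(make_cue(1, 0.0, 2.5, "That's one small step for man"))
# VTT-style cue (dot separator); a real VTT file also starts with a WEBVTT header
print(make_cue(2, 2.5, 5.0, "one giant leap for mankind", separator="."))
```

Working in integer milliseconds avoids floating-point drift when splitting the value into hours, minutes, and seconds.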
providers:
  - id: elevenlabs:alignment:json
    label: Alignment (JSON)

tests:
  - vars:
      audioFile: path/to/audio.mp3
      transcript: 'Your transcript text here'
      format: json
providers:
  - id: elevenlabs:alignment:srt
    label: Alignment (SRT Subtitles)

tests:
  - vars:
      audioFile: path/to/audio.mp3
      transcript: 'Your transcript text here'
      format: srt
providers:
  - id: elevenlabs:alignment:vtt
    label: Alignment (VTT Subtitles)

tests:
  - vars:
      audioFile: path/to/audio.mp3
      transcript: 'Your transcript text here'
      format: vtt
tests:
  # Verify alignment succeeds
  - assert:
      - type: javascript
        value: output.includes('words') # JSON format
      - type: not-contains
        value: error

  # Verify SRT format
  - assert:
      - type: javascript
        value: output.includes('-->') && output.includes('small step')
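The `-->` check above only confirms that a cue separator is present. A stricter check could parse the timing lines and verify that each cue ends after it starts and that cues do not overlap. The following standalone sketch shows one way to do that for SRT text; the function names are hypothetical, and wiring it into an assertion is left as an assumption about your setup.

```python
import re

# Matches an SRT timing line such as "00:00:00,000 --> 00:00:02,500".
CUE_TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def _to_seconds(h: str, m: str, s: str, ms: str) -> float:
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def srt_timings(srt_text: str) -> list:
    """Return a list of (start, end) tuples in seconds for every cue found."""
    timings = []
    for match in CUE_TIMING.finditer(srt_text):
        groups = match.groups()
        timings.append((_to_seconds(*groups[:4]), _to_seconds(*groups[4:])))
    return timings


def is_well_ordered(srt_text: str) -> bool:
    """True if at least one cue exists, each cue has end > start, and cues do not overlap."""
    timings = srt_timings(srt_text)
    if not timings:
        return False
    previous_end = 0.0
    for start, end in timings:
        if start < previous_end or end <= start:
            return False
        previous_end = end
    return True


sample_srt = (
    "1\n00:00:00,000 --> 00:00:02,500\nThat's one small step for man\n\n"
    "2\n00:00:02,500 --> 00:00:05,000\none giant leap for mankind\n"
)
print(is_well_ordered(sample_srt))
```

Cues that touch (one ends exactly where the next begins, as in the sample) are treated as valid; only a cue that starts before the previous one ends is rejected.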
Forced alignment pricing is based on audio duration.
The provider automatically tracks costs in evaluation results.