examples/provider-elevenlabs/agents/README.md
You can run this example with:
npx promptfoo@latest init --example provider-elevenlabs/agents
cd provider-elevenlabs/agents
Test and evaluate ElevenLabs voice AI agents with multi-turn conversations.
Set your ElevenLabs API key:
export ELEVENLABS_API_KEY=your_api_key_here
Then run the evaluation:
npx promptfoo@latest eval -c ./promptfooconfig.yaml
To view the results in the UI, run the evaluation and then open the viewer:
npx promptfoo@latest eval -c ./promptfooconfig.yaml
npx promptfoo@latest view
This example supports multiple input formats: a single message, a plain-text transcript, and a structured JSON conversation.
prompts:
  - 'Hello, I need help with my order'
prompts:
  - |
    User: Hi, what's the weather like?
    Agent: I'd be happy to help! Where are you located?
    User: I'm in San Francisco
prompts:
  - |
    {
      "turns": [
        {"speaker": "user", "message": "Hello"},
        {"speaker": "agent", "message": "Hi! How can I help?"},
        {"speaker": "user", "message": "I need support"}
      ]
    }
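The plain-text transcript and the JSON `turns` format above carry the same information, so one can be generated from the other. The following is a minimal sketch of that conversion. `transcript_to_turns` is a hypothetical helper written for illustration, not part of promptfoo or the ElevenLabs SDK, and it assumes every turn starts with a `User:` or `Agent:` prefix.

```python
import json
import re

# Hypothetical helper (not a promptfoo API): matches lines such as
# "User: hello" or "Agent: hi there" and captures speaker and message.
TURN_PATTERN = re.compile(r"^(User|Agent):\s*(.*)$", re.IGNORECASE)


def transcript_to_turns(transcript: str) -> dict:
    """Convert a 'User:/Agent:' transcript into the {"turns": [...]} shape."""
    turns = []
    for raw_line in transcript.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        match = TURN_PATTERN.match(line)
        if match:
            turns.append(
                {"speaker": match.group(1).lower(), "message": match.group(2)}
            )
        elif turns:
            # Assumption: an unprefixed line continues the previous turn.
            turns[-1]["message"] += " " + line
        else:
            raise ValueError(f"Transcript must start with a speaker prefix: {line!r}")
    return {"turns": turns}


transcript = """
User: Hi, what's the weather like?
Agent: I'd be happy to help! Where are you located?
User: I'm in San Francisco
"""

conversation = transcript_to_turns(transcript)
print(json.dumps(conversation, indent=2))
```

The printed JSON can be pasted into a `prompts` entry in the structured format shown above.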
Customize the agent behavior:
config:
  agentConfig:
    name: Customer Support Agent
    prompt: You are a helpful, empathetic customer support agent...
    firstMessage: Hi! I'm here to help. What can I do for you today?
    language: en
    voiceId: 21m00Tcm4TlvDq8ikWAM
    llmModel: gpt-4o
    temperature: 0.7
    maxTokens: 500
Common criteria presets available:
- greeting - Professional greeting (weight: 0.8, threshold: 0.8)
- understanding - Accurate intent understanding (weight: 1.0, threshold: 0.9)
- accuracy - Correct information (weight: 1.0, threshold: 0.9)
- helpfulness - Helpful responses (weight: 0.9, threshold: 0.8)
- professionalism - Professional tone (weight: 0.7, threshold: 0.8)
- empathy - Empathetic responses (weight: 0.8, threshold: 0.7)
- efficiency - Concise responses (weight: 0.7, threshold: 0.7)
- resolution - Problem resolution (weight: 1.0, threshold: 0.8)
Configure the simulated user's behavior:
simulatedUser:
  prompt: Act as a customer who is frustrated but polite
  temperature: 0.8
  responseStyle: casual # concise | verbose | casual | formal
Example tools for agents:
- get_weather - Get current weather
- search_knowledge_base - Search documentation
- create_ticket - Create support ticket
- send_email - Send email notification
- get_order_status - Check order status
- schedule_callback - Schedule callback
- transfer_agent - Transfer to human agent
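This README does not say how the weight and threshold values in the criteria presets listed earlier are combined. As a rough illustration only, the sketch below assumes that a criterion passes when its score meets its threshold and that the overall score is the weight-averaged score across the criteria that were scored. The `Criterion` class, the `evaluate` function, and the sample scores are all hypothetical. The provider's actual scoring may differ.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    name: str
    weight: float
    threshold: float


# A few of the presets from the list above (values copied from this README).
PRESETS = [
    Criterion("greeting", weight=0.8, threshold=0.8),
    Criterion("accuracy", weight=1.0, threshold=0.9),
    Criterion("empathy", weight=0.8, threshold=0.7),
]


def evaluate(scores: dict, criteria: list) -> dict:
    """Assumed semantics: per-criterion pass/fail plus a weighted mean score."""
    passed = {}
    weighted_sum = 0.0
    total_weight = 0.0
    for criterion in criteria:
        if criterion.name not in scores:
            continue  # ignore criteria that were not scored
        score = scores[criterion.name]
        passed[criterion.name] = score >= criterion.threshold
        weighted_sum += criterion.weight * score
        total_weight += criterion.weight
    overall = weighted_sum / total_weight if total_weight else 0.0
    return {"passed": passed, "overall": overall}


# Made-up scores in the range 0..1, purely for illustration.
result = evaluate({"greeting": 0.9, "accuracy": 0.85, "empathy": 0.7}, PRESETS)
print(result)
```

Under these assumptions, a higher weight pulls the overall score toward that criterion, while the threshold only decides whether that individual criterion passes.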