examples/amazon-bedrock/models/README.md
You can run this example with:
npx promptfoo@latest init --example amazon-bedrock/models
cd amazon-bedrock/models
Set up your AWS credentials:
export AWS_ACCESS_KEY_ID="your_access_key"
export AWS_SECRET_ACCESS_KEY="your_secret_key"
See authentication docs for other auth methods, including SSO profiles.
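Before running an evaluation, it can help to confirm that some form of credentials is visible to the shell. The function below is a minimal sketch: `check_aws_credentials` is a hypothetical helper name (not part of promptfoo), and it assumes you authenticate either with the two key variables above or with the standard `AWS_PROFILE` variable. Other methods described in the authentication docs are not detected.

```shell
# Hypothetical preflight helper (not part of promptfoo).
# Returns 0 if static keys or a named profile are set in the environment.
check_aws_credentials() {
  if [ -n "${AWS_ACCESS_KEY_ID:-}" ] && [ -n "${AWS_SECRET_ACCESS_KEY:-}" ]; then
    echo "static keys"
    return 0
  fi
  if [ -n "${AWS_PROFILE:-}" ]; then
    echo "profile: ${AWS_PROFILE}"
    return 0
  fi
  echo "no credentials found in environment" >&2
  return 1
}
```

Calling `check_aws_credentials` in the same shell where you plan to run `promptfoo eval` shows which of the two mechanisms, if any, is currently set.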
Request model access in your AWS region:
Install required dependencies:
# For basic Bedrock models
npm install @aws-sdk/client-bedrock-runtime
# For Knowledge Base examples
npm install @aws-sdk/client-bedrock-agent-runtime
This directory contains several example configurations for different Bedrock models:
- promptfooconfig.claude.yaml - Claude 4.6 Opus, Claude 4.1 Opus, Claude 4 Opus/Sonnet, Claude Haiku 4.5
- promptfooconfig.openai.yaml - OpenAI GPT-OSS models (120B and 20B) with reasoning effort
- promptfooconfig.llama.yaml - Llama3
- promptfooconfig.mistral.yaml - Mistral
- promptfooconfig.nova.yaml - Amazon's Nova models
- promptfooconfig.nova.tool.yaml - Nova with tool usage examples
- promptfooconfig.nova.multimodal.yaml - Nova with multimodal capabilities
- promptfooconfig.titan-text.yaml - Titan text generation examples
- promptfooconfig.kb.yaml - Knowledge Base RAG example with citations and contextTransform
- promptfooconfig.inference-profiles.yaml - Comprehensive Application Inference Profiles example with multiple model types
- promptfooconfig.inference-profiles-simple.yaml - Simple production-ready inference profile setup for high availability
- promptfooconfig.yaml - Combined evaluation across multiple providers
- promptfooconfig.nova-sonic.yaml - Amazon Nova Sonic model for audio
- promptfooconfig.converse.yaml - Converse API with extended thinking (ultrathink)

The Converse API example (promptfooconfig.converse.yaml) demonstrates the unified Bedrock Converse API with extended thinking (ultrathink) support.
providers:
  - id: bedrock:converse:us.anthropic.claude-sonnet-4-6
    label: Claude Sonnet 4.6 with Thinking
    config:
      region: us-west-2
      maxTokens: 20000
      thinking:
        type: enabled
        budget_tokens: 16000
      showThinking: true
Run the Converse API example with:
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.converse.yaml
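The Converse configuration sets both `maxTokens` and a thinking `budget_tokens`, and the relationship between the two is easy to get wrong. The sketch below works through the arithmetic under a stated assumption: that thinking tokens are drawn from the same limit as the final answer, so the budget must be smaller than `maxTokens`. `answer_budget` is a hypothetical helper name used only for illustration.

```shell
# Hypothetical helper: prints how many tokens remain for the visible answer
# after reserving the thinking budget, or fails if the budget does not fit.
# Assumption: thinking tokens count against maxTokens.
answer_budget() {
  max_tokens="$1"
  budget_tokens="$2"
  if [ "$budget_tokens" -ge "$max_tokens" ]; then
    echo "budget_tokens must be smaller than maxTokens" >&2
    return 1
  fi
  echo $((max_tokens - budget_tokens))
}

answer_budget 20000 16000   # values from the Converse example; prints 4000
```

With the example values, about 4000 tokens would remain for the visible response if the model used its full thinking budget.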
The Knowledge Base example (promptfooconfig.kb.yaml) demonstrates how to use AWS Bedrock Knowledge Base for Retrieval Augmented Generation (RAG).
For this example, you'll need to:
providers:
  - id: bedrock:kb:us.anthropic.claude-sonnet-4-6
    config:
      region: 'us-east-2' # Change to your region
      knowledgeBaseId: 'YOUR_KNOWLEDGE_BASE_ID' # Replace with your KB ID
When running the Knowledge Base example, you'll see:
- contextTransform feature extracting context from citations for evaluation

The example includes questions about promptfoo configuration, providers, and evaluation techniques that work well with the embedded promptfoo documentation.
Note: You'll need to update the knowledgeBaseId with your actual Knowledge Base ID and ensure the Knowledge Base is configured to work with the selected Claude model.
For detailed Knowledge Base setup instructions, see the AWS Bedrock Knowledge Base Documentation.
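Because the example ships with a placeholder Knowledge Base ID, a quick check before running can save a failed evaluation. The following is a minimal sketch; `has_placeholder_kb_id` is a hypothetical helper, and the file path in the usage comment assumes you are in the example directory.

```shell
# Hypothetical helper: succeeds (exit 0) if the given config file
# still contains the placeholder Knowledge Base ID.
has_placeholder_kb_id() {
  grep -q "YOUR_KNOWLEDGE_BASE_ID" "$1"
}

# Usage (path is an assumption about your working directory):
# if has_placeholder_kb_id promptfooconfig.kb.yaml; then
#   echo "update knowledgeBaseId before running" >&2
# fi
```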
The Application Inference Profiles example (promptfooconfig.inference-profiles.yaml) demonstrates how to use AWS Bedrock's inference profiles for multi-region failover and cost optimization.
When using inference profiles, you must specify the inferenceModelType parameter:
providers:
  - id: bedrock:arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile
    config:
      inferenceModelType: 'claude' # Required!
      region: 'us-east-1'
      max_tokens: 1024
Supported inferenceModelType values:
- claude - Anthropic Claude models
- nova - Amazon Nova models
- llama - Defaults to Llama 4
- llama2, llama3, llama3.1, llama3.2, llama3.3, llama4 - Specific Llama versions
- mistral - Mistral models
- cohere - Cohere models
- ai21 - AI21 models
- titan - Amazon Titan models
- deepseek - DeepSeek models (with thinking capability)
- openai - OpenAI GPT-OSS models

We provide two inference profile examples:
Comprehensive Example (promptfooconfig.inference-profiles.yaml):
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.inference-profiles.yaml
This includes:
Simple Production Example (promptfooconfig.inference-profiles-simple.yaml):
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.inference-profiles-simple.yaml
This demonstrates:
Note: Replace the example ARNs with your actual application inference profile ARNs. To create an inference profile, visit the AWS Bedrock console and navigate to the "Application inference profiles" section.
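When swapping in your own ARNs, a shape check can catch copy-paste mistakes early. The sketch below is an assumption-heavy illustration: `is_app_inference_profile_arn` is a hypothetical helper, and its pattern is derived only from the example ARN shown earlier (the `aws` partition, a 12-digit account ID, and a simple profile name). ARNs in other partitions or with other ID formats may be valid but would be rejected here.

```shell
# Hypothetical helper: checks that a string looks like the example
# application inference profile ARN. An optional leading "bedrock:"
# provider prefix is stripped first.
is_app_inference_profile_arn() {
  arn="${1#bedrock:}"
  printf '%s\n' "$arn" | grep -Eq \
    '^arn:aws:bedrock:[a-z0-9-]+:[0-9]{12}:application-inference-profile/[[:alnum:]_-]+$'
}
```

For instance, passing the provider ID from the example configuration succeeds, while a plain model ID such as `bedrock:us.anthropic.claude-sonnet-4-6` does not match.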
The OpenAI example (promptfooconfig.openai.yaml) demonstrates OpenAI's GPT-OSS models available through AWS Bedrock:
- Reasoning effort with low, medium, or high settings
- Token limit set with max_completion_tokens

Run the OpenAI example with:
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.openai.yaml
The Converse API supports additional stop reason handling:
- malformed_model_output: Model produced invalid output
- malformed_tool_use: Model produced a malformed tool use request

These are returned as errors in the response with metadata.isModelError: true.
Nova Sonic now supports configurable timeouts:
providers:
  - id: bedrock:nova-sonic:amazon.nova-sonic-v1:0
    config:
      region: us-east-1
      sessionTimeout: 300000 # 5 minutes (default)
      requestTimeout: 120000 # 2 minutes
Error responses include categorized error types in metadata.errorType:
- connection: Network/AWS connectivity issues
- timeout: Request or session timeout
- api: Authentication/authorization errors
- parsing: Response parsing failures
- session: Bidirectional stream session errors

Run the evaluation:
promptfoo eval -c [path/to/config.yaml]
View the results:
promptfoo view
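To evaluate several of the configurations in this directory in one go, a small loop around `promptfoo eval -c` may be convenient. This is a sketch, not part of the example: `run_all_configs` and the `DRY_RUN` variable are hypothetical names, and some configurations (for example the Knowledge Base and inference profile ones) will fail until their placeholders are replaced.

```shell
# Hypothetical helper: evaluates each promptfooconfig*.yaml in a directory.
# Set DRY_RUN=1 to print the commands instead of running them.
run_all_configs() {
  dir="${1:-.}"
  for cfg in "$dir"/promptfooconfig*.yaml; do
    [ -e "$cfg" ] || continue   # skip when the glob matches nothing
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "promptfoo eval -c $cfg"
    else
      # Keep going if one configuration fails, but report it.
      promptfoo eval -c "$cfg" || echo "failed: $cfg" >&2
    fi
  done
}
```

Running `DRY_RUN=1 run_all_configs examples/amazon-bedrock/models` first lists the commands that would be executed, which is a cheap way to confirm the glob picks up the files you expect.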