examples/amazon-bedrock/models/README.md
You can run this example with:

```sh
npx promptfoo@latest init --example amazon-bedrock/models
cd amazon-bedrock/models
```
Set up your AWS credentials:

```sh
export AWS_ACCESS_KEY_ID="your_access_key"
export AWS_SECRET_ACCESS_KEY="your_secret_key"
```
See authentication docs for other auth methods, including SSO profiles.
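If you authenticate with a named profile (for example, one configured for SSO), you can reference it from the provider config instead of exporting static keys. A minimal sketch, assuming the provider accepts a `profile` option (check the authentication docs for the exact field):

```yaml
# Sketch: authenticate via a named AWS profile rather than access keys.
# The `profile` option is an assumption; adjust to your setup.
providers:
  - id: bedrock:converse:us.anthropic.claude-sonnet-4-6
    config:
      region: us-east-1
      profile: my-sso-profile # a profile set up with `aws configure sso`
```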
Request access to the models you want to evaluate in your AWS region (via the Amazon Bedrock console).
Install required dependencies:

```sh
# For basic Bedrock models
npm install @aws-sdk/client-bedrock-runtime

# For Knowledge Base examples
npm install @aws-sdk/client-bedrock-agent-runtime
```
This directory contains several example configurations for different Bedrock models:
- `promptfooconfig.claude.yaml` - Claude 4.6 Opus, Claude 4.1 Opus, Claude 4 Opus/Sonnet, Claude Haiku 4.5
- `promptfooconfig.openai.yaml` - OpenAI GPT-OSS models (120B and 20B) with reasoning effort
- `promptfooconfig.llama.yaml` - Llama3
- `promptfooconfig.mistral.yaml` - Mistral
- `promptfooconfig.nova.yaml` - Amazon's Nova models
- `promptfooconfig.nova.tool.yaml` - Nova with tool usage examples
- `promptfooconfig.nova.multimodal.yaml` - Nova with multimodal capabilities
- `promptfooconfig.titan-text.yaml` - Titan text generation examples
- `promptfooconfig.kb.yaml` - Knowledge Base RAG example with citations and contextTransform
- `promptfooconfig.inference-profiles.yaml` - Comprehensive Application Inference Profiles example with multiple model types
- `promptfooconfig.inference-profiles-simple.yaml` - Simple production-ready inference profile setup for high availability
- `promptfooconfig.yaml` - Combined evaluation across multiple providers
- `promptfooconfig.nova-sonic.yaml` - Amazon Nova Sonic model for audio
- `promptfooconfig.converse.yaml` - Converse API with extended thinking (ultrathink)
- `promptfooconfig.converse-mcp.yaml` - Converse API with Model Context Protocol (MCP) tools

The Converse API example (`promptfooconfig.converse.yaml`) demonstrates the unified Bedrock Converse API with extended thinking (ultrathink) support.
The `showThinking` option controls whether the model's thinking output is included in the response:

```yaml
providers:
  - id: bedrock:converse:us.anthropic.claude-sonnet-4-6
    label: Claude Sonnet 4.6 with Thinking
    config:
      region: us-west-2
      maxTokens: 20000
      thinking:
        type: enabled
        budget_tokens: 16000
      showThinking: true
```
Run the Converse API example with:

```sh
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.converse.yaml
```
The Converse MCP example (promptfooconfig.converse-mcp.yaml) demonstrates how to attach Model Context Protocol (MCP) servers to a Bedrock Converse provider. MCP tools are discovered from the configured server, converted to Bedrock Converse tool definitions, and executed when the model requests a tool call.
```yaml
providers:
  - id: bedrock:converse:us.anthropic.claude-sonnet-4-6
    label: Claude Sonnet 4.6 with MCP
    config:
      region: us-east-1
      maxTokens: 1024
      temperature: 0
      mcp:
        enabled: true
        servers:
          - name: deepwiki
            url: https://mcp.deepwiki.com/mcp
        tools:
          - ask_question
      toolChoice: auto
```
Run the Converse MCP example with:

```sh
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.converse-mcp.yaml
```
Replace the `servers` entry with a local `command`/`args`, a `path`, or another remote `url` to use your own MCP server.
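For example, a local server entry might look like the sketch below. The server name, command, and arguments are placeholders (the filesystem server package is only an illustration); substitute your own MCP server:

```yaml
# Sketch: a local MCP server launched via command/args instead of a remote url.
# Nest this under the provider's config, as in the example above.
# The package and arguments below are illustrative placeholders.
mcp:
  enabled: true
  servers:
    - name: local-filesystem
      command: npx
      args: ['-y', '@modelcontextprotocol/server-filesystem', '/tmp']
```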
Note: When the model emits `tool_use`, the provider executes the requested MCP tool and returns the raw tool result as the eval output. There is no follow-up Converse turn that feeds the tool result back to the model for a synthesized answer, so the assertions in this example match substrings present in the MCP server's response. If you need a model-summarized answer, wrap the provider in an agent harness or run a second eval over the captured tool output.
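To illustrate that constraint, a test along these lines would assert on substrings of the raw tool result rather than on a synthesized answer. The question and expected value are hypothetical placeholders, not the example's actual tests:

```yaml
# Sketch: assertions run against the raw MCP tool result, so match substrings of it.
# The question and expected value are hypothetical placeholders.
tests:
  - vars:
      question: What is the promptfoo repository about?
    assert:
      - type: contains
        value: promptfoo
```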
The Knowledge Base example (promptfooconfig.kb.yaml) demonstrates how to use AWS Bedrock Knowledge Base for Retrieval Augmented Generation (RAG).
For this example, you'll need an existing Knowledge Base in your AWS account and its ID in the provider config:
```yaml
providers:
  - id: bedrock:kb:us.anthropic.claude-sonnet-4-6
    config:
      region: 'us-east-2' # Change to your region
      knowledgeBaseId: 'YOUR_KNOWLEDGE_BASE_ID' # Replace with your KB ID
```
When running the Knowledge Base example, you'll see the `contextTransform` feature extracting context from citations for evaluation. The example includes questions about promptfoo configuration, providers, and evaluation techniques that work well with the embedded promptfoo documentation.
Note: You'll need to update `knowledgeBaseId` with your actual Knowledge Base ID and ensure the Knowledge Base is configured to work with the selected Claude model.
For detailed Knowledge Base setup instructions, see the AWS Bedrock Knowledge Base Documentation.
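To sketch how `contextTransform` might feed retrieved passages into a context-based assertion: the query, threshold, and expression below are assumptions (the exact citation structure depends on the Knowledge Base response), so treat this as a starting point and compare against `promptfooconfig.kb.yaml`:

```yaml
# Sketch: pull retrieved text out of the KB citations and hand it to a context assertion.
# The expression assumes citations are exposed on the response metadata;
# adjust it to the actual shape of your Knowledge Base response.
tests:
  - vars:
      query: How do I configure providers in promptfoo?
    assert:
      - type: context-faithfulness
        threshold: 0.7
        contextTransform: 'metadata.citations.flatMap(c => c.retrievedReferences.map(r => r.content.text)).join("\n")'
```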
The Application Inference Profiles example (promptfooconfig.inference-profiles.yaml) demonstrates how to use AWS Bedrock's inference profiles for multi-region failover and cost optimization.
When using inference profiles, you must specify the `inferenceModelType` parameter:
```yaml
providers:
  - id: bedrock:arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/my-profile
    config:
      inferenceModelType: 'claude' # Required!
      region: 'us-east-1'
      max_tokens: 1024
```
Supported values for `inferenceModelType`:

- `claude` - Anthropic Claude models
- `nova` - Amazon Nova models
- `llama` - Defaults to Llama 4
- `llama2`, `llama3`, `llama3.1`, `llama3.2`, `llama3.3`, `llama4` - Specific Llama versions
- `mistral` - Mistral models
- `cohere` - Cohere models
- `ai21` - AI21 models
- `titan` - Amazon Titan models
- `deepseek` - DeepSeek models (with thinking capability)
- `openai` - OpenAI GPT-OSS models

We provide two inference profile examples:
Comprehensive Example (`promptfooconfig.inference-profiles.yaml`):

```sh
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.inference-profiles.yaml
```
This includes:
Simple Production Example (`promptfooconfig.inference-profiles-simple.yaml`):

```sh
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.inference-profiles-simple.yaml
```
This demonstrates:
Note: Replace the example ARNs with your actual application inference profile ARNs. To create an inference profile, visit the AWS Bedrock console and navigate to the "Application inference profiles" section.
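As a rough sketch of how several profiles could be compared in one eval (the ARNs and profile names below are placeholders):

```yaml
# Sketch: two application inference profiles compared side by side.
# The ARNs are placeholders; set inferenceModelType to match the underlying model family.
providers:
  - id: bedrock:arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/claude-profile
    config:
      inferenceModelType: 'claude'
      region: 'us-east-1'
  - id: bedrock:arn:aws:bedrock:us-east-1:123456789012:application-inference-profile/nova-profile
    config:
      inferenceModelType: 'nova'
      region: 'us-east-1'
```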
The OpenAI example (`promptfooconfig.openai.yaml`) demonstrates OpenAI's GPT-OSS models available through AWS Bedrock:

- Reasoning effort with `low`, `medium`, or `high` settings
- `max_completion_tokens` for limiting output length

Run the OpenAI example with:

```sh
promptfoo eval -c examples/amazon-bedrock/models/promptfooconfig.openai.yaml
```
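A provider entry for these models might look roughly like the sketch below. The model ID and option names are assumptions; see `promptfooconfig.openai.yaml` for the exact configuration:

```yaml
# Sketch: GPT-OSS on Bedrock with reasoning effort.
# The model ID and option names are assumptions; check the example config for the real values.
providers:
  - id: bedrock:openai.gpt-oss-120b-1:0
    config:
      region: us-west-2
      reasoning_effort: 'medium' # low, medium, or high
      max_completion_tokens: 2048
```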
The Converse API supports additional stop reason handling:
- `malformed_model_output`: Model produced invalid output
- `malformed_tool_use`: Model produced a malformed tool use request

These are returned as errors in the response with `metadata.isModelError: true`.
Nova Sonic now supports configurable timeouts:
```yaml
providers:
  - id: bedrock:nova-sonic:amazon.nova-sonic-v1:0
    config:
      region: us-east-1
      sessionTimeout: 300000 # 5 minutes (default)
      requestTimeout: 120000 # 2 minutes
```
Error responses include categorized error types in `metadata.errorType`:

- `connection`: Network/AWS connectivity issues
- `timeout`: Request or session timeout
- `api`: Authentication/authorization errors
- `parsing`: Response parsing failures
- `session`: Bidirectional stream session errors

Run the evaluation:
```sh
promptfoo eval -c [path/to/config.yaml]
```
View the results:

```sh
promptfoo view
```