# openai-deep-research

You can run this example with:

```bash
npx promptfoo@latest init --example openai-deep-research
cd openai-deep-research
```
This example demonstrates OpenAI's deep research models with web search capabilities via the Responses API.
- ⚠️ **Response times**: Deep research models can take 2–10 minutes to complete research tasks as they perform extensive web searches and reasoning.
- ⚠️ **Token usage**: These models use significant tokens for internal reasoning. Always set a high `max_output_tokens` (50,000+) to avoid incomplete responses.
- ⚠️ **Access**: Deep research models may require special access from OpenAI. Check your API access if you encounter persistent 429 errors.
```bash
export OPENAI_API_KEY=your-key-here

# Set a 10-minute timeout for deep research tasks
export PROMPTFOO_EVAL_TIMEOUT_MS=600000

promptfoo eval
```
For local development:

```bash
PROMPTFOO_EVAL_TIMEOUT_MS=600000 npm run local -- eval -c examples/openai-deep-research/promptfooconfig.yaml
```
This example uses the `o4-mini-deep-research` model with web search tools. The model automatically decides when to use web search to provide comprehensive, up-to-date answers.
```yaml
providers:
  - id: openai:responses:o4-mini-deep-research
    config:
      max_output_tokens: 50000 # Required for complete research responses
      tools:
        - type: web_search_preview # Required for deep research models
      # Optional parameters:
      # max_tool_calls: 50 # Control the number of searches (default: unlimited)
      # background: true # Use background mode for long-running tasks
      # store: true # Store the conversation for 30 days
```
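Putting the provider together with a prompt and an assertion, a complete `promptfooconfig.yaml` might look like the sketch below. The prompt wording, the `topic` variable, and the `contains` assertion are illustrative placeholders, not part of this example's actual config:

```yaml
# Illustrative end-to-end config; the prompt text, topic variable,
# and assertion are hypothetical examples.
description: Deep research evaluation
prompts:
  - 'Research recent developments in {{topic}} and cite your sources.'
providers:
  - id: openai:responses:o4-mini-deep-research
    config:
      max_output_tokens: 50000
      tools:
        - type: web_search_preview
tests:
  - vars:
      topic: battery storage technology
    assert:
      - type: contains
        value: battery
```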
Available models:

- `o3-deep-research` - Most powerful deep research model ($10/1M input, $40/1M output)
- `o3-deep-research-2025-06-26` - Snapshot version
- `o4-mini-deep-research` - Faster, more affordable ($2/1M input, $8/1M output)
- `o4-mini-deep-research-2025-06-26` - Snapshot version

For production use, run deep research tasks in background mode to avoid timeouts:
```yaml
providers:
  - id: openai:responses:o4-mini-deep-research
    config:
      background: true
      webhook_url: https://your-api.com/webhook # Optional: get notified when complete
```
Deep research models can analyze data using code:
```yaml
providers:
  - id: openai:responses:o4-mini-deep-research
    config:
      tools:
        - type: web_search_preview
        - type: code_interpreter
          container:
            type: auto
```
Connect to private data sources using MCP servers:
```yaml
providers:
  - id: openai:responses:o4-mini-deep-research
    config:
      tools:
        - type: web_search_preview
        - type: mcp
          server_label: mycompany_mcp
          server_url: https://mycompany.com/mcp
          require_approval: never # Required for deep research
```
For better results, consider preprocessing user queries into more detailed research requests before sending them to the model. See the OpenAI Deep Research Guide for detailed examples.
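One lightweight way to preprocess queries is to wrap the raw user input in a prompt template that asks for structure and sources. The template below is an illustrative sketch (the `{{query}}` variable name and wording are assumptions, not taken from the guide):

```yaml
# Hypothetical prompt template that expands a terse user query
# ({{query}}) into an explicit, structured research request.
prompts:
  - |
    Research the following topic and produce a structured report.
    Topic: {{query}}
    Requirements: summarize key findings, note recent developments,
    and cite the web sources you used.
```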
Deep research responses include inline citations to the web sources used.
Troubleshooting:

- Increase `PROMPTFOO_EVAL_TIMEOUT_MS` if evaluations time out
- If responses are incomplete, raise `max_output_tokens` to 50,000 or higher
- If results are empty, check that the `web_search_preview` tool is configured and that `max_output_tokens: 50000` or higher is set