examples/anthropic/opus-4-6-coding/README.md
This example evaluates Claude Opus 4.6's coding and reasoning capabilities on complex software engineering tasks that involve ambiguous requirements and tradeoff analysis.
You can run this example with:
npx promptfoo@latest init --example anthropic/opus-4-6-coding
cd anthropic/opus-4-6-coding
Anthropic positions Claude Opus 4.6 as its leading model for coding, agents, and computer use. This example evaluates:
# Set your API key
export ANTHROPIC_API_KEY=your_api_key_here
# Run the evaluation
npx promptfoo@latest eval
# View results
npx promptfoo@latest view
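The evaluation calls the Anthropic API, so a missing key would otherwise surface only once the run has started. A minimal sketch of a fail-fast check is shown below. `require_api_key` is a hypothetical helper name chosen for illustration and is not part of promptfoo.

```shell
# Hypothetical helper (not part of promptfoo): fail fast when the key is missing.
require_api_key() {
  # ${VAR:-} expands to an empty string when the variable is unset,
  # so this also works under `set -u`.
  if [ -z "${ANTHROPIC_API_KEY:-}" ]; then
    echo "ANTHROPIC_API_KEY is not set" >&2
    return 1
  fi
}

# Usage: require_api_key && npx promptfoo@latest eval
```

The helper only checks that the variable is non-empty. It does not validate the key against the API.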
The evaluation tests Opus 4.6's ability to: