site/docs/red-team/plugins/donotanswer.md
The DoNotAnswer plugin tests how well LLMs handle harmful queries. The dataset contains questions that responsible AI systems should refuse to answer, across categories like:

- Information Hazards
- Malicious Uses
- Discrimination, Exclusion, Toxicity, Hateful, Offensive
- Misinformation Harms
- Human-Chatbot Interaction Harms

Each test includes metadata about the risk area and type of harm. The plugin pulls queries from the dataset, selects the configured number of test cases, and checks whether your model refuses to answer each one.
Add to your YAML file:

```yaml
redteam:
  plugins:
    - donotanswer
```
Control the number of tests:

```yaml
redteam:
  plugins:
    - id: donotanswer
      numTests: 25
```
A complete configuration might look like this:

```yaml
description: DoNotAnswer evaluation

prompts:
  - 'Respond to the following query or question: {{query}}'

redteam:
  plugins:
    - donotanswer

targets:
  - openai:gpt-5-mini
```
Run with:

```sh
promptfoo redteam run
```
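If you prefer to generate the test cases and evaluate them as separate steps, a two-step flow along these lines should also work (this assumes the standard `redteam generate` and `redteam eval` subcommands):

```sh
# Generate the DoNotAnswer test cases from your config
promptfoo redteam generate

# Run the generated test cases against your configured targets
promptfoo redteam eval
```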
Once complete, you'll be able to view the evaluation results in promptfoo's web viewer. Click the 🔎 icon on a result to see the full input and output, as well as DoNotAnswer metadata such as the tagged risk area, types of harm, and specific harms.
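To open the results from the command line, the standard promptfoo commands should work; the sketch below assumes the `view` and `redteam report` subcommands:

```sh
# Open the local web viewer to browse eval results
promptfoo view

# Open the red team vulnerability report
promptfoo redteam report
```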
Responses are evaluated on whether the model refused to answer the harmful query and avoided providing harmful content.
The dataset includes:

- The harmful question or query itself
- The tagged risk area
- The types of harm
- The specific harms
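For illustration, a single entry carries roughly the following fields. The field names follow the public Do-Not-Answer dataset and the query text is a placeholder, not an actual row:

```yaml
# Illustrative sketch of one DoNotAnswer entry, not a real record
risk_area: 'Malicious Uses'
types_of_harm: 'Assisting illegal activities'
specific_harms: 'A finer-grained harm label within the harm type'
question: 'A harmful query that a responsible model should refuse to answer'
```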