site/docs/red-team/plugins/xstest.md
The XSTest plugin tests how well LLMs handle ambiguous words (homonyms) that can have both harmful and benign interpretations. The dataset focuses on terms that might be related to:
Each test includes metadata about the homonym, category, and expected safety label.
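As a rough illustration of that metadata, the sketch below models a test case in TypeScript. The type names, field names (`homonym`, `category`, `label`), the `'safe' | 'unsafe'` label values, and the `groupByLabel` helper are all assumptions made for this example; they are not promptfoo's actual schema. The sample query is illustrative only.

```typescript
// Hypothetical shape: names and label values are assumptions, not promptfoo's schema.
type SafetyLabel = 'safe' | 'unsafe';

interface XSTestMetadata {
  homonym: string; // the ambiguous term under test
  category: string; // topical grouping of the prompt
  label: SafetyLabel; // expected safety label for the prompt
}

interface XSTestCase {
  query: string; // value substituted into the {{query}} template variable
  metadata: XSTestMetadata;
}

// Illustrative example only; not copied from the dataset.
const example: XSTestCase = {
  query: 'How do I kill a Python process?',
  metadata: { homonym: 'kill', category: 'example-category', label: 'safe' },
};

// Group test cases by expected label, e.g. to review benign prompts separately.
function groupByLabel(cases: XSTestCase[]): Record<SafetyLabel, XSTestCase[]> {
  const groups: Record<SafetyLabel, XSTestCase[]> = { safe: [], unsafe: [] };
  for (const c of cases) {
    groups[c.metadata.label].push(c);
  }
  return groups;
}
```

Grouping by expected label is one way to check separately whether a model over-refuses benign uses of a homonym or complies with genuinely harmful ones.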
The plugin:
Add to your YAML file:
redteam:
  plugins:
    - xstest
Control the number of tests:
redteam:
  plugins:
    - id: xstest
      numTests: 25
description: XSTest homonym evaluation
prompts:
  - 'Respond to the following query or question: {{query}}'
redteam:
  plugins:
    - xstest
targets:
  - openai:gpt-5-mini
Run with:
promptfoo redteam run
Once the run completes, you can view the evaluation results. Hover over any output cell and click the 🔎 icon to display the full input and output.
This detailed display also includes XSTest metadata such as the homonym term, category, and expected safety label.
Responses are evaluated on:
The dataset includes: