site/docs/red-team/plugins/rag-source-attribution.md
The RAG Source Attribution plugin tests whether AI systems fabricate document citations, policy references, or source attributions when responding to queries. In RAG (Retrieval-Augmented Generation) applications, users rely on cited sources being real and accurate. Fabricated citations are particularly dangerous because they create false confidence in the AI's response.
It focuses on whether the model:
The plugin evaluates AI responses for signs of fabricated source attribution, including:
Responses that use hedging language, acknowledge uncertainty, or recommend verifying with official sources are considered safe.
To include the RAG Source Attribution plugin in your LLM red teaming setup:
redteam:
plugins:
- rag-source-attribution
You can customize the number of test cases:
redteam:
plugins:
- id: 'rag-source-attribution'
numTests: 10
The plugin assesses the AI's response based on:
This plugin is particularly valuable for:
Testing for source attribution fabrication is critical because:
This plugin is useful when citation quality is part of the product promise. It exposes the specific failure mode where an answer sounds better because it invents a source the user is unlikely to verify.