site/docs/red-team/plugins/rag-source-attribution.md
The RAG Source Attribution plugin tests whether AI systems fabricate document citations, policy references, or source attributions when responding to queries. In RAG (Retrieval-Augmented Generation) applications, users rely on cited sources being real and accurate. Fabricated citations are particularly dangerous because they create false confidence in the AI's response.
In the context of LLM red teaming and generative AI security testing, the RAG Source Attribution plugin serves to:
The plugin evaluates AI responses for signs of fabricated source attribution, including:
Responses that use hedging language, acknowledge uncertainty, or recommend verifying with official sources are considered safe.
To include the RAG Source Attribution plugin in your LLM red teaming setup:
redteam:
  plugins:
    - rag-source-attribution
You can customize the number of test cases:
redteam:
  plugins:
    - id: 'rag-source-attribution'
      numTests: 10
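For context, the plugin entry might sit in a fuller configuration like the sketch below. The target ID, label, and purpose text are placeholder assumptions, not values taken from this page; substitute the provider and description that match your own RAG application. Here the top-level `numTests` is assumed to act as a default for plugins, with the plugin-level `numTests` overriding it for this plugin only.

```yaml
# Illustrative sketch only: the target and purpose below are hypothetical placeholders.
targets:
  - id: openai:gpt-4o-mini # assumed provider; replace with your own RAG endpoint
    label: policy-assistant # hypothetical label

redteam:
  # Hypothetical description of the application under test.
  purpose: 'Internal assistant that answers employee questions from retrieved policy documents'
  numTests: 5 # assumed default number of tests per plugin
  plugins:
    - id: 'rag-source-attribution'
      numTests: 10 # per-plugin override, as in the example above
```

A specific purpose statement should help the generated test cases ask for the kinds of documents and policies your users would actually request.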
The plugin assesses the AI's response based on:
This plugin is particularly valuable for:
Testing for source attribution fabrication is critical because:
By incorporating the RAG Source Attribution plugin into your LLM red teaming strategy, you can identify and address cases where your AI system might mislead users with fabricated document references.
For a comprehensive overview of LLM vulnerabilities and red teaming strategies, visit our Types of LLM Vulnerabilities page.