# Context faithfulness
Checks if the LLM's response only makes claims that are supported by the provided context.

**Use when**: You need to ensure the LLM isn't adding information beyond what was retrieved.

**How it works**: Extracts factual claims from the response, then verifies each claim against the context. Score = supported claims / total claims.
**Example**:

- Context: "Paris is the capital of France."
- Response: "Paris, with 2.2 million residents, is France's capital."
- Score: 0.5 (capital claim supported ✓, population claim unsupported ✗)
```yaml
assert:
  - type: context-faithfulness
    threshold: 0.9 # Require 90% of claims to be supported
```
The assertion uses these inputs:

- `query` - User's question (in test vars)
- `context` - Reference text (in vars or via `contextTransform`)
- `threshold` - Minimum score from 0 to 1 (default: 0)

A complete test case:

```yaml
tests:
  - vars:
      query: 'What is the capital of France?'
      context: 'Paris is the capital and largest city of France.'
    assert:
      - type: context-faithfulness
        threshold: 0.9
```
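For orientation, here is how that test case might sit in a complete config. This is a minimal sketch: the prompt template and provider name are illustrative, not prescribed.

```yaml
# Minimal sketch of a full config (prompt and provider are illustrative)
prompts:
  - |
    Answer using only the context below.

    Context: {{context}}

    Question: {{query}}
providers:
  - openai:gpt-4o-mini
tests:
  - vars:
      query: 'What is the capital of France?'
      context: 'Paris is the capital and largest city of France.'
    assert:
      - type: context-faithfulness
        threshold: 0.9
```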
Context can also be an array:
```yaml
tests:
  - vars:
      query: 'Tell me about France'
      context:
        - 'Paris is the capital and largest city of France.'
        - 'France is located in Western Europe.'
        - 'The country has a rich cultural heritage.'
    assert:
      - type: context-faithfulness
        threshold: 0.8
```
For RAG systems that return context with their response:
```yaml
# Provider returns { answer: "...", context: "..." }
assert:
  - type: context-faithfulness
    contextTransform: 'output.context' # Extract context field
    threshold: 0.9
```
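Because `contextTransform` is a JavaScript expression evaluated against the provider's `output`, it can also reach nested fields. A sketch, assuming a hypothetical response shape with a `citations` array:

```yaml
# Hypothetical provider response: { answer: "...", citations: [{ text: "..." }, ...] }
assert:
  - type: context-faithfulness
    contextTransform: 'output.citations.map(c => c.text).join("\n")'
    threshold: 0.9
```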
Override the default grader:
```yaml
assert:
  - type: context-faithfulness
    provider: gpt-5 # Use a different model for grading
    threshold: 0.9
```
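To change the grader for every model-graded assertion in a config rather than per-assertion, a grading provider can also be set under `defaultTest`. A sketch, with the model name again illustrative:

```yaml
defaultTest:
  options:
    provider: gpt-5 # Grading model for all model-graded assertions
```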
**See also**:

- `context-relevance` - Is the retrieved context relevant?
- `context-recall` - Does the context support the expected answer?