Answer Relevancy Scorer Example

This example demonstrates how to use Mastra's Answer Relevancy Scorer to evaluate the relevance of LLM-generated responses to given inputs.

Prerequisites

Clone the repository and navigate to the project directory:

git clone https://github.com/mastra-ai/mastra
cd mastra/examples/basics/scorers/answer-relevancy

The Answer Relevancy Scorer measures how well an LLM's response aligns with and addresses the given input question.

The example includes three scenarios:

High Relevancy: the response directly and completely answers the question
Partial Relevancy: the response addresses part of the question and adds some extra context
Low Relevancy: the response is completely unrelated to the question
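
As a rough illustration, the three scenarios can be written down as data in the input/output shapes this README describes for scorer.run(). The question and answer strings below are hypothetical placeholders, not the ones shipped with the example, and the Scenario type is introduced here only for the sketch.

```typescript
// Shapes follow the scorer.run() description in this README.
type ChatMessage = { role: "user" | "assistant" | "system"; content: string };
type ScorerOutput = { role: "assistant"; text: string };

// Hypothetical helper type, used only in this sketch.
interface Scenario {
  name: string;
  input: ChatMessage[];
  output: ScorerOutput;
}

// Hypothetical sample data; the real example's prompts and responses may differ.
const question = "What is the capital of France?";

const scenarios: Scenario[] = [
  {
    name: "High Relevancy",
    input: [{ role: "user", content: question }],
    output: { role: "assistant", text: "The capital of France is Paris." },
  },
  {
    name: "Partial Relevancy",
    input: [{ role: "user", content: question }],
    output: {
      role: "assistant",
      text: "France is a country in Western Europe known for its cuisine. Paris is one of its major cities.",
    },
  },
  {
    name: "Low Relevancy",
    input: [{ role: "user", content: question }],
    output: {
      role: "assistant",
      text: "Photosynthesis converts sunlight into chemical energy.",
    },
  },
];
```

Keeping the question fixed across scenarios makes it easier to see how the score changes with the response alone.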

For each scenario, the example will output:

The input query and generated response for each scenario
The scorer result with:
- Score (0-1, where 1 is perfectly relevant)
- Detailed reasoning for the score
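
A minimal sketch of producing that output is shown below. It assumes only what this README states about scorer.run(): it takes { input, output } and returns an object with score and reason. The names RelevancyScorerLike, formatReport, evaluateAndReport, and stubScorer are hypothetical, and the sketch further assumes that run() is asynchronous. The stub stands in for a real scorer so the snippet runs without an API key.

```typescript
type ChatMessage = { role: string; content: string };
type ScorerOutput = { role: string; text: string };
type ScorerResult = { score: number; reason: string };

// Hypothetical structural type: anything exposing the run() method described
// in this README. Assumes run() returns a Promise.
interface RelevancyScorerLike {
  run(args: { input: ChatMessage[]; output: ScorerOutput }): Promise<ScorerResult>;
}

// Builds the report text: query, response, score, and reasoning.
function formatReport(query: string, response: string, result: ScorerResult): string {
  return [
    `Input: ${query}`,
    `Response: ${response}`,
    `Score: ${result.score.toFixed(2)}`,
    `Reason: ${result.reason}`,
  ].join("\n");
}

// Wraps a query/response pair in the documented shapes, runs the scorer,
// and returns the formatted report.
async function evaluateAndReport(
  scorer: RelevancyScorerLike,
  query: string,
  response: string,
): Promise<string> {
  const result = await scorer.run({
    input: [{ role: "user", content: query }],
    output: { role: "assistant", text: response },
  });
  return formatReport(query, response, result);
}

// Hypothetical stub so the sketch runs offline; a real scorer would call an LLM.
const stubScorer: RelevancyScorerLike = {
  async run() {
    return { score: 1, reason: "Stubbed verdict for illustration." };
  },
};
```

Passing the scorer in as a parameter keeps the reporting logic independent of any particular model or provider.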

createAnswerRelevancyScorer: Function that creates the scorer instance
Scorer configuration options:
- model: The language model to use for evaluation (e.g., OpenAI GPT-4)
- options.uncertaintyWeight: Weight for uncertain verdicts (default: 0.3)
- options.scale: Scale factor for the final score (default: 1)
scorer.run(): Method to evaluate input/output pairs
- Takes { input, output } where:
  - input: Array of chat messages (e.g., [{ role: 'user', content: 'question' }])
  - output: Response object (e.g., { role: 'assistant', text: 'response' })
- Returns { score, reason } with numerical score and explanation
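
To make the two options more concrete, here is one plausible way uncertaintyWeight and scale could combine into a final score. This is a sketch under stated assumptions, not the scorer's confirmed implementation: it assumes each statement in the response receives a verdict of "yes", "unsure", or "no", that "unsure" verdicts count as uncertaintyWeight, and that the average is multiplied by scale. The Verdict type and relevancyScore function are hypothetical names.

```typescript
// Assumed verdict labels; the README only mentions "uncertain verdicts".
type Verdict = "yes" | "unsure" | "no";

interface ScoreOptions {
  uncertaintyWeight?: number; // default 0.3, as listed in the options above
  scale?: number; // default 1, as listed in the options above
}

// Hypothetical scoring sketch: "yes" counts 1, "unsure" counts uncertaintyWeight,
// "no" counts 0; the mean is then multiplied by scale.
function relevancyScore(verdicts: Verdict[], options: ScoreOptions = {}): number {
  const { uncertaintyWeight = 0.3, scale = 1 } = options;

  // Assumption: with no verdicts there is nothing relevant to credit.
  if (verdicts.length === 0) return 0;

  let total = 0;
  for (const verdict of verdicts) {
    if (verdict === "yes") total += 1;
    else if (verdict === "unsure") total += uncertaintyWeight;
  }
  return (total / verdicts.length) * scale;
}

// Example: one relevant and one uncertain statement with the default options.
const mixed = relevancyScore(["yes", "unsure"]);
```

Under these assumptions, raising uncertaintyWeight gives hedged or tangential statements more credit, while scale only stretches the output range (for example, 0-10 instead of 0-1).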