Starting from FastGPT v4.11.0, batch app evaluation is supported. By providing multiple QA pairs, the system automatically scores your app's responses, enabling quantitative assessment of app performance.
The system supports three evaluation metrics: answer accuracy, question relevance, and semantic accuracy. The current beta includes only answer accuracy; the remaining metrics will be added in future releases.
Navigate to the App Evaluation section under Workspace and click the "Create Task" button in the upper right corner.
On the task creation page, provide the following:
After selecting the target app, a button appears to download the CSV template. The template includes these fields:
Notes:
Upload the completed file and click "Start Evaluation" to create the task.
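If you have many QA pairs, you can fill in the template programmatically instead of by hand. The sketch below writes a QA-pair CSV with Python's standard `csv` module; the column names `question` and `expected_answer` are hypothetical placeholders — use the exact headers from the template you downloaded in the app.

```python
import csv

# Hypothetical column names for illustration only; replace them with the
# headers from the CSV template downloaded in the app.
FIELDS = ["question", "expected_answer"]

qa_pairs = [
    {"question": "What is FastGPT?",
     "expected_answer": "An open-source LLM application platform."},
    {"question": "Which version added batch evaluation?",
     "expected_answer": "v4.11.0"},
]

# Write the rows with a header line, UTF-8 encoded as most importers expect.
with open("evaluation.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(qa_pairs)
```

Generating the file this way makes it easy to regenerate the same evaluation set after each app iteration.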
The evaluation list shows all tasks with key information:
Use this to compare results across iterations as you improve your app.
Click "View Details" to open the detail page:
- **Task Overview**: The top section shows overall task information, including evaluation configuration and summary statistics.
- **Detailed Results**: The bottom section lists each QA pair with its score.
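If you export per-pair scores for your own analysis, aggregating them into summary statistics is straightforward. The sketch below uses a simple mean over scores in the range 0.0–1.0; this aggregation is an assumption for illustration, not necessarily how FastGPT computes its overview figures.

```python
def summarize(scores):
    """Aggregate per-QA-pair accuracy scores (0.0-1.0) into summary stats.

    Illustrative only: FastGPT's actual aggregation may differ.
    """
    if not scores:
        return {"count": 0, "mean": None, "min": None, "max": None}
    return {
        "count": len(scores),
        "mean": sum(scores) / len(scores),
        "min": min(scores),
        "max": max(scores),
    }

# Example: four QA pairs scored by the evaluator.
stats = summarize([0.9, 0.7, 1.0, 0.8])
```

Tracking these aggregates per task makes it easy to compare runs as you iterate on your app, as described above.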