Test Log: 09_evals

cookbook/09_evals/TEST_LOG.md


Tests not yet run. Run each file and update this log.
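One way to carry out the "run each file and update this log" step is a small runner script. The following is a minimal sketch using only the Python standard library. The helper names `run_examples` and `format_log`, the 300-second timeout, and the PASS/FAIL labels are assumptions for illustration and are not part of the cookbook.

```python
import subprocess
import sys
from pathlib import Path


def run_examples(directory: Path, timeout: float = 300.0) -> dict[str, str]:
    """Run every .py file under `directory`; map its relative path to PASS or FAIL.

    Hypothetical helper: a script counts as PASS when it exits with code 0
    before the timeout, and FAIL otherwise.
    """
    results: dict[str, str] = {}
    for script in sorted(directory.rglob("*.py")):
        try:
            completed = subprocess.run(
                [sys.executable, str(script)],
                capture_output=True,
                timeout=timeout,
            )
            status = "PASS" if completed.returncode == 0 else "FAIL"
        except subprocess.TimeoutExpired:
            status = "FAIL"
        results[str(script.relative_to(directory))] = status
    return results


def format_log(results: dict[str, str]) -> str:
    """Render results in the same 'name / Status:' layout this log uses."""
    entries = [f"{name}\n\nStatus: {status}\n" for name, status in results.items()]
    return "\n".join(entries)
```

Calling `print(format_log(run_examples(Path("cookbook/09_evals"))))` from the repository root would print one entry per script. Exit code alone is a coarse signal, so a PASS here still deserves a manual look at the script output before the log is updated.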

accuracy/*

Status: PENDING

Description: Accuracy evaluation examples for agents, teams, and evaluator configurations.


agent_as_judge/*

Status: PENDING

Description: Agent-as-judge examples covering scoring modes, hooks, and team cases.


performance/*

Status: PENDING

Description: Performance benchmark examples for runtime and memory impact.
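For readers unfamiliar with what "runtime and memory impact" means in practice, the sketch below measures both for an arbitrary callable using only the standard library. It is not the Agno evaluation API. The function name `measure` and the default of 10 iterations are assumptions chosen for illustration.

```python
import time
import tracemalloc
from typing import Callable


def measure(func: Callable[[], object], iterations: int = 10) -> tuple[float, int]:
    """Return (average seconds per call, peak traced bytes during one call).

    Hypothetical helper: timing and memory are measured in separate passes
    because tracemalloc slows execution and would distort the timing.
    """
    # Timing pass, without memory tracing.
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    average_seconds = (time.perf_counter() - start) / iterations

    # Memory pass: trace Python allocations made during a single call.
    tracemalloc.start()
    func()
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()

    return average_seconds, peak_bytes
```

Passing a zero-argument callable, for example `measure(lambda: build_agent())` where `build_agent` is a placeholder for whatever is being benchmarked, returns the two numbers to record alongside the status above. `tracemalloc` only sees allocations made through Python's allocator, so memory used by native extensions is not counted.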


performance/comparison/*

Status: PENDING

Description: Framework comparison benchmarks for agent instantiation.


reliability/*

Status: PENDING

Description: Reliability examples for expected tool-call behavior.