Test Log: accuracy

cookbook/09_evals/accuracy/TEST_LOG.md


Most tests have not been run yet. Run each file marked PENDING and update its status in this log.
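Updating the log by hand is easy to get wrong, so a small helper can do it. The following is only a sketch: it assumes the log keeps the plain layout used here (a file name on its own line, followed later by a `Status:` line). The function name `set_status` and the `FAIL` status are hypothetical and not part of the cookbook.

```python
# Hypothetical helper: rewrite the Status line that follows a given file name.
# Assumed statuses; only PENDING and PASS actually appear in this log.
VALID_STATUSES = {"PENDING", "PASS", "FAIL"}


def set_status(log_text: str, filename: str, status: str) -> str:
    """Return log_text with the Status line under `filename` set to `status`."""
    if status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    lines = log_text.splitlines()
    in_entry = False
    for i, line in enumerate(lines):
        stripped = line.strip()
        if stripped == filename:
            # Found the entry heading; the next Status line belongs to it.
            in_entry = True
        elif in_entry and stripped.startswith("Status:"):
            lines[i] = f"Status: {status}"
            return "\n".join(lines)
    raise KeyError(f"no Status line found for {filename}")


sample_log = "accuracy_basic.py\n\nStatus: PENDING\n\nDescription: ..."
updated_log = set_status(sample_log, "accuracy_basic.py", "PASS")
```

The helper returns new text rather than writing to disk, so the caller decides whether and where to save it.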

accuracy_basic.py

Status: PENDING

Description: Runs sync and async calculator accuracy evaluations.


accuracy_9_11_bigger_or_9_99.py

Status: PENDING

Description: Checks comparison accuracy for decimal values.
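For reference, the ground truth for a comparison like this can be computed exactly with Python's `decimal` module. This is a minimal sketch, not the evaluation itself. The values 9.11 and 9.99 are an assumption read off the file name, and the function name `larger` is hypothetical.

```python
from decimal import Decimal


def larger(a: str, b: str) -> str:
    """Return whichever decimal string denotes the larger value (b on a tie)."""
    return a if Decimal(a) > Decimal(b) else b


# Assumed inputs, taken from the file name rather than from the script itself.
expected_answer = larger("9.11", "9.99")
print(expected_answer)  # 9.99
```

Parsing the inputs as `Decimal` avoids binary floating-point rounding, so the expected answer does not depend on how the values are represented.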


accuracy_team.py

Status: PENDING

Description: Evaluates team routing accuracy for language handling.


accuracy_with_given_answer.py

Status: PENDING

Description: Scores a manually provided answer against expected output.


accuracy_with_tools.py

Status: PENDING

Description: Evaluates accuracy for factorial tool usage.


db_logging.py

Status: PENDING

Description: Runs accuracy evaluation and stores results in PostgreSQL.


evaluator_agent.py

Status: PENDING

Description: Uses a custom evaluator agent for accuracy scoring.


accuracy_eval_metrics.py

Status: PASS

Description: Eval model metrics are accumulated into the agent's run_output under the "eval_model" detail key.

Result: Shows the agent's "model" tokens and the "eval_model" tokens separately in metrics.details, with a full breakdown for each.
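To illustrate what keeping the two token counts separate means, the sketch below sums tokens per detail key. The data shape is an assumption made for illustration (plain dicts with a `total_tokens` field, grouped under the "model" and "eval_model" keys). The numbers are made up, and the real metrics objects in the library may look different.

```python
def total_tokens_by_key(details: dict) -> dict:
    """Sum the total_tokens field of every entry under each detail key."""
    return {
        key: sum(entry.get("total_tokens", 0) for entry in entries)
        for key, entries in details.items()
    }


# Hypothetical details payload; values are illustrative, not measured results.
example_details = {
    "model": [{"total_tokens": 10}, {"total_tokens": 5}],
    "eval_model": [{"total_tokens": 7}],
}
totals = total_tokens_by_key(example_details)
print(totals)  # {'model': 15, 'eval_model': 7}
```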