Data labeling

End-to-end examples for data classification and labeling using agents.

Each subfolder covers one theme: a basic.py that runs end-to-end, plus variants that layer task-meaningful options on top.

Workflows are organized by modality (text, image, audio, video, document) and output shape (classify, extract, rank, span-label). Two further patterns (_17_llm_as_judge, _18_quality_review) compose on top of any of these.
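To make the four output shapes concrete, here is a minimal sketch using stdlib dataclasses. The class names, field names, and the `validate_span` helper are illustrative assumptions, not the cookbook's actual schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:
    """classify: one label chosen from a closed set."""
    label: str
    confidence: float


@dataclass(frozen=True)
class Extraction:
    """extract: structured fields pulled out of the input."""
    fields: dict[str, str]


@dataclass(frozen=True)
class Ranking:
    """rank: candidate ids ordered best-first."""
    ordered_ids: list[str]


@dataclass(frozen=True)
class SpanLabel:
    """span-label: a labeled [start, end) character range in the source."""
    start: int
    end: int
    label: str

    def text(self, source: str) -> str:
        # Slice the labeled range back out of the source text.
        return source[self.start:self.end]


def validate_span(span: SpanLabel, source: str) -> bool:
    # A span is usable only if it is non-empty and lies inside the source.
    return 0 <= span.start < span.end <= len(source)
```

Whatever the modality, checking model output against a typed shape like these is what lets the composed patterns treat every workflow the same way.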

Start with _01_text_classification/basic.py. Every other cookbook mirrors its structure.

Layout

cookbook/data_labeling/
├── README.md
├── <workflow>/
│   ├── README.md
│   ├── basic.py            # smallest readable example
│   ├── <variant>.py        # one file per task-meaningful variant
│   ├── schemas.py          # shared Pydantic types, if any
│   ├── data/               # sample inputs or dataset pointers
│   └── TEST_LOG.md         # run log per the cookbook convention
└── ...
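
The layout above can be checked mechanically. The sketch below is a hypothetical helper, not part of the repo. It assumes README.md, basic.py, and TEST_LOG.md are required in every workflow folder, and that any other top-level .py file apart from schemas.py is a variant.

```python
from pathlib import Path

# Assumption: these three files are mandatory; schemas.py and data/ are optional.
REQUIRED = ("README.md", "basic.py", "TEST_LOG.md")


def missing_files(workflow_dir: Path) -> list[str]:
    """Return the required file names that are absent from a workflow folder."""
    return [name for name in REQUIRED if not (workflow_dir / name).is_file()]


def variants(workflow_dir: Path) -> list[str]:
    """Return variant scripts: every top-level .py except basic.py and schemas.py."""
    skip = {"basic.py", "schemas.py"}
    return sorted(p.name for p in workflow_dir.glob("*.py") if p.name not in skip)
```

For example, `missing_files(Path("cookbook/data_labeling/_01_text_classification"))` would return an empty list for a folder that follows the convention.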

Workflows

Text

Image

Audio

Video

Document

Composed patterns

These layer on top of any modality.

  • _17_llm_as_judge/: score outputs against a rubric. The same machinery as labeling, repurposed for evals.
  • _18_quality_review/: a labeler, reviewer, and adjudicator pipeline applied on top of an extraction primitive.
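
The labeler, reviewer, adjudicator flow can be sketched with plain callables standing in for agents. This is an assumed control flow, not the cookbook's implementation: the reviewer either confirms the proposed label or returns a different one, and the adjudicator is consulted only on disagreement. All names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Labeler = Callable[[str], str]               # item -> proposed label
Reviewer = Callable[[str, str], str]         # (item, proposed) -> confirmed or corrected label
Adjudicator = Callable[[str, str, str], str] # (item, proposed, corrected) -> final label


@dataclass
class Decision:
    label: str
    source: str  # "agreed" or "adjudicated"


def review(item: str, labeler: Labeler, reviewer: Reviewer,
           adjudicator: Adjudicator) -> Decision:
    proposed = labeler(item)
    checked = reviewer(item, proposed)
    if checked == proposed:
        # Reviewer confirmed the label, so no adjudication is needed.
        return Decision(proposed, "agreed")
    # Disagreement: the adjudicator sees both candidates and picks the final label.
    return Decision(adjudicator(item, proposed, checked), "adjudicated")


# Stub "agents" for illustration only; real ones would call a model.
def stub_labeler(text: str) -> str:
    return "positive" if "good" in text else "negative"


def stub_reviewer(text: str, proposed: str) -> str:
    return "negative" if "not" in text else proposed


def stub_adjudicator(text: str, proposed: str, corrected: str) -> str:
    return corrected
```

Recording whether a label was agreed or adjudicated keeps the disagreement rate visible, which is often the most useful quality signal in a review pipeline.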

Running a cookbook

From the agno repo root, create and activate the demo venv, then run a cookbook:

bash
./scripts/demo_setup.sh
source .venvs/demo/bin/activate
python cookbook/data_labeling/_01_text_classification/basic.py

Each subfolder's README.md documents its inputs, the model it expects, and any extra dependencies.

| Variable | Used by |
| --- | --- |
| OPENAI_API_KEY | Default for text and most extraction cookbooks |
| ANTHROPIC_API_KEY | Image and document cookbooks where Claude is the chosen model |
| GOOGLE_API_KEY | Audio and video cookbooks (Gemini) |

The per-cookbook README calls out which model it uses and why.
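
A small preflight check can catch a missing key before a cookbook makes its first model call. The mapping below is a simplification of the table above and is an assumption: the per-cookbook README remains the authority on which key a given example needs. `missing_key` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional

# Assumed defaults derived from the table; individual cookbooks may differ.
DEFAULT_KEY_BY_MODALITY = {
    "text": "OPENAI_API_KEY",
    "image": "ANTHROPIC_API_KEY",
    "document": "ANTHROPIC_API_KEY",
    "audio": "GOOGLE_API_KEY",
    "video": "GOOGLE_API_KEY",
}


def missing_key(modality: str,
                env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the name of the unset variable for a modality, or None if it is set."""
    env = os.environ if env is None else env
    name = DEFAULT_KEY_BY_MODALITY[modality]
    # Treat an empty string the same as an unset variable.
    return None if env.get(name) else name
```

Calling `missing_key("audio")` before running an audio cookbook returns `"GOOGLE_API_KEY"` when that variable is unset.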