# Multi-Codebase Summarization (`rust/sdk/examples/multi-codebase-summarization`)

Rust equivalent of the Python `multi_codebase_summarization` example.

Scans subdirectories of a root folder (each treated as a Python project), uses an LLM to extract structured info (public classes, functions, CocoIndex pipeline graphs), aggregates it into project-level summaries, and writes the results out as Markdown documentation.
## Setup

```sh
export LLM_API_KEY="sk-..."

# Optional overrides:
export LLM_MODEL="gpt-4o-mini"                  # default
export LLM_BASE_URL="https://api.openai.com/v1" # default
```
## Build

```sh
cd rust/sdk/examples/multi-codebase-summarization
cargo build --release
```
## Run

Summarize all Python examples in this repo:

```sh
cargo run -- ../../../../examples ./output
```
This will:

- Treat each subdirectory of `../../../../examples` as a project
- Scan `*.py` and `**/*.py` files in each project
- Write `output/<project_name>.md` for each project and remove stale markdown for deleted projects

Re-run the same command. Unchanged files and unchanged project summaries are cached (memoized in LMDB), so a fully warm rerun makes zero LLM calls:

```sh
cargo run -- ../../../../examples ./output
# Much faster: skips files and project summaries already analyzed
```
## What it demonstrates

This example demonstrates:
| Pattern | Purpose |
|---|---|
| `#[cocoindex::function(memo)]` | `extract_file_info`: LLM call cached per file fingerprint |
| `#[cocoindex::function(memo)]` | `aggregate_project_info`: LLM call cached until any file summary changes |
| `#[cocoindex::function]` | `generate_markdown`: pure transform, no caching needed |
| `ctx.mount_each(...)` | Process all files concurrently within each project |
| `OnceLock<LlmClient>` | Access a process-wide shared LLM client |
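The `OnceLock` pattern from the last row can be illustrated with a self-contained sketch using only the standard library. `LlmClient` here is a hypothetical placeholder for whatever client type the example actually defines.

```rust
use std::sync::OnceLock;

// Hypothetical placeholder for the example's real LLM client type.
struct LlmClient {
    model: String,
}

// Process-wide slot: initialized on first use, then shared read-only
// by every task for the rest of the process lifetime.
static CLIENT: OnceLock<LlmClient> = OnceLock::new();

fn client() -> &'static LlmClient {
    CLIENT.get_or_init(|| LlmClient {
        // Mirrors the LLM_MODEL override described in the setup section.
        model: std::env::var("LLM_MODEL").unwrap_or_else(|_| "gpt-4o-mini".into()),
    })
}

fn main() {
    // Every call returns a reference to the same instance; the
    // initializer closure ran exactly once.
    assert!(std::ptr::eq(client(), client()));
    println!("shared client model: {}", client().model);
}
```

`OnceLock::get_or_init` guarantees the initializer runs at most once even under concurrent first access, which is why the example can hand the same client to many file-processing tasks without extra locking.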