Star CocoIndex if you like it!
This example shows how to use instructor with Gemini to analyze multiple Python codebases and generate Markdown documentation using CocoIndex v1.
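- Extracts structured information from each file into a CodebaseInfo model
- Aggregates file-level CodebaseInfo into a project-level summary
- Writes the result to output/PROJECT_NAME.md
- Uses a single CodebaseInfo type for both file-level and project-level extraction
- Draws Mermaid graphs with bold nodes for @coco.fn decorated functions and thick arrows (==>) for mount/use_mount calls

The generated markdown includes: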
graph TD
%% App: SampleApp
app_main[<b>app_main</b>] ==> process_file[<b>process_file</b>]
process_file --> helper_func[helper_func]
Bold = @coco.fn, thick arrows (==>) = mount/use_mount calls
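
The legend can be illustrated with a short Python sketch that emits a graph in the same shape as the sample. This is not the example's actual renderer: `render_mermaid`, its parameters, and the `(caller, callee, is_mount)` edge layout are hypothetical names chosen for illustration.

```python
def render_mermaid(app_name, coco_fns, edges):
    """Render a Mermaid call graph using the legend's conventions.

    coco_fns: names of functions decorated with @coco.fn (drawn in bold).
    edges: (caller, callee, is_mount) tuples; is_mount marks mount/use_mount
           calls, which are drawn with thick arrows (==>).
    """
    declared = set()

    def node(name):
        # Attach the label only the first time a node appears.
        if name in declared:
            return name
        declared.add(name)
        label = f"<b>{name}</b>" if name in coco_fns else name
        return f"{name}[{label}]"

    lines = ["graph TD", f"%% App: {app_name}"]
    for caller, callee, is_mount in edges:
        arrow = "==>" if is_mount else "-->"
        lines.append(f"{node(caller)} {arrow} {node(callee)}")
    return "\n".join(lines)


sample = render_mermaid(
    "SampleApp",
    coco_fns={"app_main", "process_file"},
    edges=[
        ("app_main", "process_file", True),
        ("process_file", "helper_func", False),
    ],
)
print(sample)
```

Called with these arguments, the sketch reproduces the four-line sample graph shown here.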
pip install -e .
Create a .env file in the example directory:
echo "GEMINI_API_KEY=your_api_key_here" > .env
Replace your_api_key_here with your actual Gemini API key.
Optionally, set a different LLM model:
echo "LLM_MODEL=gemini/gemini-2.5-flash" >> .env
Create a projects/ directory with subdirectories for each Python project:
projects/
├── my_project_1/
│   ├── main.py
│   └── utils.py
├── my_project_2/
│   └── app.py
└── ...
cocoindex update main.py
This will:
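- Scan each project subdirectory under projects/
- Process its .py files (excluding .venv* directories)
- Write the generated markdown to output/

Check the results:
ls -la output/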
cat output/my_project_1.md
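
The scan step (one project per subdirectory, .py files only, .venv* directories skipped) might look roughly like the following. This is a standalone sketch using only the standard library; `iter_python_files` and `iter_projects` are hypothetical helpers, and the example itself may discover files differently.

```python
import fnmatch
import pathlib


def iter_python_files(project_dir):
    """Yield .py files under project_dir, skipping any .venv* directory."""
    project_dir = pathlib.Path(project_dir)
    for path in sorted(project_dir.rglob("*.py")):
        if not path.is_file():
            continue
        # Check every directory component between project_dir and the file.
        parent_parts = path.relative_to(project_dir).parts[:-1]
        if any(fnmatch.fnmatch(part, ".venv*") for part in parent_parts):
            continue
        yield path


def iter_projects(root_dir):
    """Yield (project_name, python_files) for each subdirectory of root_dir."""
    for project in sorted(pathlib.Path(root_dir).iterdir()):
        if project.is_dir():
            yield project.name, list(iter_python_files(project))
```

For instance, `dict(iter_projects("projects"))` would map each project name to its list of Python files, assuming a `projects/` directory exists in the working directory.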
To use different input and output directories, edit the app definition in main.py:
app = coco.App(
    app_main,
    coco.AppConfig(name="MultiCodebaseSummarization"),
    root_dir=pathlib.Path("./your_projects_dir"),
    output_dir=pathlib.Path("./your_output_dir"),
)
Set the LLM_MODEL environment variable to any LiteLLM-supported model:
# OpenAI
export LLM_MODEL=gpt-4o
# Anthropic
export LLM_MODEL=anthropic/claude-3-5-sonnet
# Local (Ollama)
export LLM_MODEL=ollama/llama3.2
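
On the Python side, picking up this variable could be as simple as the sketch below. The fallback value is an assumption taken from the .env example earlier in this README, and `resolve_model` and `provider_of` are hypothetical helpers, not functions from the example or from LiteLLM.

```python
import os

# Assumed fallback; the example's real default may differ.
DEFAULT_MODEL = "gemini/gemini-2.5-flash"


def resolve_model(env=None):
    """Return the model name from LLM_MODEL, or the assumed default."""
    env = os.environ if env is None else env
    return env.get("LLM_MODEL") or DEFAULT_MODEL


def provider_of(model):
    """Return the prefix of a 'provider/model' name, or None if there is none."""
    provider, sep, _ = model.partition("/")
    return provider if sep else None


print(resolve_model(), provider_of(resolve_model()))
```

Names such as `ollama/llama3.2` carry a provider prefix, while `gpt-4o` does not, which is why `provider_of` returns `None` for it.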
graph TD
%% App: MultiCodebaseSummarization
app_main[<b>app_main</b>] ==> process_project[<b>process_project</b>]
process_project ==> extract_file_info[<b>extract_file_info</b>]
process_project ==> aggregate_project_info[<b>aggregate_project_info</b>]
process_project --> generate_markdown[generate_markdown]
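- Mounts process_project for each project directory
- Extracts CodebaseInfo from each file
- Aggregates the file-level CodebaseInfo into a project-level CodebaseInfo
- Converts the project-level CodebaseInfo to markdown and calls declare_file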
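
The data flow through these steps can be sketched without CocoIndex or an LLM. Everything below is a stand-in under stated assumptions: the `CodebaseInfo` fields are hypothetical, extraction uses the standard `ast` module instead of instructor and Gemini, and the markdown is returned as a string rather than written through declare_file.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class CodebaseInfo:
    # Hypothetical fields for illustration; the example's real model may differ.
    name: str
    summary: str
    functions: list = field(default_factory=list)


def extract_file_info(filename, source):
    """Stand-in for the LLM extraction: list top-level function names."""
    tree = ast.parse(source)
    functions = [n.name for n in tree.body if isinstance(n, ast.FunctionDef)]
    return CodebaseInfo(filename, f"{len(functions)} function(s)", functions)


def aggregate_project_info(project, infos):
    """Merge file-level CodebaseInfo objects into one project-level object."""
    functions = [name for info in infos for name in info.functions]
    summary = f"{len(infos)} file(s), {len(functions)} function(s)"
    return CodebaseInfo(project, summary, functions)


def generate_markdown(info):
    """Render a CodebaseInfo as a small markdown document."""
    lines = [f"# {info.name}", "", info.summary, ""]
    lines += [f"- {name}" for name in info.functions]
    return "\n".join(lines)


def process_project(project, files):
    """files maps filename to source text; returns the project markdown."""
    infos = [extract_file_info(name, src) for name, src in files.items()]
    return generate_markdown(aggregate_project_info(project, infos))
```

The same type is used at both levels, mirroring the single CodebaseInfo type the README describes; only the aggregation step knows it is working on a whole project.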