This directory contains the comprehensive glossary system for the ML Systems textbook. The glossary provides definitions for 611+ technical terms used throughout the book, with automatic cross-references and interactive tooltips.
Chapter Glossaries  →  Volume Glossaries     →  Published Glossary Pages
(vol1: 16 files)       (vol1_glossary.json)     (vol1/glossary.qmd)
(vol2: 7 files)        (vol2_glossary.json)     (vol2/glossary.qmd)
        ↓                      ↓                        ↓
Source of              Aggregated               Volume-specific
truth                  & deduplicated           user-facing pages
Each volume has its own self-contained glossary with no cross-volume dependencies.
quarto/contents/
├── vol1/                                    # Volume 1 chapters
│   ├── introduction/
│   │   └── introduction_glossary.json       # Chapter-specific terms
│   ├── [... more vol1 chapters ...]
│   └── backmatter/glossary/
│       ├── vol1_glossary.json               # Vol1 aggregated terms
│       └── glossary.qmd                     # Vol1 glossary page
│
├── vol2/                                    # Volume 2 chapters
│   ├── infrastructure/
│   │   └── infrastructure_glossary.json
│   ├── [... more vol2 chapters ...]
│   └── backmatter/glossary/
│       ├── vol2_glossary.json               # Vol2 aggregated terms
│       └── glossary.qmd                     # Vol2 glossary page
│
└── backmatter/glossary/
    └── README.md                            # This documentation
Each chapter has its own JSON glossary file containing terms specific to that chapter:
{
  "metadata": {
    "chapter": "introduction",
    "total_terms": 27,
    "generated_date": "2024-09-15"
  },
  "terms": [
    {
      "term": "artificial intelligence",
      "definition": "A field of computer science focused on creating systems that can perform tasks typically requiring human intelligence...",
      "chapter_source": "introduction",
      "aliases": [],
      "see_also": []
    }
  ]
}
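A chapter file in this shape can be sanity-checked before aggregation. The sketch below is illustrative only and is not one of the scripts in tools/scripts/glossary/. The helper name `validate_chapter_glossary`, the required-key set, and the rule that each term's `chapter_source` should match `metadata.chapter` are assumptions drawn from the example above.

```python
import json

# Assumption: every term entry carries the keys shown in the example above.
REQUIRED_TERM_KEYS = {"term", "definition", "chapter_source", "aliases", "see_also"}


def validate_chapter_glossary(data):
    """Return a list of human-readable problems found in a chapter glossary dict."""
    problems = []
    metadata = data.get("metadata", {})
    terms = data.get("terms", [])

    # metadata.total_terms should agree with the number of listed entries.
    if metadata.get("total_terms") != len(terms):
        problems.append(
            f"total_terms is {metadata.get('total_terms')} "
            f"but {len(terms)} terms are listed"
        )

    for index, entry in enumerate(terms):
        missing = REQUIRED_TERM_KEYS - entry.keys()
        if missing:
            problems.append(f"term #{index} is missing keys: {sorted(missing)}")
        # Assumed rule: a chapter file only holds terms sourced from that chapter.
        if entry.get("chapter_source") != metadata.get("chapter"):
            problems.append(f"term #{index} has an unexpected chapter_source")

    return problems


sample = json.loads("""
{
  "metadata": {"chapter": "introduction", "total_terms": 1,
               "generated_date": "2024-09-15"},
  "terms": [
    {"term": "artificial intelligence",
     "definition": "A field of computer science focused on creating systems...",
     "chapter_source": "introduction",
     "aliases": [],
     "see_also": []}
  ]
}
""")

print(validate_chapter_glossary(sample))
```

An empty list means the file passed these checks. In practice the dict would come from `json.load` on a chapter's `*_glossary.json` file rather than an inline string.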
The global glossary aggregates all chapter terms, handles deduplication, and manages cross-chapter references:
{
  "metadata": {
    "type": "global_glossary",
    "version": "1.0",
    "total_terms": 611,
    "source": "aggregated_from_chapter_glossaries"
  },
  "terms": [
    {
      "term": "artificial intelligence",
      "definition": "A field of computer science focused on creating systems...",
      "appears_in": ["introduction", "ml_systems", "responsible_ai"],
      "chapter_source": "introduction",
      "aliases": [],
      "see_also": []
    }
  ]
}
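The aggregation step can be pictured roughly as follows. This is a minimal sketch, not the logic of build_global_glossary.py. The function name `aggregate`, the case-insensitive matching of term names, and the "first occurrence wins" policy for definitions are all assumptions made for illustration.

```python
def aggregate(chapter_glossaries):
    """Merge chapter glossary dicts into one aggregated glossary dict."""
    merged = {}  # lowercase term name -> aggregated entry

    for glossary in chapter_glossaries:
        chapter = glossary["metadata"]["chapter"]
        for entry in glossary["terms"]:
            # Assumption: terms differing only in case or padding are duplicates.
            key = entry["term"].strip().lower()
            if key not in merged:
                # Assumption: the first chapter to define a term supplies
                # the definition and the chapter_source.
                merged[key] = {
                    "term": entry["term"],
                    "definition": entry["definition"],
                    "appears_in": [chapter],
                    "chapter_source": entry.get("chapter_source", chapter),
                    "aliases": list(entry.get("aliases", [])),
                    "see_also": list(entry.get("see_also", [])),
                }
            elif chapter not in merged[key]["appears_in"]:
                # Later chapters are only recorded in appears_in.
                merged[key]["appears_in"].append(chapter)

    terms = sorted(merged.values(), key=lambda e: e["term"].lower())
    return {
        "metadata": {
            "type": "global_glossary",
            "version": "1.0",
            "total_terms": len(terms),
            "source": "aggregated_from_chapter_glossaries",
        },
        "terms": terms,
    }


def _entry(term, chapter):
    """Build a small chapter-level term entry for the demonstration below."""
    return {"term": term, "definition": f"Definition of {term}.",
            "chapter_source": chapter, "aliases": [], "see_also": []}


intro = {"metadata": {"chapter": "introduction"},
         "terms": [_entry("artificial intelligence", "introduction")]}
ml_systems = {"metadata": {"chapter": "ml_systems"},
              "terms": [_entry("Artificial Intelligence", "ml_systems"),
                        _entry("edge computing", "ml_systems")]}

result = aggregate([intro, ml_systems])
for term in result["terms"]:
    print(term["term"], term["appears_in"])
```

Keying on the lowercased term keeps the deduplication simple. The consolidation scripts listed below exist because real near-duplicates (plurals, abbreviations, spelling variants) need more than an exact match.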
The final QMD file provides the user-facing glossary page with proper Quarto formatting and cross-references.
All processing scripts are located in tools/scripts/glossary/:
build_global_glossary.py - Main aggregation script
generate_glossary.py - Page generation script
smart_consolidation.py - Intelligent similarity detection
rule_based_consolidation.py - Academic best practices
consolidate_similar_terms.py - Manual consolidation rules
clean_global_glossary.py - Cleanup and validation
quarto/contents/core/[chapter]/[chapter]_glossary.json
# 1. Aggregate chapter glossaries into global glossary
python3 tools/scripts/glossary/build_global_glossary.py
# 2. Generate the published glossary page
python3 tools/scripts/glossary/generate_glossary.py
# 3. Optional: Run quality assurance
python3 tools/scripts/glossary/smart_consolidation.py # Analysis only
python3 tools/scripts/glossary/rule_based_consolidation.py # Apply fixes
# Check for similar terms that need consolidation
python3 tools/scripts/glossary/smart_consolidation.py
# Apply academic best practices
python3 tools/scripts/glossary/rule_based_consolidation.py
# Rebuild after fixes
python3 tools/scripts/glossary/build_global_glossary.py
python3 tools/scripts/glossary/generate_glossary.py
The system automatically discovers actual section IDs from chapter files rather than relying on hardcoded mappings. This ensures cross-references always work correctly.
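Section ID discovery of this kind can be sketched with a regular expression over a chapter's source text. The example assumes that IDs are declared with the Pandoc/Quarto heading attribute syntax (`## Title {#sec-id}`). The function name `discover_section_ids` and the sample text are hypothetical, and the actual scripts may work differently.

```python
import re

# Matches an ATX heading that ends in an attribute block, e.g.
#   ## Overview {#sec-introduction-overview .unnumbered}
# and captures the identifier that follows "{#".
HEADING_ID = re.compile(
    r"^#{1,6}[ \t]+.*?\{#([^\s}]+)[^}\n]*\}[ \t]*$",
    re.MULTILINE,
)


def discover_section_ids(qmd_text):
    """Return the heading identifiers declared in a .qmd source string, in order."""
    return HEADING_ID.findall(qmd_text)


sample_chapter = """\
# Introduction {#sec-introduction}

Body text that mentions a term.

## Overview {#sec-introduction-overview .unnumbered}

## A heading without an explicit ID
"""

print(discover_section_ids(sample_chapter))
```

This simple pattern does not skip fenced code blocks, so a comment line inside a code cell that happens to look like a heading with an attribute block would also be picked up. A fuller implementation would track fences while scanning.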
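Terms appearing in multiple chapters are attributed through the appears_in list in the aggregated entry, alongside the chapter_source field.
The glossary integrates with the book through automatic cross-references and interactive tooltips.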
If cross-references do not resolve, you are usually viewing the glossary in isolation. Cross-references only work in the full website build:
quarto render # Full website
# OR
quarto preview # Development server
Last Updated: September 2024
System Version: 1.0
Total Terms: 611
Coverage: Complete for all 22 chapters