book/docs/DEVELOPMENT.md
This guide covers the development workflow, automated cleanup system, and best practices for contributing to the Machine Learning Systems book.
./binder clean # Clean build artifacts
./binder build # Build HTML book
./binder doctor # Health check & diagnostics
./binder preview # Live preview with hot reload
./binder build pdf # Build PDF
# First time setup
./binder setup # Configure environment and tools
# Daily workflow (most common commands)
./binder clean # Clean build artifacts
./binder build # Build HTML (complete book)
./binder doctor # Health check
# Preview & development
./binder preview intro # Preview a chapter with live reload
./binder build intro # Build specific chapter
This project includes an automated cleanup system that runs before every commit to ensure a clean repository.
The cleanup system removes:
- *.html, *.pdf, *.tex, *.aux, *.log, *.toc
- .quarto/, site_libs/, index_files/ (legacy)
- __pycache__/, *.pyc, *.pyo
- .DS_Store, Thumbs.db, *.swp
- *~, .#*
- debug.log, error.log

# Regular cleanup (recommended before commits)
./binder clean
# See what files will be cleaned (safe preview)
git status
git clean -xdn
# Deep clean (git clean -xdf removes all untracked and ignored files, not only build artifacts)
./binder clean
git clean -xdf
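`git clean -xdn` lists every untracked and ignored file, which can be noisy. For a narrower preview, some of the cleanup patterns listed earlier can be turned into a `find` expression. This is a minimal sketch, not the project's cleanup script: the function name `list_artifacts` is hypothetical, and the broad patterns (*.html, *.pdf, *.tex, *.log) are left out on the assumption that some files with those extensions may be tracked sources.

```shell
#!/bin/sh
# Hypothetical helper: print (never delete) files matching a subset of the
# cleanup patterns. The .git directory is skipped.
list_artifacts() {
  dir=${1:-.}
  find "$dir" -path "$dir/.git" -prune -o -type f \( \
      -name '*.aux' -o -name '*.toc' -o -name '*.pyc' -o -name '*.pyo' \
      -o -name '.DS_Store' -o -name 'Thumbs.db' -o -name '*.swp' \
      -o -name '*~' -o -name 'debug.log' -o -name 'error.log' \
    \) -print | sort
}
```

Calling `list_artifacts book` would print matching paths under `book/` without touching them, so it is safe to run before `./binder clean`.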
The git pre-commit hook automatically:
# Only if absolutely necessary
git commit --no-verify -m "Emergency commit"
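The hook's source is not shown in this guide. As a rough illustration of what a pre-commit check of this kind can look like, the sketch below rejects staged paths that match some of the cleanup patterns. The function names `is_artifact` and `check_staged` are hypothetical, and the project's real hook may behave differently.

```shell
#!/bin/sh
# Hypothetical sketch; the project's real hook may work differently.

# Return 0 if the path looks like a build artifact from the cleanup list.
is_artifact() {
  case "$1" in
    *.aux|*.toc|*.pyc|*.pyo|*.swp|*~) return 0 ;;
    .DS_Store|*/.DS_Store|Thumbs.db|*/Thumbs.db) return 0 ;;
    .quarto/*|*/.quarto/*|__pycache__/*|*/__pycache__/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Read staged paths on stdin; print offenders and fail if any are found.
check_staged() {
  status=0
  while IFS= read -r path; do
    if is_artifact "$path"; then
      echo "artifact staged: $path"
      status=1
    fi
  done
  return "$status"
}

# In a hook this could be wired up as:
#   git diff --cached --name-only --diff-filter=ACM | check_staged || exit 1
```

Keeping the pattern test in its own function makes it easy to try against individual paths before relying on it in a hook.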
# Using binder (recommended)
./binder build html # Build HTML version
./binder build pdf # Build PDF version
./binder publish # Build and publish
# Using binder (recommended)
./binder build # HTML version
./binder build pdf # PDF version
./binder build epub # EPUB version
# Preview a chapter (fastest)
./binder preview intro
# Build complete book
./binder build html
# Publish to the world
./binder publish
The ./binder setup command provides a complete environment configuration:
What it does:
Features:
# Run setup
./binder setup
# Get welcome and overview
./binder hello
# Start live preview server
./binder preview
# The server will automatically reload when you save changes
- build/html/index.html (main output directory)
- build/pdf/ (PDF output directory)
- book/index.pdf (in book directory)

The ./binder publish command provides a complete publishing workflow:
Step-by-step process:
- Merge dev to main with confirmation

Features:
# One-command publishing
./binder publish
If you prefer to do it step by step:
# 1. Ensure you're on main branch
git checkout main
git merge dev
# 2. Build both formats
./binder build html
./binder build pdf
# 3. Copy PDF to assets
cp build/pdf/Machine-Learning-Systems.pdf assets/downloads/
# 4. Commit and push
git add assets/downloads/Machine-Learning-Systems.pdf
git commit -m "Add PDF to assets"
git push origin main
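The manual steps above can be collected into a small script that stops at the first failure. This is a sketch under stated assumptions, not part of `./binder`: the names `run` and `publish_manual` and the `DRY_RUN` variable are hypothetical, and the destination `assets/downloads/` is assumed from the `git add` path in step 4.

```shell
#!/bin/sh
# Hypothetical wrapper around the manual steps; set DRY_RUN=1 to print them only.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

publish_manual() {
  pdf=build/pdf/Machine-Learning-Systems.pdf
  dest=assets/downloads   # assumed from the git add path in the manual steps
  # Each step runs only if the previous one succeeded.
  run git checkout main &&
  run git merge dev &&
  run ./binder build html &&
  run ./binder build pdf &&
  run cp "$pdf" "$dest/" &&
  run git add "$dest/Machine-Learning-Systems.pdf" &&
  run git commit -m "Add PDF to assets" &&
  run git push origin main
}
```

Running `DRY_RUN=1` first prints each command with a leading `+` so the sequence can be reviewed before anything is merged or pushed.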
- main branch

The GitHub Actions workflow will:
Monitor progress: https://github.com/harvard-edge/cs249r_book/actions
./binder doctor # Overall project health
./binder status # Detailed project status
git status # Git repository status
./binder doctor # Run comprehensive health check
quarto check # Validate Quarto configuration
Checking project health...
Project Structure:
QMD files: 45
Bibliography files: 20
Quiz files: 18
Git Status:
Repository is clean
Dependencies:
✅ Quarto: 1.4.x
✅ Python: 3.x
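To cross-check the file counts in a report like this by hand, a few `find` calls are enough. The sketch below is an assumption about how such counts could be gathered, not the implementation of `./binder doctor`: the function names are hypothetical, the default root `book/contents` is taken from the directory layout in this guide, and the quiz pattern is inferred from the `introduction_quizzes.json` file name.

```shell
#!/bin/sh
# Hypothetical helpers; file names containing newlines would be miscounted.
count_files() {
  # $1 = directory, $2 = file name pattern (as understood by find -name)
  find "$1" -type f -name "$2" | wc -l | tr -d ' '
}

project_summary() {
  root=${1:-book/contents}
  echo "QMD files: $(count_files "$root" '*.qmd')"
  echo "Bibliography files: $(count_files "$root" '*.bib')"
  echo "Quiz files: $(count_files "$root" '*_quizzes.json')"
}
```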
book/contents/
├── core/ # Main content chapters
│   ├── introduction/
│   │   ├── introduction.qmd
│   │   ├── introduction.bib
│   │   └── introduction_quizzes.json
│   └── ...
├── frontmatter/ # Preface, about, etc.
├── backmatter/ # References, appendices
└── labs/ # Hands-on exercises
For faster development, you can work with a minimal set of chapters:
- book/_quarto-html.yml: Comment out chapters you're not working on
- Comment out the corresponding .bib files

chapters:
- index.qmd
- contents/core/introduction/introduction.qmd
# - contents/core/ml_systems/ml_systems.qmd # Commented out
# - contents/core/nn_computation/nn_computation.qmd # Commented out
Simply uncomment the chapters and bibliography entries you want to restore.
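Commenting entries in and out by hand is error-prone when switching chapters often. The sketch below shows one way to automate the commenting step with `awk`. The function name `comment_out_chapter` is hypothetical, and it assumes each chapter sits on its own `- path` line, as in the excerpt above.

```shell
#!/bin/sh
# Hypothetical helper: comment out list entries that mention a chapter.
comment_out_chapter() {
  # $1 = YAML file, $2 = literal text identifying the chapter entry
  tmp=$(mktemp) || return 1
  # Only lines that start with "- " (after indentation) are touched, so
  # entries that are already commented out stay as they are.
  awk -v ch="$2" '
    /^[ \t]*- / && index($0, ch) > 0 { sub(/- /, "# - ") }
    { print }
  ' "$1" > "$tmp" && cat "$tmp" > "$1"
  rc=$?
  rm -f "$tmp"
  return "$rc"
}
```

For example, `comment_out_chapter book/_quarto-html.yml ml_systems.qmd` would turn the matching entry into `# - contents/core/ml_systems/ml_systems.qmd`. Restoring a chapter is still a manual edit in this sketch.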
Build fails with missing files
make clean # Clean artifacts
make check # Verify structure
Git hook blocks commit
make clean # Remove artifacts
git status # Check what's staged
Slow builds
make clean-deep # Full cleanup
# Use minimal configuration
Permission denied on scripts
make setup-hooks # Fix permissions
./binder help # Show all commands
./binder --help # Detailed help
git pull # Get latest changes
./binder clean # Clean workspace
./binder doctor # Verify health
# 1. Clean and build
./binder clean
./binder build
# 2. Start development server
./binder preview
# 3. Make changes to .qmd files
# 4. Preview updates automatically
# 5. When ready to commit
git add .
git commit -m "Your message" # Pre-commit hook runs automatically
./binder clean # Full cleanup
./binder build # Clean build
./binder doctor # Run all checks
./binder doctor # Comprehensive validation
./binder build # Build HTML
./binder build pdf # Build PDF
./binder build epub # Build EPUB
- quarto/config/_quarto-html.yml: HTML website configuration
- quarto/config/_quarto-pdf.yml: PDF book configuration
- binder: Book Binder CLI (build and development tool)
- .git/hooks/pre-commit: Automated cleanup hook
- .gitignore: Ignored file patterns

The tools/scripts/ directory is organized into logical categories:
tools/scripts/
├── build/ # Build and development scripts (clean.sh, etc.)
├── content/ # Content management tools
├── maintenance/ # System maintenance scripts
├── testing/ # Test and validation scripts
├── utilities/ # General utility scripts
├── docs/ # Script documentation
├── genai/ # AI and generation tools
├── cross_refs/ # Cross-reference management
├── quarto_publish/ # Publishing workflows
└── ai_menu/ # AI menu tools
Each directory has its own README.md with specific usage instructions.
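When a new script category is added, it is easy to forget its README. A quick way to check the convention is sketched below. The function name `missing_readmes` is hypothetical, and the check only looks at immediate subdirectories.

```shell
#!/bin/sh
# Hypothetical helper: list immediate subdirectories that lack a README.md.
missing_readmes() {
  # $1 = parent directory, for example tools/scripts
  for d in "$1"/*/; do
    [ -d "$d" ] || continue
    [ -f "${d}README.md" ] || echo "${d%/}"
  done
}
```

Running `missing_readmes tools/scripts` prints nothing when every category directory has its README.md.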
- make setup-hooks && make install
- make test && make build-all

The automated cleanup system ensures that your commits are clean and free of build artifacts, which makes code reviews easier and keeps the repository tidy.
If you encounter issues with the development workflow:
- Run make check for diagnostics
- Run make clean-dry