Principles and Practices of Engineering Artificially Intelligent Systems
<p align="center"> <a href="README.md">English</a> • <a href="README/README_zh.md">中文</a> • <a href="README/README_ja.md">日本語</a> • <a href="README/README_ko.md">한국어</a> </p> <div align="center"> <!-- Build Status --> <p align="center"> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/book-validate-dev.yml"></a> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/tinytorch-validate-dev.yml"></a> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/kits-preview-dev.yml"></a> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/labs-preview-dev.yml"></a> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/mlsysim-validate-dev.yml"></a> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/slides-validate-dev.yml"></a> <a href="https://github.com/harvard-edge/cs249r_book/actions/workflows/instructors-validate-dev.yml"></a> </p> <!-- Meta --> <p align="center"> <a href="https://github.com/harvard-edge/cs249r_book/blob/dev/LICENSE.md"></a> <a href="#citation--license"></a> <a href="https://opencollective.com/mlsysbook"></a> </p> <p align="center"> <b><a href="https://mlsysbook.ai">📘 Textbook (current edition)</a></b> • <b>📙 Vol I + Vol II <i>(Summer 2026)</i></b> • <b><a href="https://mlsysbook.ai/tinytorch/">🔥 TinyTorch</a></b> • <b><a href="mlsysim/README.md">🔮 MLSys·im <i>(dev)</i></a></b> • <b><a href="interviews/README.md">💼 Interview Playbook <i>(dev)</i></a></b> • <b><a href="https://mlsysbook.org">🌐 Ecosystem</a></b> </p> <p align="center">📚 <b>Hardcopy edition coming 2026 with MIT Press.</b></p> </div>

That gap is what we mean by AI engineering.
AI engineering is the discipline of building efficient, reliable, safe, and robust intelligent systems that operate in the real world, not just models in isolation. Our mission is to establish AI engineering as a foundational discipline alongside software engineering and computer engineering, by teaching how to design, build, and evaluate end-to-end intelligent systems.
Our goal: Help 100,000 learners master ML Systems this year, and reach 1 million by 2030.
I designed this as a single integrated curriculum, not a collection of independent projects. The textbook teaches the theory. TinyTorch makes you build the internals. The hardware kits force you to confront real constraints. The simulator lets you reason about infrastructure you can't afford to rent. Each piece exists because I found that students who only read don't internalize, and students who only code don't generalize.
<div align="center"> <blockquote> <b>The repository is the curriculum.</b> </blockquote> </div>

A growing community of contributors helps improve every part of it: fixing errors, sharpening explanations, testing on new hardware. Their work makes this better for everyone, and I'm grateful for every pull request.
Every component connects. The textbook gives you the mental models. The labs let you explore trade-offs interactively, powered by MLSys·im, the modeling engine for infrastructure you can't physically access. TinyTorch makes you build the machinery yourself. The hardware kits put you face-to-face with real constraints. The interview playbook tests whether you actually understand it. And the instructor hub, slides, and newsletter give educators everything they need to bring this into a classroom.
This textbook teaches you to think at the intersection of machine learning and systems engineering. Each chapter bridges algorithmic concepts with the infrastructure that makes them work in practice.
<table> <thead> <tr> <th width="45%">You know...</th> <th width="10%" align="center"></th> <th width="45%">You will learn...</th> </tr> </thead> <tbody> <tr> <td>How to train a model</td> <td align="center">→</td> <td><b>How training scales across GPU clusters</b></td> </tr> <tr> <td>That quantization shrinks models</td> <td align="center">→</td> <td><b>How INT8 math maps to silicon</b></td> </tr> <tr> <td>What a transformer is</td> <td align="center">→</td> <td><b>Why KV-cache dominates memory at inference</b></td> </tr> <tr> <td>That models run on GPUs</td> <td align="center">→</td> <td><b>How schedulers balance latency vs throughput</b></td> </tr> <tr> <td>That edge devices have limits</td> <td align="center">→</td> <td><b>How to co-design models and hardware</b></td> </tr> </tbody> </table>

The textbook follows the Hennessy & Patterson pedagogical model across two volumes:
<table> <thead> <tr> <th width="5%"></th> <th width="15%">Volume</th> <th width="25%">Theme</th> <th width="55%">Scope</th> </tr> </thead> <tbody> <tr> <td align="center">📗</td> <td><b>Volume I</b></td> <td>Build, Optimize, Deploy</td> <td>Single-machine ML systems (1–8 GPUs). Foundations, optimization, and deployment on one node.</td> </tr> <tr> <td align="center">📘</td> <td><b>Volume II</b></td> <td>Scale, Distribute, Govern</td> <td>Distributed systems at production scale. Multi-machine infrastructure, fault tolerance, and governance.</td> </tr> </tbody> </table>

> [!NOTE]
> You are on the `dev` branch. Active development happens here. For the last stable release, see the `main` branch.

<table> <thead> <tr> <th width="5%"></th> <th width="15%">Branch</th> <th width="45%">What's on it</th> <th width="35%">Status</th> </tr> </thead> <tbody> <tr> <td align="center">🟢</td> <td><b><code>main</code></b> <a href="https://mlsysbook.ai">mlsysbook.ai</a></td> <td>Single-volume textbook (current edition)</td> <td>Live: this is what readers see today.</td> </tr> <tr> <td align="center">🟡</td> <td><b><code>dev</code></b> <i>← you are here</i></td> <td>
<b>Volume I</b>: two-volume split (content complete, editorial polish)<br>
<b>Volume II</b>: At Scale (active development)<br>
<b>Curriculum</b>: TinyTorch, Kits, MLSys·im, Labs, Interview Playbook
</td>
<td>
TinyTorch and Hardware Kits are live.
MLSys·im, Labs, and Interview Playbook are in development.
</td>
</tr> </tbody> </table>
<div align="center"> <a href="https://github.com/harvard-edge/cs249r_book/stargazers"></a> <a href="https://opencollective.com/mlsysbook"></a>
</div> <table> <tbody> <tr> <td width="50%" align="center"> <b>Star the repo</b> Stars signal to universities and foundations that this work matters. They directly fund workshops and hardware kits for underserved classrooms.
<a href="https://star-history.com/#harvard-edge/cs249r_book&Date"></a>
100 → 1,000 → <b>10,000</b> → 100,000 → <b>1M learners by 2030</b>
</td>
<td width="50%" align="center">
<b>Fund the mission</b>
All contributions go to <a href="https://opencollective.com/mlsysbook">Open Collective</a>, a transparent fund for educational outreach. Every dollar goes to reaching more students.
<a href="https://opencollective.com/mlsysbook"></a>
</td>
</tr> </tbody> </table>
Thanks go to these wonderful people who have contributed to making this resource better for everyone!
Legend: 🪲 Bug Hunter · 🧑‍💻 Code Contributor · ✍️ Doc Wizard · 🎨 Design Artist · 🧠 Idea Spark · 🔎 Code Reviewer · 🧪 Test Tinkerer · 🛠️ Tool Builder
<div align="center"> <b><a href="https://buttondown.email/mlsysbook">✉️ Subscribe</a> • <a href="https://github.com/harvard-edge/cs249r_book/discussions">💬 Join discussions</a> • <a href="https://mlsysbook.ai/">🌐 Visit mlsysbook.ai</a></b>
<b>Made with ❤️ for AI engineers</b>
<i>in the making, around the world</i> 🌎
</div>