architecture/05-build-system.md
The build system transforms Model Source (cog.yaml + predict.py + weights) into a production-ready OCI image containing the Container Runtime.
```mermaid
flowchart TB
    subgraph input["Inputs"]
        yaml["cog.yaml"]
        code["predict.py"]
        weights["weights"]
    end
    subgraph cli["CLI (pkg/cli/build.go)"]
        parse["Parse Config"]
        validate["Validate"]
    end
    subgraph generate["Dockerfile Generation (pkg/dockerfile/)"]
        generator["Generator"]
        baseimage["Base Image Selection"]
        compat["Compatibility Matrix"]
        wheel["Python Wheels<br/>(SDK + Coglet)"]
    end
    subgraph docker["Docker Build"]
        buildkit["BuildKit"]
        image["Container Image"]
    end
    subgraph post["Post-Build"]
        schema["Generate OpenAPI Schema"]
        freeze["pip freeze"]
        labels["Apply Labels"]
    end
    yaml --> parse --> validate
    validate --> generator
    compat --> generator
    baseimage --> generator
    wheel --> generator
    generator -->|"Dockerfile"| buildkit
    code --> buildkit
    weights --> buildkit
    buildkit --> image
    image --> schema
    image --> freeze
    schema --> labels
    freeze --> labels
    labels -->|"Final Image"| output["Tagged Image"]
```
The CLI reads `cog.yaml` and validates and completes the configuration:
```mermaid
flowchart LR
    subgraph input ["cog.yaml (user provides)"]
        direction TB
        i1["gpu#colon; true"]
        i2["python_packages#colon;<br/>- torch==2.1.0"]
    end
    subgraph output ["Config (completed)"]
        direction TB
        o1["gpu#colon; true"]
        o2["cuda#colon; '12.1' ← auto-detected"]
        o3["cudnn#colon; '8' ← auto-detected"]
    end
    input --> output
```
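The completion step can be sketched as follows. This is a minimal illustration with a hard-coded, single-entry matrix and assumed names (`Config`, `completeCUDA`); the real lookup uses the embedded compatibility JSON in `pkg/config`.

```go
package main

import (
	"fmt"
	"strings"
)

// Config holds the fields relevant to CUDA auto-detection (illustrative subset).
type Config struct {
	GPU      bool
	CUDA     string
	CuDNN    string
	Packages []string
}

// completeCUDA fills in CUDA/cuDNN from the pinned torch version when the
// user did not specify them explicitly.
func completeCUDA(cfg *Config) {
	// Hypothetical single-entry matrix for illustration.
	matrix := map[string]struct{ cuda, cudnn string }{
		"2.1.0": {"12.1", "8"},
	}
	for _, pkg := range cfg.Packages {
		if !strings.HasPrefix(pkg, "torch==") {
			continue
		}
		if m, ok := matrix[strings.TrimPrefix(pkg, "torch==")]; ok {
			if cfg.CUDA == "" {
				cfg.CUDA = m.cuda
			}
			if cfg.CuDNN == "" {
				cfg.CuDNN = m.cudnn
			}
		}
	}
}

func main() {
	cfg := Config{GPU: true, Packages: []string{"torch==2.1.0"}}
	completeCUDA(&cfg)
	fmt.Println(cfg.CUDA, cfg.CuDNN) // 12.1 8
}
```

User-supplied values win: the lookup only fills fields that are still empty.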
The generator produces a Dockerfile from the validated config.
```dockerfile
# 1. Base image (cog-base, CUDA, or python-slim)
FROM r8.im/cog-base:cuda12.1-python3.11-torch2.1.0

# 2. System packages
RUN apt-get update && apt-get install -y ffmpeg

# 3. Python packages
RUN pip install -r requirements.txt

# 4. Cog SDK + coglet wheels (resolved at build time)
COPY cog-0.12.0-py3-none-any.whl /tmp/
RUN pip install /tmp/cog-0.12.0-py3-none-any.whl
COPY coglet-0.1.0-cp310-abi3-linux_x86_64.whl /tmp/
RUN pip install /tmp/coglet-0.1.0-cp310-abi3-linux_x86_64.whl

# 5. User run commands
RUN echo "custom setup"

# 6. Copy source
WORKDIR /src
COPY . /src

# 7. Entrypoint
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["python", "-m", "cog.server.http"]
```
PyTorch and TensorFlow releases are built against specific CUDA/cuDNN versions. The compatibility matrix captures these relationships from upstream release notes.
```mermaid
flowchart LR
    subgraph input["User specifies"]
        torch["torch==2.1.0"]
    end
    subgraph matrix["Compatibility Matrix"]
        lookup["torch_compatibility_matrix.json"]
    end
    subgraph output["Cog determines"]
        cuda["CUDA 12.1"]
        cudnn["cuDNN 8"]
        python["Python 3.10-3.13"]
    end
    torch --> lookup
    lookup --> cuda
    lookup --> cudnn
    lookup --> python
```
Data files (embedded JSON, generated by `tools/compatgen/`):

- `pkg/config/torch_compatibility.json` - PyTorch ↔ CUDA mappings
- `pkg/config/tf_compatibility.json` - TensorFlow ↔ CUDA mappings
- `pkg/config/cuda_compatibility.json` - CUDA base image mappings

These are regenerated when new framework versions are released and embedded into the CLI binary at build time.
What it stores (for each framework release, e.g. `torch==2.1.0`): the compatible CUDA and cuDNN versions and the supported Python version range.

Base image selection uses the compatibility matrix to find a pre-built image that matches the required Python/CUDA/PyTorch combination.
```mermaid
flowchart TD
    start["Config has Python + CUDA + Torch versions"] --> lookup{"Matching cog-base<br/>image exists?"}
    lookup -->|"Yes"| cogbase["Use cog-base image<br/>r8.im/cog-base:cuda12.1-python3.11-torch2.1.0"]
    lookup -->|"No"| gpu{"GPU enabled?"}
    gpu -->|"Yes"| cuda["Use NVIDIA CUDA image<br/>nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04<br/>(install Python + Torch in Dockerfile)"]
    gpu -->|"No"| slim["Use Python slim image<br/>python:3.11-slim"]
```
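The decision above can be sketched as a single function. This is a simplified sketch: the set of known tags and the `baseImage` signature are assumptions, and the real logic (in pkg/dockerfile/base.go) handles more cases.

```go
package main

import "fmt"

// baseImage picks a base image for the generated Dockerfile, preferring a
// matching pre-built cog-base image, then falling back to nvidia/cuda (GPU)
// or python-slim (CPU). `known` stands in for the published cog-base tags.
func baseImage(python, cuda, torch string, gpu bool, known map[string]bool) string {
	if cuda != "" && torch != "" {
		tag := fmt.Sprintf("r8.im/cog-base:cuda%s-python%s-torch%s", cuda, python, torch)
		if known[tag] {
			return tag
		}
	}
	if gpu {
		// Fallback: official NVIDIA image; Python and Torch are then
		// installed by generated Dockerfile steps. (Tag format simplified.)
		return fmt.Sprintf("nvidia/cuda:%s-cudnn8-devel-ubuntu22.04", cuda)
	}
	return fmt.Sprintf("python:%s-slim", python)
}

func main() {
	known := map[string]bool{
		"r8.im/cog-base:cuda12.1-python3.11-torch2.1.0": true,
	}
	fmt.Println(baseImage("3.11", "12.1", "2.1.0", true, known))
	fmt.Println(baseImage("3.11", "", "", false, known))
}
```

The important property is the ordering: a cog-base hit short-circuits the GPU/CPU fallback entirely, which is what makes the large base layer shareable across builds.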
Pre-built images are hosted at `r8.im/cog-base` with Python, CUDA, cuDNN, and PyTorch already installed.

- Tag format: `r8.im/cog-base:cuda<version>-python<version>-torch<version>`
- Available combinations are enumerated in code (`BaseImageConfigurations()`)

When no matching cog-base image exists (e.g. an unusual version combination), the build falls back to official `nvidia/cuda` images for GPU models or `python:<version>-slim` for CPU-only models.

Two Python wheels are installed into every Cog image: the Cog SDK (type definitions, predictor base class) and coglet (Rust prediction server, PyO3 bindings). Wheels are resolved at Docker build time -- they're not embedded in the Go binary.
Resolution follows a tiered priority for each wheel (four tiers for the SDK, three for coglet):

Cog SDK:

| Priority | Source | Accepts |
|---|---|---|
| 1 | `COG_SDK_WHEEL` env var | URL, file path, or `pypi:<version>` |
| 2 | `build.sdk_version` in cog.yaml | Version pin |
| 3 | Auto-detect `dist/cog-*.whl` | Dev builds only |
| 4 | Default | Install from PyPI |
coglet:

| Priority | Source | Accepts |
|---|---|---|
| 1 | `COGLET_WHEEL` env var | URL or file path |
| 2 | Auto-detect `dist/coglet-*.whl` | Dev builds only |
| 3 | Default | Install from PyPI |
Local wheel files are copied into `.cog/tmp/` in the Docker build context, then `COPY`'d and `pip install`'d in the Dockerfile.
After the main build, Cog generates an OpenAPI schema from the predictor's type hints, runs `pip freeze` inside the image, and stores the results as image labels:
| Label | Content |
|---|---|
| `run.cog.version` | Cog CLI version |
| `run.cog.config` | Serialized cog.yaml |
| `run.cog.openapi_schema` | OpenAPI spec from type hints |
| `run.cog.pip_freeze` | Installed package versions |
These labels can be fetched from a remote registry or a local image store (such as containerd) without pulling the full image. This lets tooling, both the Cog CLI during development and production infrastructure, inspect model metadata and decide how to run a model before booting it.
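As a sketch of the consumer side, here is the `run.cog.*` labels being read out of an image-config JSON blob. The sample JSON is a hand-written stand-in for `docker image inspect`/registry output, and `cogLabels` is a hypothetical helper, not a real API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// cogLabels extracts the Labels map from an image-config JSON blob, which is
// all that is needed to read model metadata without pulling image layers.
func cogLabels(blob []byte) (map[string]string, error) {
	var img struct {
		Config struct {
			Labels map[string]string `json:"Labels"`
		} `json:"Config"`
	}
	if err := json.Unmarshal(blob, &img); err != nil {
		return nil, err
	}
	return img.Config.Labels, nil
}

func main() {
	// Hand-written sample blob; real inspect output has many more fields.
	blob := []byte(`{"Config":{"Labels":{"run.cog.version":"0.12.0","run.cog.pip_freeze":"torch==2.1.0\n"}}}`)
	labels, err := cogLabels(blob)
	if err != nil {
		panic(err)
	}
	fmt.Println("cog version:", labels["run.cog.version"])
}
```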
A built Cog image has layers in this order (bottom to top):
```mermaid
flowchart TB
    copy["COPY . /src — User code + weights"] --- run
    run["RUN commands (from cog.yaml) — Custom build steps"] --- pip
    pip["pip install (python_packages) — Python dependencies"] --- wheel
    wheel["Cog wheel install — Cog runtime"] --- apt
    apt["apt-get install (system_packages) — System dependencies"] --- tini
    tini["tini init — Process manager"] --- base
    base["Base image (OS, Python, CUDA, cuDNN, PyTorch)<br/>~5-15 GB for GPU images"]
```
The base image is by far the largest layer. Using a matching cog-base image means this layer is shared across builds and doesn't need to be re-downloaded or rebuilt.
| Component | Location |
|---|---|
| CLI command | pkg/cli/build.go |
| Build orchestration | pkg/image/build.go |
| Dockerfile generator | pkg/dockerfile/standard_generator.go |
| Base image selection | pkg/dockerfile/base.go |
| Compatibility matrix | pkg/config/compatibility.go |
| Wheel resolution | pkg/wheels/wheels.go |
| Label definitions | pkg/docker/command/manifest.go |