Build System

The build system transforms Model Source (cog.yaml + predict.py + weights) into a production-ready OCI image containing the Container Runtime.

Build Flow

mermaid
flowchart TB
    subgraph input["Inputs"]
        yaml["cog.yaml"]
        code["predict.py"]
        weights["weights"]
    end

    subgraph cli["CLI (pkg/cli/build.go)"]
        parse["Parse Config"]
        validate["Validate"]
    end

    subgraph generate["Dockerfile Generation (pkg/dockerfile/)"]
        generator["Generator"]
        baseimage["Base Image Selection"]
        compat["Compatibility Matrix"]
        wheel["Python Wheels
(SDK + Coglet)"]
    end

    subgraph docker["Docker Build"]
        buildkit["Buildkit"]
        image["Container Image"]
    end

    subgraph post["Post-Build"]
        schema["Generate OpenAPI Schema"]
        freeze["pip freeze"]
        labels["Apply Labels"]
    end

    yaml --> parse --> validate
    validate --> generator
    compat --> generator
    baseimage --> generator
    wheel --> generator
    generator -->|"Dockerfile"| buildkit
    code --> buildkit
    weights --> buildkit
    buildkit --> image
    image --> schema
    image --> freeze
    schema --> labels
    freeze --> labels
    labels -->|"Final Image"| output["Tagged Image"]

Key Components

1. Config Parsing & Validation

Reads cog.yaml and validates/completes the configuration:

  • Validates Python version (3.10-3.13)
  • Auto-detects CUDA version from PyTorch/TensorFlow requirements
  • Resolves package versions against compatibility matrix
mermaid
flowchart LR
    subgraph input ["cog.yaml (user provides)"]
        direction TB
        i1["gpu#colon; true"]
        i2["python_packages#colon;\n  - torch==2.1.0"]
    end

    subgraph output ["Config (completed)"]
        direction TB
        o1["gpu#colon; true"]
        o2["cuda#colon; "12.1" ← auto-detected"]
        o3["cudnn#colon; "8" ← auto-detected"]
    end

    input --> output
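
A minimal Go sketch of this completion step, assuming a hypothetical Config type and a hard-coded sliver of the matrix; the real logic lives in pkg/config/compatibility.go and is driven by the embedded JSON files:

go
package config

import (
	"fmt"
	"strings"
)

// Config is a simplified stand-in for Cog's completed configuration.
type Config struct {
	GPU   bool
	CUDA  string
	CuDNN string
}

// torchCUDA is a hypothetical, hard-coded sliver of the real
// compatibility matrix (torch_compatibility.json in the actual codebase).
var torchCUDA = map[string]struct{ CUDA, CuDNN string }{
	"2.1.0": {CUDA: "12.1", CuDNN: "8"},
}

// Complete auto-detects CUDA/cuDNN from a "torch==X.Y.Z" requirement
// when GPU is enabled and the user did not pin CUDA explicitly.
func (c *Config) Complete(pythonPackages []string) error {
	if !c.GPU || c.CUDA != "" {
		return nil // nothing to auto-detect
	}
	for _, pkg := range pythonPackages {
		if v, ok := strings.CutPrefix(pkg, "torch=="); ok {
			compat, found := torchCUDA[v]
			if !found {
				return fmt.Errorf("no CUDA mapping for torch==%s", v)
			}
			c.CUDA, c.CuDNN = compat.CUDA, compat.CuDNN
			return nil
		}
	}
	return nil
}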

2. Dockerfile Generator

The generator produces a Dockerfile from the validated config.

Generated Dockerfile Sections

dockerfile
# 1. Base image (cog-base, CUDA, or python-slim)
FROM r8.im/cog-base:cuda12.1-python3.11-torch2.1.0

# 2. System packages
RUN apt-get update && apt-get install -y ffmpeg

# 3. Python packages
RUN pip install -r requirements.txt

# 4. Cog SDK + coglet wheels (resolved at build time)
COPY cog-0.12.0-py3-none-any.whl /tmp/
RUN pip install /tmp/cog-0.12.0-py3-none-any.whl
COPY coglet-0.1.0-cp310-abi3-linux_x86_64.whl /tmp/
RUN pip install /tmp/coglet-0.1.0-cp310-abi3-linux_x86_64.whl

# 5. User run commands
RUN echo "custom setup"

# 6. Copy source
WORKDIR /src
COPY . /src

# 7. Entrypoint
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["python", "-m", "cog.server.http"]

3. Compatibility Matrix

PyTorch and TensorFlow releases are built against specific CUDA/cuDNN versions. The compatibility matrix captures these relationships from upstream release notes.

mermaid
flowchart LR
    subgraph input["User specifies"]
        torch["torch==2.1.0"]
    end

    subgraph matrix["Compatibility Matrix"]
        lookup["torch_compatibility_matrix.json"]
    end

    subgraph output["Cog determines"]
        cuda["CUDA 12.1"]
        cudnn["cuDNN 8"]
        python["Python 3.10-3.13"]
    end

    torch --> lookup
    lookup --> cuda
    lookup --> cudnn
    lookup --> python

Data files (embedded JSON, generated by tools/compatgen/):

  • pkg/config/torch_compatibility.json - PyTorch ↔ CUDA mappings
  • pkg/config/tf_compatibility.json - TensorFlow ↔ CUDA mappings
  • pkg/config/cuda_compatibility.json - CUDA base image mappings

These are regenerated when new framework versions are released and embedded into the CLI binary at build time.

What it stores (for each framework release):

  • Framework version (e.g., torch==2.1.0)
  • Compatible CUDA versions
  • Compatible cuDNN versions
  • Compatible Python versions
  • Package index URLs (for CUDA-specific wheels)
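
As a sketch, one matrix record might deserialize into a Go struct like this; the field and JSON key names are assumptions for illustration, not the exact schema of the embedded JSON files:

go
package config

// TorchCompatibility mirrors the kind of record described above.
// Field and JSON key names are illustrative assumptions.
type TorchCompatibility struct {
	Torch    string   `json:"torch"`     // e.g. "2.1.0"
	CUDA     *string  `json:"cuda"`      // nil for CPU-only builds
	CuDNN    *string  `json:"cudnn"`
	Pythons  []string `json:"pythons"`   // e.g. ["3.10", "3.11", "3.12", "3.13"]
	IndexURL string   `json:"index_url"` // package index for CUDA-specific wheels
}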

4. Base Image Selection

Base image selection uses the compatibility matrix to find a pre-built image that matches the required Python/CUDA/PyTorch combination.

mermaid
flowchart TD
    start["Config has Python + CUDA + Torch versions"] --> lookup{"Matching cog-base
image exists?"}
    lookup -->|"Yes"| cogbase["Use cog-base image
r8.im/cog-base:cuda12.1-python3.11-torch2.1.0"]
    lookup -->|"No"| gpu{"GPU enabled?"}
    gpu -->|"Yes"| cuda["Use NVIDIA CUDA image
nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04
(install Python + Torch in Dockerfile)"]
    gpu -->|"No"| slim["Use Python slim image
python:3.11-slim"]

Cog Base Images

Pre-built images hosted at r8.im/cog-base with Python, CUDA, cuDNN, and PyTorch already installed.

  • Format: r8.im/cog-base:cuda<version>-python<version>-torch<version>
  • Generated from the compatibility matrix (BaseImageConfigurations())
  • Includes common system packages (ffmpeg, git, curl, etc.)
  • Faster builds since heavy dependencies are pre-installed

Fallback: NVIDIA CUDA Images

When no matching cog-base exists (e.g., unusual version combination):

  • Uses official nvidia/cuda images
  • Dockerfile installs Python via pyenv
  • Dockerfile installs PyTorch and other packages via pip
  • Slower builds but supports any valid combination
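
A hedged Go sketch of this three-way decision; cogBaseExists stands in for a lookup against the generated BaseImageConfigurations():

go
package main

import "fmt"

// pickBaseImage sketches the selection flow above. cogBaseExists is a
// hypothetical predicate over the published cog-base tags.
func pickBaseImage(gpu bool, cuda, python, torch string,
	cogBaseExists func(cuda, python, torch string) bool) string {
	if cogBaseExists(cuda, python, torch) {
		return fmt.Sprintf("r8.im/cog-base:cuda%s-python%s-torch%s", cuda, python, torch)
	}
	if gpu {
		// Fall back to the official CUDA image; the generated Dockerfile
		// then installs Python (via pyenv) and PyTorch (via pip) on top.
		return fmt.Sprintf("nvidia/cuda:%s-cudnn8-devel-ubuntu22.04", cuda)
	}
	return fmt.Sprintf("python:%s-slim", python)
}

func main() {
	exists := func(cuda, python, torch string) bool { return false }
	fmt.Println(pickBaseImage(true, "12.1.1", "3.11", "2.1.0", exists))
}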

5. Python Wheels (SDK + Coglet)

Two Python wheels are installed into every Cog image: the Cog SDK (type definitions, predictor base class) and coglet (Rust prediction server with PyO3 bindings). Wheels are resolved at Docker build time; they are not embedded in the Go binary.

Each wheel's source is resolved by trying the following priorities in order:

Cog SDK Wheel

| Priority | Source | Details |
|----------|--------|---------|
| 1 | COG_SDK_WHEEL env var | URL, file path, or pypi:version |
| 2 | build.sdk_version in cog.yaml | Version pin |
| 3 | Auto-detect dist/cog-*.whl | Dev builds only |
| 4 | Default | Install from PyPI |
Coglet Wheel

| Priority | Source | Details |
|----------|--------|---------|
| 1 | COGLET_WHEEL env var | URL or file path |
| 2 | Auto-detect dist/coglet-*.whl | Dev builds only |
| 3 | Default | Install from PyPI |

Local wheel files are copied into .cog/tmp/ in the Docker build context, then COPY'd and pip install'd in the Dockerfile.
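
A minimal Go sketch of the SDK wheel's priority chain; the real resolver lives in pkg/wheels/wheels.go, and the return convention here is invented for illustration:

go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveSDKWheel walks the priority table above and returns a source
// string the generator could turn into COPY/pip-install steps.
func resolveSDKWheel(sdkVersionFromYAML string) string {
	if w := os.Getenv("COG_SDK_WHEEL"); w != "" {
		return w // 1. URL, file path, or pypi:<version>
	}
	if sdkVersionFromYAML != "" {
		return "pypi:" + sdkVersionFromYAML // 2. build.sdk_version pin
	}
	if matches, _ := filepath.Glob("dist/cog-*.whl"); len(matches) > 0 {
		return matches[0] // 3. dev builds only
	}
	return "pypi:latest" // 4. default: install from PyPI
}

func main() {
	fmt.Println(resolveSDKWheel(""))
}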


6. Post-Build: Labels & Schema

After the main build, Cog:

  1. Runs the container to generate OpenAPI schema
  2. Runs pip freeze to capture installed packages
  3. Applies labels with metadata

Image Labels

| Label | Content |
|-------|---------|
| run.cog.version | Cog CLI version |
| run.cog.config | Serialized cog.yaml |
| run.cog.openapi_schema | OpenAPI spec from type hints |
| run.cog.pip_freeze | Installed package versions |

These labels can be fetched from a remote registry or local image store (like containerd) without pulling the full image. This allows tooling - both the Cog CLI during development and production infrastructure - to inspect model metadata and make decisions about how to run a model before booting it.
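
For a local image, for instance, the metadata is one docker inspect away. A small Go example that shells out to the Docker CLI (the image reference is a placeholder):

go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Read one Cog label from a local image without booting the model.
	out, err := exec.Command("docker", "inspect", "--format",
		`{{ index .Config.Labels "run.cog.version" }}`,
		"r8.im/your-user/your-model").Output()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}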


Image Layer Structure

A built Cog image has layers in this order (bottom to top):

mermaid
flowchart TB
    copy["COPY . /src — User code + weights"] --- run
    run["RUN commands (from cog.yaml) — Custom build steps"] --- pip
    pip["pip install (python_packages) — Python dependencies"] --- wheel
    wheel["Cog wheel install — Cog runtime"] --- apt
    apt["apt-get install (system_packages) — System dependencies"] --- tini
    tini["tini init — Process manager"] --- base
    base["Base image (OS, Python, CUDA, cuDNN, PyTorch)\n~5-15 GB for GPU images"]

The base image is by far the largest layer. Using a matching cog-base image means this layer is shared across builds and doesn't need to be re-downloaded or rebuilt.


Code Reference

| Component | Location |
|-----------|----------|
| CLI command | pkg/cli/build.go |
| Build orchestration | pkg/image/build.go |
| Dockerfile generator | pkg/dockerfile/standard_generator.go |
| Base image selection | pkg/dockerfile/base.go |
| Compatibility matrix | pkg/config/compatibility.go |
| Wheel resolution | pkg/wheels/wheels.go |
| Label definitions | pkg/docker/command/manifest.go |