Build System

The build system transforms Model Source (cog.yaml + predict.py + weights) into a production-ready OCI image containing the Container Runtime.

Build Flow

mermaid
flowchart TB
    subgraph input["Inputs"]
        yaml["cog.yaml"]
        code["predict.py"]
        weights["weights"]
    end

    subgraph cli["CLI (pkg/cli/build.go)"]
        parse["Parse Config"]
        validate["Validate"]
    end

    subgraph generate["Dockerfile Generation (pkg/dockerfile/)"]
        generator["Generator"]
        baseimage["Base Image Selection"]
        compat["Compatibility Matrix"]
        wheel["Python Wheels
(SDK + Coglet)"]
    end

    subgraph docker["Docker Build"]
        buildkit["Buildkit"]
        image["Container Image"]
    end

    subgraph post["Post-Build"]
        schema["Generate OpenAPI Schema"]
        freeze["pip freeze"]
        labels["Apply Labels"]
    end

    yaml --> parse --> validate
    validate --> generator
    compat --> generator
    baseimage --> generator
    wheel --> generator
    generator -->|"Dockerfile"| buildkit
    code --> buildkit
    weights --> buildkit
    buildkit --> image
    image --> schema
    image --> freeze
    schema --> labels
    freeze --> labels
    labels -->|"Final Image"| output["Tagged Image"]

Key Components

1. Config Parsing & Validation

Reads cog.yaml and validates/completes the configuration:

  • Validates Python version (3.10-3.13)
  • Auto-detects CUDA version from PyTorch/TensorFlow requirements
  • Resolves package versions against compatibility matrix
mermaid
flowchart LR
    subgraph input ["cog.yaml (user provides)"]
        direction TB
        i1["gpu#colon; true"]
        i2["python_packages#colon;\n  - torch==2.1.0"]
    end

    subgraph output ["Config (completed)"]
        direction TB
        o1["gpu#colon; true"]
        o2["cuda#colon; "12.1" ← auto-detected"]
        o3["cudnn#colon; "8" ← auto-detected"]
    end

    input --> output
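
A minimal Go sketch of this completion step, assuming a hypothetical Config type and a hard-coded sliver of the matrix; the real logic lives in pkg/config/compatibility.go and is driven by the embedded JSON files:

go
package config

import (
	"fmt"
	"strings"
)

// Config is a simplified stand-in for Cog's completed configuration.
type Config struct {
	GPU   bool
	CUDA  string
	CuDNN string
}

// torchCUDA is a hypothetical, hard-coded sliver of the real
// compatibility matrix (torch_compatibility.json in the actual codebase).
var torchCUDA = map[string]struct{ CUDA, CuDNN string }{
	"2.1.0": {CUDA: "12.1", CuDNN: "8"},
}

// Complete auto-detects CUDA/cuDNN from a "torch==X.Y.Z" requirement
// when GPU is enabled and the user did not pin CUDA explicitly.
func (c *Config) Complete(pythonPackages []string) error {
	if !c.GPU || c.CUDA != "" {
		return nil // nothing to auto-detect
	}
	for _, pkg := range pythonPackages {
		if v, ok := strings.CutPrefix(pkg, "torch=="); ok {
			compat, found := torchCUDA[v]
			if !found {
				return fmt.Errorf("no CUDA mapping for torch==%s", v)
			}
			c.CUDA, c.CuDNN = compat.CUDA, compat.CuDNN
			return nil
		}
	}
	return nil
}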

2. Dockerfile Generator

The generator produces a Dockerfile from the validated config.

Generated Dockerfile Sections

dockerfile
# 1. Base image (cog-base, CUDA, or python-slim)
FROM r8.im/cog-base:cuda12.1-python3.11-torch2.1.0

# 2. System packages
RUN apt-get update && apt-get install -y ffmpeg

# 3. Python packages
RUN pip install -r requirements.txt

# 4. Cog SDK + coglet wheels (resolved at build time)
COPY cog-0.12.0-py3-none-any.whl /tmp/
RUN pip install /tmp/cog-0.12.0-py3-none-any.whl
COPY coglet-0.1.0-cp310-abi3-linux_x86_64.whl /tmp/
RUN pip install /tmp/coglet-0.1.0-cp310-abi3-linux_x86_64.whl

# 5. User run commands
RUN echo "custom setup"

# 6. Copy source
WORKDIR /src
COPY . /src

# 7. Entrypoint
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["python", "-m", "cog.server.http"]

3. Compatibility Matrix

PyTorch and TensorFlow releases are built against specific CUDA/cuDNN versions. The compatibility matrix captures these relationships from upstream release notes.

mermaid
flowchart LR
    subgraph input["User specifies"]
        torch["torch==2.1.0"]
    end

    subgraph matrix["Compatibility Matrix"]
        lookup["torch_compatibility_matrix.json"]
    end

    subgraph output["Cog determines"]
        cuda["CUDA 12.1"]
        cudnn["cuDNN 8"]
        python["Python 3.10-3.13"]
    end

    torch --> lookup
    lookup --> cuda
    lookup --> cudnn
    lookup --> python

Data files (embedded JSON, generated by tools/compatgen/):

  • pkg/config/torch_compatibility.json - PyTorch ↔ CUDA mappings
  • pkg/config/tf_compatibility.json - TensorFlow ↔ CUDA mappings
  • pkg/config/cuda_compatibility.json - CUDA base image mappings

These are regenerated when new framework versions are released and embedded into the CLI binary at build time.

What it stores (for each framework release):

  • Framework version (e.g., torch==2.1.0)
  • Compatible CUDA versions
  • Compatible cuDNN versions
  • Compatible Python versions
  • Package index URLs (for CUDA-specific wheels)
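
As a sketch, one matrix record might deserialize into a Go struct like this; the field and JSON key names are assumptions for illustration, not the exact schema of the embedded JSON files:

go
package config

// TorchCompatibility mirrors the kind of record described above.
// Field and JSON key names are illustrative assumptions.
type TorchCompatibility struct {
	Torch    string   `json:"torch"`     // e.g. "2.1.0"
	CUDA     *string  `json:"cuda"`      // nil for CPU-only builds
	CuDNN    *string  `json:"cudnn"`
	Pythons  []string `json:"pythons"`   // e.g. ["3.10", "3.11", "3.12", "3.13"]
	IndexURL string   `json:"index_url"` // package index for CUDA-specific wheels
}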

4. Base Image Selection

Base image selection uses the compatibility matrix to find a pre-built image that matches the required Python/CUDA/PyTorch combination.

mermaid
flowchart TD
    start["Config has Python + CUDA + Torch versions"] --> lookup{"Matching cog-base
image exists?"}
    lookup -->|"Yes"| cogbase["Use cog-base image
r8.im/cog-base:cuda12.1-python3.11-torch2.1.0"]
    lookup -->|"No"| gpu{"GPU enabled?"}
    gpu -->|"Yes"| cuda["Use NVIDIA CUDA image
nvidia/cuda:12.1.1-cudnn8-devel-ubuntu22.04
(install Python + Torch in Dockerfile)"]
    gpu -->|"No"| slim["Use Python slim image
python:3.11-slim"]

Cog Base Images

Pre-built images hosted at r8.im/cog-base with Python, CUDA, cuDNN, and PyTorch already installed.

  • Format: r8.im/cog-base:cuda<version>-python<version>-torch<version>
  • Generated from the compatibility matrix (BaseImageConfigurations())
  • Includes common system packages (ffmpeg, git, curl, etc.)
  • Faster builds since heavy dependencies are pre-installed

Fallback: NVIDIA CUDA Images

When no matching cog-base exists (e.g., unusual version combination):

  • Uses official nvidia/cuda images
  • Dockerfile installs Python via pyenv
  • Dockerfile installs PyTorch and other packages via pip
  • Slower builds but supports any valid combination
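
A hedged Go sketch of this three-way decision; cogBaseExists stands in for a lookup against the generated BaseImageConfigurations():

go
package main

import "fmt"

// pickBaseImage sketches the selection flow above. cogBaseExists is a
// hypothetical predicate over the published cog-base tags.
func pickBaseImage(gpu bool, cuda, python, torch string,
	cogBaseExists func(cuda, python, torch string) bool) string {
	if cogBaseExists(cuda, python, torch) {
		return fmt.Sprintf("r8.im/cog-base:cuda%s-python%s-torch%s", cuda, python, torch)
	}
	if gpu {
		// Fall back to the official CUDA image; the generated Dockerfile
		// then installs Python (via pyenv) and PyTorch (via pip) on top.
		return fmt.Sprintf("nvidia/cuda:%s-cudnn8-devel-ubuntu22.04", cuda)
	}
	return fmt.Sprintf("python:%s-slim", python)
}

func main() {
	exists := func(cuda, python, torch string) bool { return false }
	fmt.Println(pickBaseImage(true, "12.1.1", "3.11", "2.1.0", exists))
}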

5. Python Wheels (SDK + Coglet)

Two Python wheels are installed into every Cog image: the Cog SDK (type definitions, predictor base class) and coglet (Rust prediction server with PyO3 bindings). Wheels are resolved at Docker build time; they are not embedded in the Go binary.

Each wheel's source is resolved by trying the following priorities in order:

Cog SDK Wheel

| Priority | Source | Details |
|----------|--------|---------|
| 1 | COG_SDK_WHEEL env var | URL, file path, or pypi:version |
| 2 | build.sdk_version in cog.yaml | Version pin |
| 3 | Auto-detect dist/cog-*.whl | Dev builds only |
| 4 | Default | Install from PyPI |
Coglet Wheel

| Priority | Source | Details |
|----------|--------|---------|
| 1 | COGLET_WHEEL env var | URL or file path |
| 2 | Auto-detect dist/coglet-*.whl | Dev builds only |
| 3 | Default | Install from PyPI |

Local wheel files are copied into .cog/tmp/ in the Docker build context, then COPY'd and pip install'd in the Dockerfile.
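
A minimal Go sketch of the SDK wheel's priority chain; the real resolver lives in pkg/wheels/wheels.go, and the return convention here is invented for illustration:

go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveSDKWheel walks the priority table above and returns a source
// string the generator could turn into COPY/pip-install steps.
func resolveSDKWheel(sdkVersionFromYAML string) string {
	if w := os.Getenv("COG_SDK_WHEEL"); w != "" {
		return w // 1. URL, file path, or pypi:<version>
	}
	if sdkVersionFromYAML != "" {
		return "pypi:" + sdkVersionFromYAML // 2. build.sdk_version pin
	}
	if matches, _ := filepath.Glob("dist/cog-*.whl"); len(matches) > 0 {
		return matches[0] // 3. dev builds only
	}
	return "pypi:latest" // 4. default: install from PyPI
}

func main() {
	fmt.Println(resolveSDKWheel(""))
}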


6. Post-Build: Labels & Schema

After the main build, Cog:

  1. Runs the container to generate OpenAPI schema
  2. Runs pip freeze to capture installed packages
  3. Applies labels with metadata

Image Labels

| Label | Content |
|-------|---------|
| run.cog.version | Cog CLI version |
| run.cog.config | Serialized cog.yaml |
| run.cog.openapi_schema | OpenAPI spec from type hints |
| run.cog.pip_freeze | Installed package versions |

These labels can be fetched from a remote registry or local image store (like containerd) without pulling the full image. This allows tooling - both the Cog CLI during development and production infrastructure - to inspect model metadata and make decisions about how to run a model before booting it.
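
For a local image, for instance, the metadata is one docker inspect away. A small Go example that shells out to the Docker CLI (the image reference is a placeholder):

go
package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Read one Cog label from a local image without booting the model.
	out, err := exec.Command("docker", "inspect", "--format",
		`{{ index .Config.Labels "run.cog.version" }}`,
		"r8.im/your-user/your-model").Output()
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}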


Image Layer Structure

A built Cog image has layers in this order (bottom to top):

mermaid
flowchart TB
    copy["COPY . /src — User code + weights"] --- run
    run["RUN commands (from cog.yaml) — Custom build steps"] --- pip
    pip["pip install (python_packages) — Python dependencies"] --- wheel
    wheel["Cog wheel install — Cog runtime"] --- apt
    apt["apt-get install (system_packages) — System dependencies"] --- tini
    tini["tini init — Process manager"] --- base
    base["Base image (OS, Python, CUDA, cuDNN, PyTorch)\n~5-15 GB for GPU images"]

The base image is by far the largest layer. Using a matching cog-base image means this layer is shared across builds and doesn't need to be re-downloaded or rebuilt.


Code Reference

| Component | Location |
|-----------|----------|
| CLI command | pkg/cli/build.go |
| Build orchestration | pkg/image/build.go |
| Dockerfile generator | pkg/dockerfile/standard_generator.go |
| Base image selection | pkg/dockerfile/base.go |
| Compatibility matrix | pkg/config/compatibility.go |
| Wheel resolution | pkg/wheels/wheels.go |
| Label definitions | pkg/docker/command/manifest.go |