Docker - Ollama — ContextQMD

CPU only

```shell
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```
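
To confirm that the server is reachable once the container has started, a small polling helper can be handy. The sketch below is not part of Ollama: `wait_for_ollama` is a hypothetical name, and it assumes `curl` is installed on the host and that port 11434 is published on localhost as in the command above.

```shell
# Polls the published port until the server answers or the attempts run out.
# $1: base URL (default http://localhost:11434), $2: number of attempts (default 30).
# Note: url and tries are plain shell variables, not function-local ones.
wait_for_ollama() {
  url="${1:-http://localhost:11434}"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # -f makes curl fail on HTTP errors; output is discarded, only the exit status matters.
    if curl -fsS "$url" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_ollama && echo "Ollama is up"` waits roughly 30 seconds at most before giving up.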

NVIDIA GPU

Install the NVIDIA Container Toolkit.

Install with Apt

Configure the repository

```shell
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey \
    | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list \
    | sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' \
    | sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
```

Install the NVIDIA Container Toolkit packages

```shell
sudo apt-get install -y nvidia-container-toolkit
```

Install with Yum or Dnf

Configure the repository

```shell
curl -fsSL https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo \
    | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
```

Install the NVIDIA Container Toolkit packages

```shell
sudo yum install -y nvidia-container-toolkit
```

Configure Docker to use the NVIDIA driver

```shell
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```
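
As a quick check that the configuration took effect, you can inspect the runtimes that Docker reports. This is a minimal sketch, assuming the toolkit registers a runtime named `nvidia`; both function names are hypothetical.

```shell
# Reads Docker's runtime map (JSON) on stdin and succeeds if it has an "nvidia" entry.
runtimes_include_nvidia() {
  grep -q '"nvidia"'
}

# Succeeds if the local Docker daemon reports an "nvidia" runtime.
# Fails if Docker is not installed or the daemon is unreachable.
docker_has_nvidia_runtime() {
  docker info --format '{{json .Runtimes}}' 2>/dev/null | runtimes_include_nvidia
}
```

For example, `docker_has_nvidia_runtime && echo "nvidia runtime registered"` prints a message only when the runtime is present.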

Start the container

```shell
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

<Note> If you're running on an NVIDIA JetPack system, Ollama can't automatically discover the correct JetPack version. Pass the environment variable `JETSON_JETPACK=5` or `JETSON_JETPACK=6` to the container to select version 5 or 6. </Note>
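
For example, the variable can be passed with Docker's `-e` flag. The sketch below assumes a JetPack 6 system and reuses the GPU command from the previous step unchanged; adjust the value to `5` on JetPack 5.

```shell
# Same command as above, with the JetPack version set explicitly (assumed here to be 6).
docker run -d --gpus=all -e JETSON_JETPACK=6 \
    -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```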

AMD GPU

To run Ollama in Docker on AMD GPUs, use the `rocm` image tag with the following command:

```shell
docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm
```

Vulkan Support

Vulkan is bundled into the ollama/ollama image and is enabled by default when the container can access the GPU devices.

```shell
docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

Use `OLLAMA_VULKAN=0` to disable Vulkan, or `GGML_VK_VISIBLE_DEVICES=<ids>` to select specific Vulkan devices.
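
Both settings can be passed to the container with Docker's `-e` flag. The following is a sketch only: it reuses the device flags from the command above, and the value `0` assumes that device ids are zero-based indices, which you should confirm on your system.

```shell
# Option 1: disable Vulkan entirely.
docker run -d --device /dev/kfd --device /dev/dri -e OLLAMA_VULKAN=0 \
    -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# Option 2: expose only one Vulkan device (assumed here to be id 0).
docker run -d --device /dev/kfd --device /dev/dri -e GGML_VK_VISIBLE_DEVICES=0 \
    -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
```

Run one or the other, not both: each command creates a container named `ollama`, so the second would fail while the first container exists.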

Run model locally

Now you can run a model:

```shell
docker exec -it ollama ollama run llama3.2
```
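
Because port 11434 is published, the model can also be queried from the host over Ollama's HTTP API. The sketch below wraps a request to the `/api/generate` endpoint in a hypothetical helper, `ollama_generate`. It assumes `curl` is installed, the container is running, and the model has already been pulled. The prompt is inserted into the JSON body as-is, so it must not contain double quotes or backslashes.

```shell
# Sends one non-streaming prompt to the Ollama HTTP API and prints the JSON reply.
# $1: model name, $2: prompt text (no double quotes or backslashes).
ollama_generate() {
  curl -s http://localhost:11434/api/generate \
    -d "{\"model\": \"$1\", \"prompt\": \"$2\", \"stream\": false}"
}
```

For example, `ollama_generate llama3.2 "Why is the sky blue?"` returns a single JSON object rather than a stream of partial responses.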

Try different models

More models can be found in the Ollama library.