# nvidia-device-plugin kOps Hook
This kOps hook container may be used to enable nodes with GPUs to work with Kubernetes. It is targeted specifically at AWS GPU instance types.

It installs the following from web sources. Using this hook indicates that you agree to the NVIDIA license.

- `nvidia-device-plugin.service`, along with setup scripts. The service runs and executes the setup scripts in the host directory `/nvidia-device-plugin`.

Although this hook may work across many combinations of software versions and images, it has only been tested with the following:

- kope.io/k8s-1.10-debian-stretch-amd64-hvm-ebs-2018-05-27
- kope.io/k8s-1.15-debian-stretch-amd64-hvm-ebs-2020-01-17
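To spot-check that the hook actually ran on a node, the commands below can be run on a GPU node over SSH. The service and directory names come from the description above; `nvidia-smi` is the standard NVIDIA driver utility.

```sh
systemctl status nvidia-device-plugin.service   # unit installed by the hook
ls /nvidia-device-plugin                        # host directory holding the setup scripts
nvidia-smi                                      # should list the GPU once the driver is installed
```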
Note that Debian jessie images are not supported: nvidia-docker requires docker-ce >= 18.0, which is not available in the Debian jessie package repository. In addition, the Debian jessie kernel was compiled with gcc-7 while the system packages install gcc-4, which makes the NVIDIA driver compilation fail.

This kOps hook was developed against the following version combinations.
| kOps Version | Kubernetes Version | GPU Mode | OS Image |
|---|---|---|---|
| 1.15.0 | 1.15.5 | deviceplugin | kope.io/k8s-1.15-debian-stretch-amd64-hvm-ebs-2020-01-17 |
| 1.10-beta.1 | 1.10 | deviceplugin | kope.io/k8s-1.10-debian-stretch-amd64-hvm-ebs-2018-05-27 |
| 1.9.1 | 1.11 | deviceplugin | kope.io/k8s-1.10-debian-stretch-amd64-hvm-ebs-2018-05-27 |
| 1.9.1 | 1.10 | legacy | kope.io/k8s-1.10-debian-stretch-amd64-hvm-ebs-2018-05-27 |
To self-host the docker image, build and push it to your own registry:

```sh
DOCKER_REGISTRY=<your-registry> make image push
```

Then create a cluster with GPU nodes, for example:

```sh
kops create cluster \
  --name gpu.example.k8s.local \
  --zones us-east-1c \
  --node-size p2.xlarge \
  --node-count 1 \
  --image kope.io/k8s-1.15-debian-stretch-amd64-hvm-ebs-2020-01-17 \
  --kubernetes-version 1.15.5
```
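Optionally, confirm that the cluster is up and the nodes are ready before configuring the hook:

```sh
kops validate cluster --name gpu.example.k8s.local
```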
Configuring the hook should be safe to do for all machines, because the hook auto-detects whether the machine is an AWS GPU instance type and will no-op otherwise (the detection idea is sketched below).
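The following is a minimal sketch of that auto-detection idea, assuming the EC2 instance metadata service and a hard-coded list of GPU instance families; the hook's actual logic and instance-type list may differ.

```sh
#!/bin/sh
# Sketch only: exit early (no-op) unless the instance type is a known AWS GPU family.
INSTANCE_TYPE=$(curl -s http://169.254.169.254/latest/meta-data/instance-type)
case "${INSTANCE_TYPE}" in
  p2.*|p3.*|g2.*|g3.*) echo "GPU instance detected: ${INSTANCE_TYPE}" ;;
  *) echo "Not an AWS GPU instance type; exiting without changes."; exit 0 ;;
esac
```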
Choose between the DevicePlugin GPU Mode and the Legacy Accelerators GPU Mode, described below.

## DevicePlugin GPU Mode

This mode is for Kubernetes >= 1.11.0, or for clusters supporting DevicePlugins.
```yaml
# > kops edit instancegroup nodes

# CUDA 10.0
spec:
  image: kope.io/k8s-1.15-debian-stretch-amd64-hvm-ebs-2020-01-17
  hooks:
  - execContainer:
      # Replace DOCKER_REGISTRY with the registry used to host the image
      image: DOCKER_REGISTRY/nvidia-device-plugin:0.2.0-cuda10.0

# CUDA 9.1
spec:
  image: kope.io/k8s-1.10-debian-stretch-amd64-hvm-ebs-2018-05-27
  hooks:
  - execContainer:
      image: dcwangmit01/nvidia-device-plugin:0.1.0

  ### The settings below are only necessary for Kubernetes <= 1.11.0, where
  ### DevicePlugins are not enabled by default.
  # kubelet:
  #   featureGates:
  #     # Enable DevicePlugins
  #     DevicePlugins: "true"
  #     # Disable Accelerators (may interfere with DevicePlugins)
  #     Accelerators: "false"
```
## Legacy Accelerators GPU Mode

The Legacy Accelerators GPU mode is equivalent to the original GPU hook. Note that Accelerators are deprecated in Kubernetes >= 1.11.0.
```yaml
# > kops edit instancegroup nodes
spec:
  image: kope.io/k8s-1.10-debian-stretch-amd64-hvm-ebs-2018-05-27
  hooks:
  - execContainer:
      image: dcwangmit01/nvidia-device-plugin:0.1.0
      environment:
        NVIDIA_DEVICE_PLUGIN_MODE: legacy
  kubelet:
    featureGates:
      # Disable DevicePlugins (may interfere with Accelerators)
      DevicePlugins: "false"
      # Enable Accelerators
      Accelerators: "true"
```
Apply the configuration changes and roll the nodes:

```sh
kops update cluster gpu.example.k8s.local --yes
kops rolling-update cluster gpu.example.k8s.local --yes
```
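After the rolling update completes, GPU capacity should appear under each GPU node's allocatable resources. A quick check (the resource keys are the ones used in the pod examples below; the DevicePlugin key only shows up once the daemonset from the next step is running):

```sh
# DevicePlugin GPU Mode (requires the daemonset installed below):
kubectl get nodes -o jsonpath="{.items[*].status.allocatable['nvidia\.com/gpu']}"

# Legacy Accelerators GPU Mode:
kubectl get nodes -o jsonpath="{.items[*].status.allocatable['alpha\.kubernetes\.io/nvidia-gpu']}"
```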
For the DevicePlugin GPU Mode only, load the DevicePlugin daemonset for your specific environment. This is not required for the Legacy Accelerators GPU Mode.
```sh
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta5/nvidia-device-plugin.yml

# (Optional) Set a permissive toleration so the daemonset may run anywhere,
# in case you have tainted your GPU nodes.
kubectl patch daemonset nvidia-device-plugin-daemonset --namespace kube-system \
  -p '{ "spec": { "template": { "spec": { "tolerations": [ { "operator": "Exists" } ] } } } }'
```
Run a test pod that requests a GPU:

```sh
# CUDA 10.0
cat << EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: tf-gpu
spec:
  containers:
  - name: gpu
    image: tensorflow/tensorflow:2.0.1-gpu
    command: [ "/bin/bash", "-ce", "tail -f /dev/null" ]
    # ^ From 2.0.0 onwards, CMD is not set
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        memory: 1024Mi
        nvidia.com/gpu: 1 # requesting 1 GPU
EOF
```
```sh
# CUDA 9.1
cat << EOF | kubectl create -f -
apiVersion: v1
kind: Pod
metadata:
  name: tf-gpu
spec:
  containers:
  - name: gpu
    image: tensorflow/tensorflow:1.9.0-gpu
    imagePullPolicy: IfNotPresent
    resources:
      limits:
        memory: 1024Mi
        # ^ Set memory in case default limits are set low
        nvidia.com/gpu: 1 # requesting 1 GPU
        # ^ For Legacy Accelerators mode this key must be renamed
        #   'alpha.kubernetes.io/nvidia-gpu'
  tolerations:
  # This toleration will allow the pod to run anywhere.
  # By default this is permissive in case you have tainted your GPU nodes.
  - operator: "Exists"
  # ^ If you have a specific taint to target, comment out the above and
  #   modify the example below.
  ### Example tolerations
  # - key: "dedicated"
  #   operator: "Equal"
  #   value: "gpu"
  #   effect: "NoExecute"
EOF
```
```sh
# Check that nodes are detected to have GPUs
kubectl describe nodes | grep -E 'gpu:\s.*[1-9]'

# Check the logs of the TensorFlow container to ensure that it ran
kubectl logs tf-gpu  # no output since 2.0.0

# Show GPU info from within the pod (DevicePlugin mode only)
kubectl exec -it tf-gpu -- nvidia-smi

# Show that TensorFlow detects GPUs from within the pod (DevicePlugin mode only)
kubectl exec -it tf-gpu -- \
  python -c 'from tensorflow.python.client import device_lib; print(device_lib.list_local_devices())'
```
Finally, clean up:

```sh
kubectl delete pod/tf-gpu
kubectl delete -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/1.0.0-beta5/nvidia-device-plugin.yml
kops delete cluster --name gpu.example.k8s.local --yes
```