Pod Annotations

Kata Containers lets users customize behavior per pod by setting Kata-specific annotations in the pod specification.

Some annotations may be restricted by the configuration file for security reasons, notably annotations that could lead the runtime to execute programs on the host. Such annotations are marked with (R) in the tables below.

Kata Configuration Annotations

There are several kinds of Kata configurations and they are listed below.

Global Options

| Key | Value Type | Comments |
|---|---|---|
| io.katacontainers.config_path | string | Kata config file location that overrides the default config paths |
| io.katacontainers.pkg.oci.bundle_path | string | OCI bundle path |
| io.katacontainers.pkg.oci.container_type | string | OCI container type. Only accepts pod_container and pod_sandbox |

Runtime Options

| Key | Value Type | Comments |
|---|---|---|
| io.katacontainers.config.runtime.experimental | boolean | determines if experimental features are enabled |
| io.katacontainers.config.runtime.disable_guest_seccomp | boolean | determines if seccomp should be applied inside the guest |
| io.katacontainers.config.runtime.disable_new_netns | boolean | determines if a new netns is created for the hypervisor process |
| io.katacontainers.config.runtime.internetworking_model | string | determines how the VM should be connected to the container network interface. Valid values are macvtap, tcfilter and none |
| io.katacontainers.config.runtime.sandbox_cgroup_only | boolean | determines if Kata processes are managed only in the sandbox cgroup |
| io.katacontainers.config.runtime.enable_pprof | boolean | enables Golang pprof for the containerd-shim-kata-v2 process |
| io.katacontainers.config.runtime.create_container_timeout | uint64 | the timeout for creating a container in seconds, default is 60 |
| io.katacontainers.config.runtime.experimental_force_guest_pull | boolean | forces the runtime to pull the image in the guest VM, default is false. This is an experimental feature and might be removed in the future. |

Agent Options

| Key | Value Type | Comments |
|---|---|---|
| io.katacontainers.config.agent.enable_tracing | boolean | enable tracing for the agent |
| io.katacontainers.config.agent.container_pipe_size | uint32 | specify the size of the std(in/out) pipes created for containers |
| io.katacontainers.config.agent.kernel_modules | string | semicolon-separated list of kernel modules and their parameters. These modules will be loaded in the guest kernel using modprobe(8). E.g., e1000e InterruptThrottleRate=3000,3000,3000 EEE=1; i915 enable_ppgtt=0 |
| io.katacontainers.config.agent.cdh_api_timeout | uint32 | timeout in seconds for the Confidential Data Hub (CDH) API service, default is 50 |

Hypervisor Options

Hypervisor annotations must be explicitly whitelisted in the Kata runtime config. Example:

```toml
# List of valid annotation names for the hypervisor
enable_annotations = ["enable_iommu", "virtio_fs_extra_args", "kernel_params"]
```

| Key | Value Type | Comments |
|---|---|---|
| io.katacontainers.config.hypervisor.asset_hash_type | string | the hash type used for assets verification, default is sha512 |
| io.katacontainers.config.hypervisor.block_device_cache_direct | boolean | denotes whether use of O_DIRECT (bypass the host page cache) is enabled |
| io.katacontainers.config.hypervisor.block_device_cache_noflush | boolean | denotes whether flush requests for the device are ignored |
| io.katacontainers.config.hypervisor.block_device_cache_set | boolean | specify whether cache-related options are set for block devices |
| io.katacontainers.config.hypervisor.block_device_driver | string | the driver to be used for block devices, valid values are virtio-blk, virtio-scsi, nvdimm |
| io.katacontainers.config.hypervisor.blk_logical_sector_size | uint32 | logical sector size in bytes reported by block devices to the guest (0 = hypervisor default, must be a power of 2 between 512 and 65536) |
| io.katacontainers.config.hypervisor.blk_physical_sector_size | uint32 | physical sector size in bytes reported by block devices to the guest (0 = hypervisor default, must be a power of 2 between 512 and 65536) |
| io.katacontainers.config.hypervisor.cpu_features | string | comma-separated list of CPU features to pass to the CPU (QEMU) |
| io.katacontainers.config.hypervisor.default_max_vcpus | uint32 | the maximum number of vCPUs allocated for the VM by the hypervisor |
| io.katacontainers.config.hypervisor.default_memory | uint32 | the memory assigned for a VM by the hypervisor in MiB |
| io.katacontainers.config.hypervisor.default_vcpus | float32 | the default vCPUs assigned for a VM by the hypervisor |
| io.katacontainers.config.hypervisor.disable_block_device_use | boolean | disable hotplugging host block devices to guest VMs for container rootfs |
| io.katacontainers.config.hypervisor.disable_image_nvdimm | boolean | specify if an nvdimm device should not be used as rootfs for the guest (QEMU) |
| io.katacontainers.config.hypervisor.disable_vhost_net | boolean | specify if vhost-net is not available on the host |
| io.katacontainers.config.hypervisor.enable_hugepages | boolean | specify if the memory should be pre-allocated from huge pages |
| io.katacontainers.config.hypervisor.enable_iommu_platform | boolean | enable iommu on CCW devices (QEMU s390x) |
| io.katacontainers.config.hypervisor.enable_iommu | boolean | enable iommu on Q35 (QEMU x86_64) |
| io.katacontainers.config.hypervisor.enable_iothreads | boolean | enable IO to be processed in a separate thread. Currently supported for the virtio-scsi driver |
| io.katacontainers.config.hypervisor.enable_mem_prealloc | boolean | specify if the VM memory should be preallocated |
| io.katacontainers.config.hypervisor.enable_vhost_user_store | boolean | enable vhost-user storage device (QEMU) |
| io.katacontainers.config.hypervisor.vhost_user_reconnect_timeout_sec | string | the timeout for reconnecting the vhost-user socket (QEMU) |
| io.katacontainers.config.hypervisor.enable_virtio_mem | boolean | enable virtio-mem (QEMU) |
| io.katacontainers.config.hypervisor.entropy_source (R) | string | the path to a host source of entropy (/dev/random, /dev/urandom or real hardware RNG device) |
| io.katacontainers.config.hypervisor.file_mem_backend (R) | string | file based memory backend root directory |
| io.katacontainers.config.hypervisor.firmware_hash | string | container firmware SHA-512 hash value |
| io.katacontainers.config.hypervisor.firmware | string | the guest firmware that will run the container VM |
| io.katacontainers.config.hypervisor.firmware_volume_hash | string | container firmware volume SHA-512 hash value |
| io.katacontainers.config.hypervisor.firmware_volume | string | the guest firmware volume that will be passed to the container VM |
| io.katacontainers.config.hypervisor.guest_hook_path | string | the path within the VM that will be used for drop-in hooks |
| io.katacontainers.config.hypervisor.hotplug_vfio_on_root_bus | boolean | indicate if devices need to be hotplugged on the root bus instead of a bridge |
| io.katacontainers.config.hypervisor.hypervisor_hash | string | container hypervisor binary SHA-512 hash value |
| io.katacontainers.config.hypervisor.image_hash | string | container guest image SHA-512 hash value |
| io.katacontainers.config.hypervisor.image | string | the guest image that will run in the container VM |
| io.katacontainers.config.hypervisor.initrd_hash | string | container guest initrd SHA-512 hash value |
| io.katacontainers.config.hypervisor.initrd | string | the guest initrd image that will run in the container VM |
| io.katacontainers.config.hypervisor.jailer_hash | string | container jailer SHA-512 hash value |
| io.katacontainers.config.hypervisor.jailer_path (R) | string | the jailer that will constrain the container VM |
| io.katacontainers.config.hypervisor.kernel_hash | string | container kernel image SHA-512 hash value |
| io.katacontainers.config.hypervisor.kernel_params | string | additional guest kernel parameters |
| io.katacontainers.config.hypervisor.kernel | string | the kernel used to boot the container VM |
| io.katacontainers.config.hypervisor.machine_accelerators | string | machine specific accelerators for the hypervisor |
| io.katacontainers.config.hypervisor.machine_type | string | the type of machine being emulated by the hypervisor |
| io.katacontainers.config.hypervisor.memory_offset | uint64 | the memory space used for nvdimm device by the hypervisor |
| io.katacontainers.config.hypervisor.memory_slots | uint32 | the memory slots assigned to the VM by the hypervisor |
| io.katacontainers.config.hypervisor.msize_9p | uint32 | the msize for 9p shares |
| io.katacontainers.config.hypervisor.path | string | the hypervisor that will run the container VM. The path must be whitelisted in the runtime configuration's valid_hypervisor_paths parameter. |
| io.katacontainers.config.hypervisor.pcie_root_port | | specify the number of PCIe Root Port devices. The PCIe Root Port device is used to hot-plug a PCIe device (QEMU) |
| io.katacontainers.config.hypervisor.shared_fs | string | the shared file system type, either virtio-9p or virtio-fs |
| io.katacontainers.config.hypervisor.use_vsock | boolean | specify use of vsock for agent communication |
| io.katacontainers.config.hypervisor.vhost_user_store_path (R) | string | specify the directory path where vhost-user device related folders, sockets and device nodes should be (QEMU) |
| io.katacontainers.config.hypervisor.virtio_fs_cache_size | uint32 | virtio-fs DAX cache size in MiB |
| io.katacontainers.config.hypervisor.virtio_fs_cache | string | the cache mode for virtio-fs, valid values are always, auto and never |
| io.katacontainers.config.hypervisor.virtio_fs_daemon | string | virtio-fs vhost-user daemon path |
| io.katacontainers.config.hypervisor.virtio_fs_extra_args | string | extra options passed to the virtiofs daemon |
| io.katacontainers.config.hypervisor.enable_guest_swap | boolean | enable swap in the guest |
| io.katacontainers.config.hypervisor.use_legacy_serial | boolean | uses legacy serial device for guest's console (QEMU) |
| io.katacontainers.config.hypervisor.default_gpus | uint32 | the minimum number of GPUs required for the VM. Only used by remote hypervisor to help with instance selection |
| io.katacontainers.config.hypervisor.default_gpu_model | string | the GPU model required for the VM. Only used by remote hypervisor to help with instance selection |
| io.katacontainers.config.hypervisor.block_device_num_queues | usize | the number of queues to use for block devices (runtime-rs only) |
| io.katacontainers.config.hypervisor.block_device_queue_size | uint32 | the size of the queue to use for block devices (runtime-rs only) |
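
The sector size constraint stated for blk_logical_sector_size and blk_physical_sector_size (0, or a power of 2 between 512 and 65536) can be checked before a pod is submitted. The following Python sketch illustrates that rule only; the function name is hypothetical and this is not the runtime's own validation code.

```python
def is_valid_sector_size(size: int) -> bool:
    """Check a candidate value for the blk_*_sector_size annotations.

    Hypothetical helper: accepts 0 (hypervisor default) or a power of two
    in the inclusive range 512..65536, as described in the table.
    """
    if size == 0:
        return True
    # A positive integer is a power of two when exactly one bit is set.
    is_power_of_two = size > 0 and (size & (size - 1)) == 0
    return is_power_of_two and 512 <= size <= 65536
```

For instance, 4096 passes the check, while 1000 (not a power of 2) and 256 (below the minimum) do not.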

Container Options

| Key | Value Type | Comments |
|---|---|---|
| io.katacontainers.container.resource.swappiness | uint64 | specify the Resources.Memory.Swappiness |
| io.katacontainers.container.resource.swap_in_bytes | uint64 | specify the Resources.Memory.Swap |

containerd Configuration

For containerd, annotations specified in the pod spec are passed down to Kata starting with containerd version 1.3.0. containerd also needs extra configuration: the pod_annotations and container_annotations fields in the containerd config file. Each field is a list of annotations that can be passed down to Kata as OCI annotations, and both support Go match patterns. Since annotations supported by Kata follow the pattern io.katacontainers.*, the following configuration passes annotations from containerd to Kata:

$ cat /etc/containerd/config
....

         [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.kata]
           runtime_type = "io.containerd.kata.v2"
           pod_annotations = ["io.katacontainers.*"]
           container_annotations = ["io.katacontainers.*"]
....

Additional documentation on the above configuration can be found in the containerd docs.
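
To see which annotation keys a pattern list such as the one above would let through, the matching can be approximated in Python. This is a rough sketch, assuming that Python's fnmatch behaves like the Go match patterns for keys without path separators; the function name is hypothetical and containerd's actual matching code is not reproduced here.

```python
from fnmatch import fnmatchcase


def passed_to_kata(annotation_key: str, patterns: list[str]) -> bool:
    """Return True if the key matches at least one configured pattern.

    Approximation only: fnmatchcase stands in for the Go pattern matching
    that containerd applies to pod_annotations and container_annotations.
    """
    return any(fnmatchcase(annotation_key, pattern) for pattern in patterns)


# Same pattern list as in the containerd configuration above.
patterns = ["io.katacontainers.*"]
kata_key_passes = passed_to_kata(
    "io.katacontainers.config.agent.enable_tracing", patterns
)
other_key_passes = passed_to_kata("example.com/owner", patterns)
```

Under these assumptions, kata_key_passes is true and other_key_passes is false, so only the Kata annotation would reach the runtime.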

Example

Not all containers need the same kernel modules, so using the configuration file to specify the list of kernel modules for each pod is impractical. Unlike the configuration file, annotations provide a way to specify custom configurations per pod.

The list of kernel modules and parameters can be set using the annotation io.katacontainers.config.agent.kernel_modules as a semicolon separated list, where the first word of each element is considered as the module name and the rest as its parameters.
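
The format described above (semicolon-separated elements, first word as the module name, remaining words as parameters) can be illustrated with a short Python sketch. The function name is hypothetical, and this is an illustration of the format rather than the agent's actual parser.

```python
def parse_kernel_modules(value: str) -> list[tuple[str, list[str]]]:
    """Split a kernel_modules annotation value into (name, parameters) pairs."""
    modules = []
    for element in value.split(";"):
        words = element.split()
        if not words:
            # Skip empty elements, e.g. from a trailing semicolon.
            continue
        # First word is the module name, the rest are its parameters.
        modules.append((words[0], words[1:]))
    return modules


modules = parse_kernel_modules(
    "e1000e InterruptThrottleRate=3000,3000,3000 EEE=1; i915 enable_ppgtt=0"
)
```

Here modules holds two entries: e1000e with the parameters InterruptThrottleRate=3000,3000,3000 and EEE=1, and i915 with the parameter enable_ppgtt=0.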

Users might also want to enable guest seccomp for better isolation at a small performance cost. The annotation io.katacontainers.config.runtime.disable_guest_seccomp can be used for this purpose.

In the following example, two pods are created, but the kernel modules e1000e and i915 are inserted only in pod1, and guest seccomp is enabled only in pod2.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod1
  annotations:
    io.katacontainers.config.agent.kernel_modules: "e1000e EEE=1; i915"
spec:
  runtimeClassName: kata
  containers:
  - name: c1
    image: busybox
    command:
      - sh
    stdin: true
    tty: true

---
apiVersion: v1
kind: Pod
metadata:
  name: pod2
  annotations:
    io.katacontainers.config.runtime.disable_guest_seccomp: "false"
spec:
  runtimeClassName: kata
  containers:
  - name: c2
    image: busybox
    command:
      - sh
    stdin: true
    tty: true
```
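
Kubernetes annotation values are always strings, which is why the boolean in pod2 is written as "false" in quotes. When pod specs are generated programmatically, typed option values need to be converted first. The following Python sketch shows one way to do that; both helper names are hypothetical.

```python
def to_annotation_value(value) -> str:
    """Render a typed option value as an annotation string."""
    # str(True) would give "True", so booleans are handled explicitly to
    # produce the lowercase form used in the example above.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def kata_annotations(options: dict) -> dict[str, str]:
    """Convert a mapping of annotation keys to typed values into strings."""
    return {key: to_annotation_value(value) for key, value in options.items()}


pod2_annotations = kata_annotations({
    "io.katacontainers.config.runtime.disable_guest_seccomp": False,
    "io.katacontainers.config.runtime.create_container_timeout": 120,
})
```

The resulting dictionary can be placed under metadata.annotations of a generated pod manifest.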

Restricted annotations

Some annotations are restricted, meaning that the configuration file specifies the acceptable values. Currently, only hypervisor annotations are restricted, for security reasons, with the intent to control which binaries the Kata Containers runtime will launch on your behalf.

The configuration file validates the annotation name as well as the annotation value.

The acceptable annotation names are defined by the enable_annotations entry in the configuration file.

For restricted annotations, an additional configuration entry provides a list of acceptable values. Since most restricted annotations are intended to control which binaries the runtime can execute, the valid value is generally provided by a shell pattern, as defined by glob(3). The table below provides the name of the configuration entry:

| Key | Config file entry | Comments |
|---|---|---|
| entropy_source | valid_entropy_sources | Valid entropy sources, e.g. /dev/random |
| file_mem_backend | valid_file_mem_backends | Valid locations for the file-based memory backend root directory |
| jailer_path | valid_jailer_paths | Valid paths for the jailer constraining the container VM (Firecracker) |
| path | valid_hypervisor_paths | Valid hypervisors to run the container VM |
| vhost_user_store_path | valid_vhost_user_store_paths | Valid paths for vhost-user related files |
| virtio_fs_daemon | valid_virtio_fs_daemon_paths | Valid paths for the virtiofsd daemon |
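
The two-step check described in this section (annotation name against enable_annotations, then value against the matching valid_* entry) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the runtime's implementation: the config dictionary and function name are hypothetical, names are compared exactly, and fnmatch only approximates glob(3) shell patterns (for example, its * also matches /).

```python
from fnmatch import fnmatchcase

HYPERVISOR_PREFIX = "io.katacontainers.config.hypervisor."

# Restricted annotation name -> configuration entry listing acceptable values,
# taken from the table above.
VALID_VALUE_ENTRIES = {
    "entropy_source": "valid_entropy_sources",
    "file_mem_backend": "valid_file_mem_backends",
    "jailer_path": "valid_jailer_paths",
    "path": "valid_hypervisor_paths",
    "vhost_user_store_path": "valid_vhost_user_store_paths",
    "virtio_fs_daemon": "valid_virtio_fs_daemon_paths",
}


def is_annotation_accepted(key: str, value: str, config: dict) -> bool:
    """Apply the name check, then the value check, to one annotation."""
    if not key.startswith(HYPERVISOR_PREFIX):
        return True  # only hypervisor annotations are restricted
    name = key[len(HYPERVISOR_PREFIX):]
    # Step 1: the annotation name must be listed in enable_annotations.
    if name not in config.get("enable_annotations", []):
        return False
    # Step 2: restricted annotations must also match an acceptable value.
    entry = VALID_VALUE_ENTRIES.get(name)
    if entry is None:
        return True
    return any(fnmatchcase(value, pattern) for pattern in config.get(entry, []))


# Hypothetical configuration values, for illustration only.
example_config = {
    "enable_annotations": ["enable_iommu", "entropy_source"],
    "valid_entropy_sources": ["/dev/random", "/dev/urandom"],
}
```

With example_config, the entropy_source annotation is accepted for /dev/urandom but rejected for any path outside valid_entropy_sources, and kernel_params is rejected because it is not listed in enable_annotations.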