(kuberay-config)=

# RayCluster Configuration

This guide covers the key aspects of Ray cluster configuration on Kubernetes.

## Introduction

Deployments of Ray on Kubernetes follow the [operator pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/). The key players are:

- A custom resource called a `RayCluster` describing the desired state of a Ray cluster.
- A custom controller, the KubeRay operator, which manages Ray pods in order to match the `RayCluster`'s spec.

To deploy a Ray cluster, one creates a `RayCluster` custom resource (CR):
```shell
kubectl apply -f raycluster.yaml
```
This guide covers the salient features of `RayCluster` CR configuration.

For reference, here is a condensed example of a `RayCluster` CR in YAML format.
```yaml
apiVersion: ray.io/v1alpha1
kind: RayCluster
metadata:
  name: raycluster-complete
spec:
  rayVersion: "2.3.0"
  enableInTreeAutoscaling: true
  autoscalerOptions:
    ...
  headGroupSpec:
    serviceType: ClusterIP # Options are ClusterIP, NodePort, and LoadBalancer
    rayStartParams:
      dashboard-host: "0.0.0.0"
      ...
    template: # Pod template
      metadata: # Pod metadata
      spec: # Pod spec
        containers:
        - name: ray-head
          image: rayproject/ray-ml:2.3.0
          resources:
            limits:
              cpu: 14
              memory: 54Gi
            requests:
              cpu: 14
              memory: 54Gi
          ports: # Optional service port overrides
          - containerPort: 6379
            name: gcs
          - containerPort: 8265
            name: dashboard
          - containerPort: 10001
            name: client
          - containerPort: 8000
            name: serve
          ...
  workerGroupSpecs:
  - groupName: small-group
    replicas: 1
    minReplicas: 1
    maxReplicas: 5
    rayStartParams:
      ...
    template: # Pod template
      spec:
        ...
  # Another workerGroup
  - groupName: medium-group
    ...
  # Yet another workerGroup, with access to special hardware perhaps.
  - groupName: gpu-group
    ...
```
The rest of this guide will discuss the `RayCluster` CR's config fields.
See also the {ref}`guide <kuberay-autoscaling-config>` on configuring Ray autoscaling with KubeRay.
(kuberay-config-ray-version)=
## The Ray version

The field `rayVersion` specifies the version of Ray used in the Ray cluster.
The `rayVersion` is used to fill default values for certain config fields.
The Ray container images specified in the `RayCluster` CR should carry
the same Ray version as the CR's `rayVersion`. If you are using a nightly or development
Ray image, it is fine to set `rayVersion` to the latest release version of Ray.
## Pod configuration: headGroupSpec and workerGroupSpecs

At a high level, a `RayCluster` is a collection of Kubernetes pods, similar to a Kubernetes Deployment or StatefulSet. Just as with the Kubernetes built-ins, the key pieces of configuration are:

- Pod specification
- Scale information (how many pods are desired)

The key difference between a Deployment and a `RayCluster` is that a `RayCluster` is
specialized for running Ray applications. A Ray cluster consists of:

- One **head pod** which hosts global control processes for the Ray cluster.
  The head pod can also run Ray tasks and actors.
- Any number of **worker pods**, which run Ray tasks and actors.

The head pod's configuration is
specified under `headGroupSpec`, while configuration for worker pods is
specified under `workerGroupSpecs`. There may be multiple worker groups,
each group with its own configuration. The `replicas` field
of a `workerGroupSpec` specifies the number of worker pods of that group to
keep in the cluster. Each `workerGroupSpec` also has optional `minReplicas` and
`maxReplicas` fields; these fields are important if you wish to enable {ref}`autoscaling <kuberay-autoscaling-config>`.
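
For instance, here is a minimal sketch of a worker group's scale fields; the group name and replica counts are illustrative:

```yaml
workerGroupSpecs:
- groupName: example-group  # illustrative name
  replicas: 2               # number of worker pods to keep in the cluster
  minReplicas: 1            # lower bound used when autoscaling is enabled
  maxReplicas: 10           # upper bound used when autoscaling is enabled
```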
### Pod templates

The bulk of the configuration for a `headGroupSpec` or
`workerGroupSpec` goes in the `template` field. The `template` is a Kubernetes Pod
template which determines the configuration for the pods in the group.
Here are some of the subfields of the pod template to pay attention to:
#### containers

A Ray pod template specifies at minimum one container, namely the container
that runs the Ray processes. A Ray pod template may also specify additional sidecar
containers, for purposes such as {ref}`log processing <persist-kuberay-custom-resource-logs>`. However, the KubeRay operator assumes that
the first container in the `containers` list is the main Ray container.
Therefore, make sure to specify any sidecar containers
**after** the main Ray container. In other words, the Ray container should be the **first**
in the `containers` list.
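
For instance, a head group with a sidecar might be laid out as follows; the sidecar's name and image are purely illustrative:

```yaml
headGroupSpec:
  template:
    spec:
      containers:
      # The main Ray container must come first.
      - name: ray-head
        image: rayproject/ray:2.3.0
      # Sidecar containers, such as a log shipper, follow.
      - name: log-sidecar        # illustrative sidecar name
        image: fluent/fluent-bit # illustrative image
```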
#### resources

It's important to specify container CPU and memory resources for each group spec. Since CPU is a compressible resource, you may want to set only CPU requests and not limits: this guarantees your workloads a minimum amount of CPU while allowing them to use idle CPU without being throttled when they exceed their request.
For GPU workloads, you may also wish to specify GPU
limits. For example, set `nvidia.com/gpu: 2` if using an NVIDIA GPU device plugin
and you wish to specify a pod with access to 2 GPUs.
See {ref}`this guide <kuberay-gpu>` for more details on GPU support.
KubeRay automatically configures Ray to use the CPU, memory, and GPU limits in the Ray container
config. These values are the logical resource capacities of Ray pods in the head or
worker group. As of KubeRay 1.3.0, KubeRay uses the CPU request if the limit is absent.
KubeRay rounds up CPU quantities to the nearest integer. You can override these resource capacities
with {ref}`rayStartParams`. KubeRay ignores memory and GPU requests, so
set memory and GPU resource requests equal to their limits when possible.
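
As a sketch of these recommendations (the quantities below are illustrative), a Ray container might set a CPU request with no CPU limit, and set memory and GPU requests equal to their limits:

```yaml
containers:
- name: ray-worker
  image: rayproject/ray:2.3.0
  resources:
    requests:
      cpu: 4              # CPU request only; no CPU limit, so the pod isn't throttled
      memory: 16Gi        # memory request equals the memory limit
      nvidia.com/gpu: 1   # GPU request equals the GPU limit
    limits:
      memory: 16Gi        # KubeRay reads this as the pod's logical memory capacity
      nvidia.com/gpu: 1   # KubeRay reads this as the pod's logical GPU capacity
```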
It's ideal to size each Ray pod to take up the entire Kubernetes node. In other words, it's best to run one large Ray pod per Kubernetes node. In general, it's more efficient to use a few large Ray pods than many small ones. The pattern of fewer large Ray pods has the following advantages:

- more efficient use of each Ray pod's shared memory object store
- reduced communication overhead between Ray pods
- reduced redundancy of per-pod Ray control structures such as Raylets
#### nodeSelector and tolerations

You can control the scheduling of worker groups' Ray pods by setting the `nodeSelector` and
`tolerations` fields of the pod spec. Specifically, these fields determine on which Kubernetes
nodes the pods may be scheduled.
See the [Kubernetes docs](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/)
for more about Pod-to-Node assignment.
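
For example, here is a minimal sketch that pins a GPU worker group to nodes carrying a hypothetical label and tolerates a matching taint; the label and taint keys are illustrative:

```yaml
workerGroupSpecs:
- groupName: gpu-group
  template:
    spec:
      nodeSelector:
        gpu-node: "true"   # illustrative node label
      tolerations:
      - key: "gpu"         # illustrative taint key
        operator: "Equal"
        value: "present"
        effect: "NoSchedule"
```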
#### image

The Ray container images specified in the `RayCluster` CR should carry
the same Ray version as the CR's `spec.rayVersion`.
If you are using a nightly or development Ray image, you can specify Ray's
latest release version under `spec.rayVersion`.

For Apple M1 or M2 MacBooks, see *Use ARM-based docker images for Apple M1 or M2 MacBooks* to specify the correct image.
You must install code dependencies for a given Ray task or actor on each Ray node that
might run the task or actor.
The simplest way to achieve this configuration is to use the same Ray image for the Ray head and all worker groups.
In any case, do make sure that all Ray images in your CR carry the same Ray version and
Python version.
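
For instance, here is a minimal sketch using one image for the head and a worker group; the custom image name is illustrative:

```yaml
spec:
  rayVersion: "2.3.0"
  headGroupSpec:
    template:
      spec:
        containers:
        - name: ray-head
          image: my-registry/my-ray-image:2.3.0  # illustrative custom image
  workerGroupSpecs:
  - groupName: small-group
    template:
      spec:
        containers:
        - name: ray-worker
          image: my-registry/my-ray-image:2.3.0  # same image as the head
```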
To distribute custom code dependencies across your cluster, you can build a custom container image,
using one of the official Ray images as the base.
See {ref}`this guide <docker-images>` to learn more about the official Ray images.
For dynamic dependency management geared towards iteration and development,
you can also use {ref}`Runtime Environments <runtime-environments>`.
For `kuberay-operator` versions 1.1.0 and later, the Ray container image must have `wget` installed in it.
#### metadata.name and metadata.generateName

The KubeRay operator will ignore the values of `metadata.name` and `metadata.generateName` set by users.
The KubeRay operator will generate a `generateName` automatically to avoid name conflicts.
See [KubeRay issue #587](https://github.com/ray-project/kuberay/issues/587) for more details.
(rayStartParams)=
## Ray Start Parameters

The `rayStartParams` field of each group spec is a string-string map of arguments to the Ray
container's `ray start` entrypoint. For the full list of arguments, refer to
the documentation for {ref}`ray start <ray-start-doc>`. The RayCluster
Custom Resource Definition (CRD) in KubeRay versions before 1.4.0 required this field, but the value could be
an empty map. As of KubeRay 1.4.0, `rayStartParams` is optional.
Note the following arguments:
### dashboard-host

For most use-cases, this field should be set to "0.0.0.0" for the Ray head pod. This is required to expose the Ray dashboard outside the Ray cluster. (Future versions might set this parameter by default.)
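
A minimal sketch of this setting under the head group:

```yaml
headGroupSpec:
  rayStartParams:
    dashboard-host: "0.0.0.0"  # expose the dashboard outside the Ray cluster
```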
(kuberay-num-cpus)=
### num-cpus

This optional field tells the Ray scheduler and autoscaler how many CPUs are
available to the Ray pod. The CPU count can be autodetected from the
Kubernetes resource limits specified in the group spec's pod
template. However, it is sometimes useful to override this autodetected
value. For example, setting `num-cpus: "0"` for the Ray head pod will prevent Ray
workloads with non-zero CPU requirements from being scheduled on the head.
Note that the values of all Ray start parameters, including `num-cpus`,
must be supplied as strings.
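
For example, a sketch of a head group that advertises zero CPUs to the Ray scheduler:

```yaml
headGroupSpec:
  rayStartParams:
    num-cpus: "0"  # string value; keeps Ray workloads off the head pod
```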
### num-gpus

This field specifies the number of GPUs available to the Ray container.
In future KubeRay versions, the number of GPUs will be auto-detected from Ray container resource limits.
Note that the values of all Ray start parameters, including `num-gpus`,
must be supplied as strings.
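
For instance, a sketch of a GPU worker group's override:

```yaml
workerGroupSpecs:
- groupName: gpu-group
  rayStartParams:
    num-gpus: "2"  # string value; advertises 2 GPUs to the Ray scheduler
```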
### memory

The memory available to Ray is detected automatically from the Kubernetes resource
limits. If you wish, you may override this autodetected value by setting the desired memory value,
in bytes, under `rayStartParams.memory`.
Note that the values of all Ray start parameters, including `memory`,
must be supplied as strings.
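
For instance, a sketch that overrides the detected memory capacity; the quantity is illustrative:

```yaml
rayStartParams:
  memory: "17179869184"  # 16 GiB expressed in bytes, supplied as a string
```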
### resources

This field can be used to specify custom resource capacities for the Ray pod.
These resource capacities will be advertised to the Ray scheduler and Ray autoscaler.
For example, the following annotation will mark a Ray pod as having 1 unit of `Custom1` capacity
and 5 units of `Custom2` capacity.

```yaml
rayStartParams:
  resources: '"{\"Custom1\": 1, \"Custom2\": 5}"'
```

You can then annotate tasks and actors with annotations like `@ray.remote(resources={"Custom2": 1})`.
The Ray scheduler and autoscaler will take appropriate action to schedule such tasks.

Note the format used to express the resources string. In particular, note
that the backslashes are present as actual characters in the string.
If you are specifying a `RayCluster` programmatically, you may have to
escape the backslashes to make sure they are processed as part of the string.

The field `rayStartParams.resources` should only be used for custom resources. The keys
`CPU`, `GPU`, and `memory` are forbidden. If you need to specify overrides for those resource
fields, use the Ray start parameters `num-cpus`, `num-gpus`, or `memory`.
(kuberay-networking)=
## Services and networking

### The Ray head service

The KubeRay operator automatically configures a Kubernetes Service exposing the default ports
for several services of the Ray head pod, including:

- Ray GCS server (default port 6379)
- Ray dashboard (default port 8265)
- Ray Client (default port 10001)
- Ray Serve (default port 8000)

The name of the configured Kubernetes Service is the name, `metadata.name`, of the RayCluster
followed by the suffix `head-svc`. For the example CR given on this page, the name of
the head service will be
`raycluster-complete-head-svc`. Kubernetes networking (`kube-dns`) then allows us to address
the Ray head's services using the name `raycluster-complete-head-svc`.
For example, the Ray Client server can be accessed from a pod
in the same Kubernetes namespace using

```python
ray.init("ray://raycluster-complete-head-svc:10001")
```

The Ray Client server can be accessed from a pod in another namespace using

```python
ray.init("ray://raycluster-complete-head-svc.default.svc.cluster.local:10001")
```

(This assumes the Ray cluster was deployed into the default Kubernetes namespace.
If the Ray cluster is deployed in a non-default namespace, use that namespace in
place of `default`.)
If you wish to override the ports exposed by the Ray head service, you may do so by specifying
the Ray head container's `ports` list, under `headGroupSpec`.
Here is an example of a list of non-default ports for the Ray head service.

```yaml
ports:
- containerPort: 6380
  name: gcs
- containerPort: 8266
  name: dashboard
- containerPort: 10002
  name: client
```
If the head container's `ports` list is specified, the Ray head service will expose precisely
the ports in the list. In the above example, the head service will expose just three ports;
in particular there will be no port exposed for Ray Serve.

For the Ray head to actually use the non-default ports specified in the `ports` list,
you must also specify the relevant `rayStartParams`. For the above example,

```yaml
rayStartParams:
  port: "6380"
  dashboard-port: "8266"
  ray-client-server-port: "10002"
  ...
```