KubeRay

KubeRay provides a Kubernetes-native way to run vLLM workloads on Ray clusters. A Ray cluster can be declared in YAML, and the operator then handles pod scheduling, networking configuration, restarts, and blue-green deployments, all while preserving the familiar Kubernetes experience.
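To make the "declared in YAML" point concrete, here is a minimal sketch of a RayCluster manifest. It assumes the KubeRay operator (with the `ray.io/v1` API) is already installed in the cluster. The resource names, the container image, the Ray version, and the GPU count are all illustrative placeholders rather than values taken from the vLLM or KubeRay documentation.

```yaml
# Hypothetical RayCluster manifest; names, image, and sizes are assumptions.
apiVersion: ray.io/v1
kind: RayCluster
metadata:
  name: vllm-raycluster              # hypothetical cluster name
spec:
  rayVersion: "2.46.0"               # placeholder; should match the Ray version in the image
  headGroupSpec:
    rayStartParams: {}
    template:
      spec:
        containers:
          - name: ray-head
            image: example.com/ray-vllm:latest   # placeholder image containing Ray and vLLM
  workerGroupSpecs:
    - groupName: gpu-workers         # hypothetical worker group name
      replicas: 1
      minReplicas: 1
      maxReplicas: 2
      rayStartParams: {}
      template:
        spec:
          containers:
            - name: ray-worker
              image: example.com/ray-vllm:latest # same placeholder image as the head
              resources:
                limits:
                  nvidia.com/gpu: 1  # assumes the NVIDIA device plugin exposes GPUs
```

Saved as cluster.yaml, a manifest like this could be created or updated with kubectl apply -f cluster.yaml; the operator then creates the head and worker pods from the two pod templates.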

Why KubeRay instead of manual scripts?

| Feature | Manual scripts | KubeRay |
|---|---|---|
| Cluster bootstrap | Manually SSH into every node and run a script | One command to create or update the whole cluster: `kubectl apply -f cluster.yaml` |
| Autoscaling | Manual | Automatically patches CRDs for adjusting cluster size |
| Upgrades | Tear down & re-create manually | Blue/green deployment updates supported |
| Declarative config | Bash flags & environment variables | Git-ops-friendly YAML CRDs (RayCluster/RayService) |

Using KubeRay reduces the operational burden and simplifies integration of Ray + vLLM with existing Kubernetes workflows (CI/CD, secrets, storage classes, etc.).

Learn more