k8s/charts/quickstart/README.md
A Helm chart for running Daft data processing workloads on Kubernetes, with support for both simple single-node execution and distributed Ray-based processing.
Run a quick Daft job using the native runner:
# Create a simple Python script
cat > my_script.py <<'EOF'
# /// script
# dependencies = ["daft"]
# ///
import daft
df = daft.from_pydict({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})
df.filter(df["age"] > 25).show()
EOF
# Deploy
helm install my-job oci://ghcr.io/eventual-inc/daft/quickstart \
  --set-file job.script=my_script.py
# View logs
kubectl logs -f job/my-job-quickstart-job
# Cleanup
helm uninstall my-job
Run a Daft job on a Ray cluster with multiple workers:
# Create a simple Python script with Ray
cat > my_script.py <<'EOF'
# /// script
# dependencies = ["daft", "ray[client]==2.46.0"]
# ///
import daft
import ray
ray.init(runtime_env={"pip": ["daft"]})
df = daft.from_pydict({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})
df.filter(df["age"] > 25).show()
EOF
# Deploy with Ray cluster
helm install distributed-job oci://ghcr.io/eventual-inc/daft/quickstart \
  --set distributed=true \
  --set worker.replicas=3 \
  --set-file job.script=my_script.py
# Access Ray Dashboard
kubectl port-forward service/distributed-job-quickstart-head 8265:8265
# Open http://localhost:8265
# Cleanup
helm uninstall distributed-job
This chart also supports running Daft jobs from custom images, which is useful when a job requires more complex dependencies.
helm install custom-image-job oci://ghcr.io/eventual-inc/daft/quickstart \
  --set image=my-image:tag \
  --set "job.command={python,my_script.py}"
See values.yaml for detailed descriptions of all supported parameters.
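For repeatable deployments, the `--set` flags from the distributed example can be collected into a values file. The following is a minimal sketch: the file name `distributed-values.yaml` is hypothetical, and it uses only the `distributed` and `worker.replicas` keys that appear in the examples above. Check values.yaml for the authoritative key names and defaults.

```shell
# Write an override file equivalent to:
#   --set distributed=true --set worker.replicas=3
# (distributed-values.yaml is a hypothetical file name)
cat > distributed-values.yaml <<'EOF'
distributed: true
worker:
  replicas: 3
EOF

# Print the file to confirm its contents
cat distributed-values.yaml
```

The file can then be passed to `helm install` with `-f distributed-values.yaml`, alongside `--set-file job.script=my_script.py`.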
# Get job status
kubectl get jobs -l app.kubernetes.io/instance=my-release
# View job logs
kubectl logs -f job/my-release-quickstart-job
# Check pod status
kubectl get pods -l app.kubernetes.io/instance=my-release
# Port forward to access both Ray Dashboard and Grafana
kubectl port-forward service/my-release-quickstart-head 8265:8265 3000:3000
# Open http://localhost:8265 in browser
# Grafana panels will be embedded directly in the Ray Dashboard
# Optionally open http://localhost:3000 to access Grafana directly
# Default credentials: admin/admin
# Get cluster status
kubectl exec deployment/my-release-quickstart-head -- ray status
# View cluster resources
kubectl exec deployment/my-release-quickstart-head -- python -c "import ray; ray.init(); print(ray.cluster_resources())"
# Scale workers using kubectl
kubectl scale deployment my-release-quickstart-worker --replicas=10
# Check job events
kubectl describe job my-release-quickstart-job
# Check pod events
kubectl describe pod -l app.kubernetes.io/component=job
# View pod logs
kubectl logs -l app.kubernetes.io/component=job
# Check head service
kubectl get service -l ray.io/node-type=head
# Check head logs
kubectl logs deployment/my-release-quickstart-head
# Check worker logs
kubectl logs deployment/my-release-quickstart-worker
If pods run out of memory, increase the memory limits in your values:
job:
  resources:
    limits:
      memory: "16Gi"  # Increase as needed
head:
  resources:
    limits:
      memory: "32Gi"  # Increase as needed
worker:
  resources:
    limits:
      memory: "32Gi"  # Increase as needed
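Ensure you pass the script with --set-file (not --set). --set-file reads the file's contents into job.script, whereas --set stores the literal string script.py: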
# Correct
helm install my-job oci://ghcr.io/eventual-inc/daft/quickstart --set-file job.script=script.py
# Incorrect
helm install my-job oci://ghcr.io/eventual-inc/daft/quickstart --set job.script=script.py
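The difference between the two flags can be illustrated without a cluster. The sketch below does not call Helm. It only mimics what each flag would store in `job.script`, using the hypothetical shell variables `set_value` and `set_file_value`.

```shell
# Create a small script to pass to the chart
cat > script.py <<'EOF'
# /// script
# dependencies = ["daft"]
# ///
import daft
daft.from_pydict({"x": [1, 2, 3]}).show()
EOF

# --set job.script=script.py stores the literal string after the "="
set_value="script.py"

# --set-file job.script=script.py stores the contents of the file instead
set_file_value="$(cat script.py)"

echo "With --set, job.script would be: $set_value"
echo "With --set-file, job.script would be:"
echo "$set_file_value"
```

With `--set`, the chart receives only a file name that does not exist inside the job's container, so the script itself is never delivered.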
Your custom image must include ray[default]:
# requirements.txt
daft
ray[default]
helm uninstall my-release
This removes all Kubernetes resources created by the chart.
Apache 2.0
Contributions are welcome! Please see the Daft repository for contribution guidelines.