charts/README.md
A production-ready Helm chart for deploying MLflow on Kubernetes.
namespace_rbac) and cluster-scoped (cluster_rbac) rules
mlflow gc
helm install mlflow ./charts --namespace mlflow --create-namespace
With custom values:
helm install mlflow ./charts \
  --namespace mlflow \
  --create-namespace \
  -f my-values.yaml
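Before installing with custom values, it can help to lint the chart and render its manifests locally. The sketch below assumes the same release name, namespace, and values file as the command above; preview_release is a hypothetical helper name, not something the chart provides.

```shell
# Lint the chart and render its manifests locally without touching the cluster.
# preview_release is a hypothetical helper; my-values.yaml is the values file
# used in the install command.
# Usage: preview_release [values_file]
preview_release() {
  values_file=${1:-my-values.yaml}
  # helm template only runs if helm lint succeeds.
  helm lint ./charts -f "$values_file" &&
    helm template mlflow ./charts --namespace mlflow -f "$values_file"
}
```

Running preview_release my-values.yaml prints the rendered manifests, which makes typos in value names easier to spot than after a failed install.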
The simplest way to get a shared MLflow instance running on a cluster requires no external database or object store: MLflow stores metadata in SQLite and artifacts on a PersistentVolumeClaim.
helm install mlflow ./charts \
  --namespace mlflow \
  --create-namespace \
  --set storage.enabled=true \
  --set mlflow.backendStoreUri="sqlite:////mlflow/mlflow.db" \
  --set mlflow.defaultArtifactRoot="/mlflow/artifacts"
Access the UI via port-forward:
kubectl port-forward -n mlflow svc/mlflow-mlflow 5000:5000
Then open http://localhost:5000 in your browser.
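To confirm the server is answering before opening the browser, a small polling helper can be useful. This is a minimal sketch, assuming the port-forward above is running and that the server exposes MLflow's /health endpoint; wait_for_mlflow and its defaults are illustrative and not part of the chart.

```shell
# Poll an MLflow server until its /health endpoint answers, or give up.
# wait_for_mlflow is a hypothetical helper, not something the chart ships.
# Usage: wait_for_mlflow [base_url] [attempts] [delay_seconds]
wait_for_mlflow() {
  base_url=${1:-http://localhost:5000}
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s silences progress output.
    if curl -fs -o /dev/null "$base_url/health"; then
      echo "MLflow is reachable at $base_url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "MLflow did not respond at $base_url after $attempts attempts" >&2
  return 1
}
```

Called without arguments, it retries against http://localhost:5000 for about a minute.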
Note: SQLite and local file storage are not suitable for production or high-concurrency use. For production deployments, see the Backend store and Artifact store sections below.
See values.yaml for the full list of configurable parameters.
Common scenarios are described below.
Inline URI (password visible in values — use only for development):
mlflow:
  backendStoreUri: "postgresql://user:password@postgres:5432/mlflow"
Read from a Kubernetes Secret (recommended for production):
kubectl create secret generic mlflow-db-secret \
  --namespace mlflow \
  --from-literal=uri="postgresql://user:password@postgres:5432/mlflow"
mlflow:
  backendStoreUriFrom:
    secretKeyRef:
      name: mlflow-db-secret
      key: uri
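Passwords containing characters such as @, : or / need to be percent-encoded before they are embedded in a database URI, otherwise the URI is parsed incorrectly. The following is a minimal sketch for ASCII passwords; urlencode and the sample password are illustrative assumptions, not part of the chart.

```shell
# Percent-encode a string so it can be embedded in a database URI.
# urlencode is a hypothetical helper; it assumes ASCII input.
# Usage: urlencode <string>
urlencode() (
  # The function body runs in a subshell, so these variables do not leak.
  LC_ALL=C
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}        # everything after the first character
    c=${s%"$rest"}     # the first character itself
    s=$rest
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;                 # unreserved: keep as-is
      *) out="$out$(printf '%%%02X' "'$c")" ;;         # everything else: %XX
    esac
  done
  printf '%s' "$out"
)

# Example (not executed here):
#   DB_PASSWORD='p@ss:w/rd'
#   kubectl create secret generic mlflow-db-secret \
#     --namespace mlflow \
#     --from-literal=uri="postgresql://user:$(urlencode "$DB_PASSWORD")@postgres:5432/mlflow"
```

Only the password (and, if needed, the user name) should be encoded; encoding the whole URI would also escape the separators that give it structure.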
mlflow:
  defaultArtifactRoot: "s3://my-bucket/mlflow"
  artifactsDestination: "s3://my-bucket/mlflow"
env:
  - name: AWS_ACCESS_KEY_ID
    valueFrom:
      secretKeyRef:
        name: s3-credentials
        key: access-key-id
  - name: AWS_SECRET_ACCESS_KEY
    valueFrom:
      secretKeyRef:
        name: s3-credentials
        key: secret-access-key
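The values above reference a Secret named s3-credentials that has to exist before the pod starts. One way to create it is sketched below, assuming the release lives in the mlflow namespace as in the install commands; create_s3_secret is a hypothetical helper name.

```shell
# Create the Secret referenced by the AWS_* env entries.
# create_s3_secret is a hypothetical helper; the Secret name and key names
# match the values snippet, and the mlflow namespace matches the install command.
# Usage: create_s3_secret <access-key-id> <secret-access-key>
create_s3_secret() {
  if [ "$#" -ne 2 ]; then
    echo "usage: create_s3_secret <access-key-id> <secret-access-key>" >&2
    return 2
  fi
  kubectl create secret generic s3-credentials \
    --namespace mlflow \
    --from-literal=access-key-id="$1" \
    --from-literal=secret-access-key="$2"
}
```

Passing credentials as arguments leaves them in shell history, so reading them from environment variables or files may be preferable outside of quick tests.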
storage:
  enabled: true
  size: 10Gi
mlflow:
  backendStoreUri: "sqlite:////mlflow/mlflow.db"
  defaultArtifactRoot: "/mlflow/artifacts"
kubectl create secret tls mlflow-tls \
  --namespace mlflow \
  --cert=tls.crt \
  --key=tls.key
tls:
  enabled: true
  secretName: mlflow-tls
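If no certificate is available yet, a throwaway self-signed pair is enough to exercise the TLS configuration in a test cluster. This is a sketch under stated assumptions: make_test_cert is a hypothetical helper, the -addext option needs OpenSSL 1.1.1 or newer, and the output file names are chosen to match the kubectl create secret tls command.

```shell
# Generate a throwaway self-signed certificate for testing TLS.
# make_test_cert is a hypothetical helper; it writes tls.crt and tls.key
# into the current directory. Not suitable for production use.
# Usage: make_test_cert <hostname>
make_test_cert() {
  if [ "$#" -ne 1 ]; then
    echo "usage: make_test_cert <hostname>" >&2
    return 2
  fi
  # -nodes leaves the private key unencrypted so Kubernetes can load it.
  openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout tls.key -out tls.crt \
    -days 30 \
    -subj "/CN=$1" \
    -addext "subjectAltName=DNS:$1"
}
```

For example, make_test_cert mlflow.example.com produces the two files in the current directory. Browsers will warn about the certificate because it is not signed by a trusted authority.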
MLflow's host-validation middleware only allows localhost and private-IP hosts by default.
When exposing MLflow through an Ingress with a public hostname, set allowed_hosts to match
that hostname, otherwise requests will be rejected with HTTP 403.
server:
  value_options:
    allowed_hosts: "mlflow.example.com"
ingress:
  enabled: true
  className: nginx
  hosts:
    - host: mlflow.example.com
      paths:
        - path: /
          pathType: Prefix
  tls:
    - secretName: mlflow-tls
      hosts:
        - mlflow.example.com
metrics:
  enabled: true
  path: /metrics
  serviceMonitor:
    enabled: true
Periodically remove soft-deleted runs, experiments, and their artifacts:
garbageCollection:
  enabled: true
  schedule: "0 2 * * 0"  # weekly at 2 AM on Sunday
  olderThan: "30d"       # only remove resources soft-deleted for 30+ days
resources:
  requests:
    cpu: 500m
    memory: 512Mi
  limits:
    cpu: 1000m
    memory: 1Gi
See example-mlflow-charts.yaml for a production-oriented example with a PostgreSQL backend, S3 artifact store, and Secret-based credential injection.
helm install mlflow ./charts \
  --namespace mlflow \
  --create-namespace \
  -f charts/example-mlflow-charts.yaml
helm upgrade mlflow ./charts --namespace mlflow -f my-values.yaml
helm uninstall mlflow --namespace mlflow
Note: helm uninstall does not delete PersistentVolumeClaims. If storage.enabled=true, delete the PVC manually after uninstalling:
kubectl delete pvc -n mlflow --all