examples/production-deployment-k8s-helm/README.md
This example shows how to deploy the TensorZero (including the TensorZero Gateway, the TensorZero UI, and a ClickHouse database) on Kubernetes using Helm.
Our CI pipeline automatically bumps the chart's version and publishes it to ArtifactHub when a new GitHub release is created.
traefik-ingress-controller-v3)ebs-gp3-retain)monitoring.metrics.enabled is set, Prometheus Operator needs to be installed in your clusterTo install the chart with the release name tensorzero:
# Create a namespace for tensorzero
kubectl create namespace tensorzero
# Install the chart
helm upgrade --install tensorzero . -f values.yaml -n tensorzero
For local development or testing with minikube, you can use port forwarding to access the services:
# Port forward the gateway service
kubectl port-forward service/tensorzero-gateway -n tensorzero 3000:3000 &
# Port forward the UI service
kubectl port-forward service/tensorzero-ui -n tensorzero 4000:4000 &
Before installation, you need to create a secret with the following environment variables:
kubectl create secret generic tensorzero-secret -n tensorzero \
--from-literal=TENSORZERO_CLICKHOUSE_URL="http://default:[email protected]:8123" \
--from-literal=TENSORZERO_GATEWAY_URL="http://tensorzero-gateway.tensorzero.svc.cluster.local:3000" \
--from-literal=OPENAI_API_KEY="your-openai-api-key"
# ... include model provider credentials as needed ...
Note: The
TENSORZERO_CLICKHOUSE_URLandTENSORZERO_GATEWAY_URLvalues are the default values for the TensorZero Gateway and ClickHouse service names. If you have changed the service names, you need to update the secret with the correct values.
To uninstall the tensorzero deployment, run:
helm uninstall tensorzero -n tensorzero
The following table lists the configurable parameters of the chart and their default values.
| Parameter | Description | Default |
|---|---|---|
gateway.replicaCount | Number of gateway replicas | 1 |
gateway.serviceAccountName | Service account for gateway pods | "" |
gateway.image.repository | Gateway image repository | tensorzero/gateway |
gateway.image.tag | Gateway image tag | latest |
gateway.image.pullPolicy | Gateway image pull policy | IfNotPresent |
gateway.service.type | Gateway service type | ClusterIP |
gateway.service.port | Gateway service port | 3000 |
gateway.resources.limits | Gateway resource limits | cpu: 2000m, memory: 4096Mi |
gateway.resources.requests | Gateway resource requests | cpu: 2000m, memory: 4096Mi |
gateway.securityContext | Gateway pod security context | runAsNonRoot: true, runAsUser: 1000, runAsGroup: 1000, fsGroup: 1000 |
gateway.ingress.enabled | Enable gateway ingress | true |
gateway.ingress.className | Gateway ingress class | traefik-ingress-controller-v3 |
gateway.ingress.hosts | Gateway ingress hosts | tensorzero-gateway.local |
| Parameter | Description | Default |
|---|---|---|
ui.deploy | Whether to deploy the UI | true |
ui.replicaCount | Number of UI replicas | 1 |
ui.serviceAccountName | Service account for UI pods | "" |
ui.image.repository | UI image repository | tensorzero/ui |
ui.image.tag | UI image tag | latest |
ui.image.pullPolicy | UI image pull policy | IfNotPresent |
ui.service.type | UI service type | ClusterIP |
ui.service.port | UI service port | 4000 |
ui.resources.limits | UI resource limits | cpu: 1000m, memory: 1024Mi |
ui.resources.requests | UI resource requests | cpu: 500m, memory: 512Mi |
ui.securityContext | UI pod security context | runAsNonRoot: true, runAsUser: 1000, runAsGroup: 1000, fsGroup: 1000 |
ui.ingress.enabled | Enable UI ingress | true |
ui.ingress.className | UI ingress class | traefik-ingress-controller-v3 |
ui.ingress.hosts | UI ingress hosts | tensorzero-ui.local |
| Parameter | Description | Default |
|---|---|---|
persistence.enabled | Enable persistent storage | false |
persistence.size | Storage size | 10Gi |
persistence.accessModes | Access modes | ["ReadWriteOnce"] |
persistence.storageClass | Storage class name | "" |
persistence.mountPath | Mount path in containers | /app/storage |
| Parameter | Description | Default |
|---|---|---|
monitoring.metrics.enabled | Enable ServiceMonitor creation | false |
monitoring.metrics.interval | Scrape interval | "30s" |
monitoring.metrics.labels | Additional labels to attach to ServiceMonitor | {} |
This chart requires a ClickHouse instance for observability. We recommend using Altinity's ClickHouse Helm chart, which offers better cross-platform support (including ARM64 architecture).
Important: TensorZero doesn't support legacy ClickHouse versions. We recommend using the
altinity/clickhouse-server:24.8.14.10459.altinitystableimage or newer.
To deploy ClickHouse using Altinity's Helm chart:
Add the Altinity Helm repository:
helm repo add altinity https://altinity.github.io/helm-charts
helm repo update
Deploy a ClickHouse instance:
# Create a namespace for ClickHouse
kubectl create namespace clickhouse
# Install the ClickHouse chart using the provided clickhouse-values.yaml
# which configures the image version and authentication
helm install clickhouse altinity/clickhouse -n clickhouse -f clickhouse-values.yaml
Note: The Gateway will automatically create the necessary database when it first connects to ClickHouse.
Update your TensorZero values file to disable the built-in ClickHouse and specify the external ClickHouse in your secret:
kubectl create secret generic tensorzero-secret -n tensorzero \
--from-literal=TENSORZERO_CLICKHOUSE_URL="http://default:[email protected]:8123/tensorzero" \
--from-literal=TENSORZERO_GATEWAY_URL="http://tensorzero-gateway.tensorzero.svc.cluster.local:3000" \
--from-literal=OPENAI_API_KEY="your-openai-api-key"
# ... include model provider credentials as needed ...
ConfigMap ConfigurationThe chart includes a ConfigMap with the following default configuration:
You can customize the installation by creating a values file custom-values.yaml:
gateway:
replicaCount: 2
resources:
limits:
cpu: 4000m
memory: 8192Mi
ui:
replicaCount: 2
clickhouse:
replicaCount: 3
persistence:
size: 500Gi
Then install with:
helm install tensorzero ./tensorzero -n tensorzero -f custom-values.yaml
tensorzero-secret with specific environment variables.values.yaml file.After successful deployment, you can call the gateway endpoint using curl. Here's an example:
curl -X POST http://localhost:3000/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "tensorzero::model_name::openai::gpt-4o-mini",
"messages": [
{
"role": "user",
"content": "What is the capital of Japan?"
}
]
}'
Note: If you're using port forwarding to access the gateway locally, use http://localhost:3000 as the endpoint. If you're using the ingress, replace with your actual gateway ingress host as configured in the gateway.ingress.hosts value.