Deploying TensorZero on Kubernetes with Helm

examples/production-deployment-k8s-helm/README.md

This example shows how to deploy TensorZero (the TensorZero Gateway, the TensorZero UI, and a ClickHouse database) on Kubernetes using Helm.

Our CI pipeline automatically bumps the chart's version and publishes it to ArtifactHub when a new GitHub release is created.

Prerequisites

  • Kubernetes 1.19+
  • Helm 3.2.0+
  • Ingress controller installed in your cluster (e.g. traefik-ingress-controller-v3)
  • StorageClass configured for persistent volumes (e.g. ebs-gp3-retain)
  • Sufficient resources for running ClickHouse and the TensorZero services (we recommend at least 4 GB of memory for minikube)
  • If monitoring.metrics.enabled is set to true, the Prometheus Operator must be installed in your cluster
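
The version requirements above can be checked from a shell before installing. The following is a minimal sketch and not part of the chart: `version_ge` is a hypothetical helper, and it assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
# Hypothetical helper, not shipped with the chart.
# version_ge ACTUAL REQUIRED
# Succeeds (exit status 0) when ACTUAL is at least REQUIRED.
# Assumes `sort -V` (version sort) is available.
version_ge() {
  # Sort the two versions; if REQUIRED comes first (or both are equal),
  # then ACTUAL >= REQUIRED.
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Example: compare a Helm version string against the 3.2.0 minimum.
if version_ge "3.14.0" "3.2.0"; then
  echo "Helm version is new enough"
fi
```

Pass it the versions reported by `helm version` and `kubectl version`, with any leading `v` and build suffix stripped first.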

Installing the Chart

To install the chart with the release name tensorzero:

bash
# Create a namespace for tensorzero
kubectl create namespace tensorzero

# Install the chart
helm upgrade --install tensorzero . -f values.yaml -n tensorzero

For local development or testing with minikube, you can use port forwarding to access the services:

bash
# Port forward the gateway service
kubectl port-forward service/tensorzero-gateway -n tensorzero 3000:3000 &

# Port forward the UI service
kubectl port-forward service/tensorzero-ui -n tensorzero 4000:4000 &
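
Because the port-forward commands above run in the background, a script that calls the services right away may reach them before the tunnel is ready. One way to handle this is a small retry loop. This is a sketch only: `retry` is a hypothetical helper, and the `/health` path in the usage comment is an assumption about the gateway that you should verify for your version.

```shell
# Hypothetical helper, not shipped with the chart.
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  _attempts=$1
  _delay=$2
  shift 2
  _i=1
  while [ "$_i" -le "$_attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$_i" -lt "$_attempts" ]; then
      sleep "$_delay"
    fi
    _i=$((_i + 1))
  done
  return 1
}

# Usage (assumes the gateway answers on /health; adjust the path if not):
#   retry 30 2 curl -fsS -o /dev/null http://localhost:3000/health
```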

Required Secret Configuration

Before installation, you need to create a secret with the following environment variables:

bash
kubectl create secret generic tensorzero-secret -n tensorzero \
  --from-literal=TENSORZERO_CLICKHOUSE_URL="http://default:<password>@<clickhouse-host>:8123" \
  --from-literal=TENSORZERO_GATEWAY_URL="http://tensorzero-gateway.tensorzero.svc.cluster.local:3000" \
  --from-literal=OPENAI_API_KEY="your-openai-api-key"
  # ... include model provider credentials as needed ...

Note: The TENSORZERO_CLICKHOUSE_URL and TENSORZERO_GATEWAY_URL values above assume the default ClickHouse and TensorZero Gateway service names. If you changed either service name, update the secret to match.
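
If your service names or namespaces differ from the defaults, the URLs can be derived rather than typed by hand. The sketch below uses hypothetical helpers (`svc_host`, `service_url`) and assumes the cluster uses the standard `cluster.local` DNS domain, overridable through a `CLUSTER_DOMAIN` variable that is introduced here for illustration only.

```shell
# Hypothetical helpers, not shipped with the chart.
# svc_host SERVICE NAMESPACE
# Prints the in-cluster DNS name of a Service.
# Assumes the default cluster domain "cluster.local" unless CLUSTER_DOMAIN is set.
svc_host() {
  printf '%s.%s.svc.%s' "$1" "$2" "${CLUSTER_DOMAIN:-cluster.local}"
}

# service_url SERVICE NAMESPACE PORT
# Prints a plain-HTTP URL for the Service.
service_url() {
  printf 'http://%s:%s' "$(svc_host "$1" "$2")" "$3"
}

# Reproduces the default gateway URL used in the secret above.
service_url tensorzero-gateway tensorzero 3000
echo
```

The same helper can build the host part of TENSORZERO_CLICKHOUSE_URL. Credentials containing characters such as `@` or `/` would need URL encoding, which this sketch does not handle.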

Uninstalling the Chart

To uninstall the tensorzero deployment, run:

bash
helm uninstall tensorzero -n tensorzero

Configuration

The following tables list the chart's configurable parameters and their default values.

Gateway Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| gateway.replicaCount | Number of gateway replicas | 1 |
| gateway.serviceAccountName | Service account for gateway pods | "" |
| gateway.image.repository | Gateway image repository | tensorzero/gateway |
| gateway.image.tag | Gateway image tag | latest |
| gateway.image.pullPolicy | Gateway image pull policy | IfNotPresent |
| gateway.service.type | Gateway service type | ClusterIP |
| gateway.service.port | Gateway service port | 3000 |
| gateway.resources.limits | Gateway resource limits | cpu: 2000m, memory: 4096Mi |
| gateway.resources.requests | Gateway resource requests | cpu: 2000m, memory: 4096Mi |
| gateway.securityContext | Gateway pod security context | runAsNonRoot: true, runAsUser: 1000, runAsGroup: 1000, fsGroup: 1000 |
| gateway.ingress.enabled | Enable gateway ingress | true |
| gateway.ingress.className | Gateway ingress class | traefik-ingress-controller-v3 |
| gateway.ingress.hosts | Gateway ingress hosts | tensorzero-gateway.local |

UI Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| ui.deploy | Whether to deploy the UI | true |
| ui.replicaCount | Number of UI replicas | 1 |
| ui.serviceAccountName | Service account for UI pods | "" |
| ui.image.repository | UI image repository | tensorzero/ui |
| ui.image.tag | UI image tag | latest |
| ui.image.pullPolicy | UI image pull policy | IfNotPresent |
| ui.service.type | UI service type | ClusterIP |
| ui.service.port | UI service port | 4000 |
| ui.resources.limits | UI resource limits | cpu: 1000m, memory: 1024Mi |
| ui.resources.requests | UI resource requests | cpu: 500m, memory: 512Mi |
| ui.securityContext | UI pod security context | runAsNonRoot: true, runAsUser: 1000, runAsGroup: 1000, fsGroup: 1000 |
| ui.ingress.enabled | Enable UI ingress | true |
| ui.ingress.className | UI ingress class | traefik-ingress-controller-v3 |
| ui.ingress.hosts | UI ingress hosts | tensorzero-ui.local |

Persistence Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| persistence.enabled | Enable persistent storage | false |
| persistence.size | Storage size | 10Gi |
| persistence.accessModes | Access modes | ["ReadWriteOnce"] |
| persistence.storageClass | Storage class name | "" |
| persistence.mountPath | Mount path in containers | /app/storage |

Monitoring Configuration

| Parameter | Description | Default |
| --- | --- | --- |
| monitoring.metrics.enabled | Enable ServiceMonitor creation | false |
| monitoring.metrics.interval | Scrape interval | "30s" |
| monitoring.metrics.labels | Additional labels to attach to ServiceMonitor | {} |

ClickHouse Configuration

This chart requires a ClickHouse instance for observability. We recommend using Altinity's ClickHouse Helm chart, which offers better cross-platform support (including ARM64 architecture).

Important: TensorZero doesn't support legacy ClickHouse versions. We recommend using the altinity/clickhouse-server:24.8.14.10459.altinitystable image or newer.

To deploy ClickHouse using Altinity's Helm chart:

  1. Add the Altinity Helm repository:

    bash
    helm repo add altinity https://altinity.github.io/helm-charts
    helm repo update
    
  2. Deploy a ClickHouse instance:

    bash
    # Create a namespace for ClickHouse
    kubectl create namespace clickhouse
    
    # Install the ClickHouse chart using the provided clickhouse-values.yaml
    # which configures the image version and authentication
    helm install clickhouse altinity/clickhouse -n clickhouse -f clickhouse-values.yaml
    

    Note: The Gateway will automatically create the necessary database when it first connects to ClickHouse.

  3. Update your TensorZero values file to disable the built-in ClickHouse, and point TENSORZERO_CLICKHOUSE_URL in your secret at the external ClickHouse instance:

    bash
    kubectl create secret generic tensorzero-secret -n tensorzero \
      --from-literal=TENSORZERO_CLICKHOUSE_URL="http://default:<password>@<clickhouse-host>:8123/tensorzero" \
      --from-literal=TENSORZERO_GATEWAY_URL="http://tensorzero-gateway.tensorzero.svc.cluster.local:3000" \
      --from-literal=OPENAI_API_KEY="your-openai-api-key"
      # ... include model provider credentials as needed ...
    

ConfigMap Configuration

The chart includes a ConfigMap with the following default configuration:

  • Model configuration for Claude Haiku 4.5
  • Function configuration for chat completions

You can customize the installation by creating a values file, for example custom-values.yaml:

yaml
gateway:
  replicaCount: 2
  resources:
    limits:
      cpu: 4000m
      memory: 8192Mi

ui:
  replicaCount: 2

clickhouse:
  replicaCount: 3
  persistence:
    size: 500Gi

Then install with:

bash
helm install tensorzero ./tensorzero -n tensorzero -f custom-values.yaml

Important Notes

  1. The chart requires a secret named tensorzero-secret with specific environment variables.
  2. In production, never store sensitive data in your version-controlled values.yaml file.
  3. Make sure your cluster has sufficient resources for the configured replicas and resource limits.
  4. The ingress configuration assumes you have a working ingress controller installed.

Calling the Gateway Endpoint

After successful deployment, you can call the gateway endpoint using curl. Here's an example:

bash
curl -X POST http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::model_name::openai::gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of Japan?"
      }
    ]
  }'

Note: If you're using port forwarding to access the gateway locally, use http://localhost:3000 as the endpoint. If you're using the ingress, replace localhost:3000 with your gateway ingress host as configured in the gateway.ingress.hosts value.
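
To send different prompts without hand-editing the JSON body, the payload from the curl example can be generated by a small shell function. This is a minimal sketch: `json_escape` and `chat_payload` are hypothetical helpers, and only backslashes and double quotes are escaped, so it assumes single-line input without control characters.

```shell
# Hypothetical helpers, not part of TensorZero.
# json_escape STRING
# Escapes backslashes and double quotes for use inside a JSON string.
# Does not handle newlines or other control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# chat_payload MODEL CONTENT
# Prints a single-turn chat request body with one user message.
chat_payload() {
  _model=$(json_escape "$1")
  _content=$(json_escape "$2")
  printf '{"model":"%s","messages":[{"role":"user","content":"%s"}]}\n' \
    "$_model" "$_content"
}

# Usage with the endpoint from the example above:
#   curl -X POST http://localhost:3000/openai/v1/chat/completions \
#     -H "Content-Type: application/json" \
#     -d "$(chat_payload "tensorzero::model_name::openai::gpt-4o-mini" "What is the capital of Japan?")"
```

For anything beyond simple prompts, a JSON-aware tool is a safer way to build the body than string escaping.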