docs/deployment/tensorzero-gateway.mdx
The TensorZero Gateway is the core component that handles inference requests and collects observability data. It's easy to get started with the TensorZero Gateway.
The gateway requires one of the following command line arguments:
- `--default-config`: Use default configuration settings.
- `--config-file path/to/tensorzero.toml`: Use a custom configuration file.
<Tip>
`--config-file` supports glob patterns, e.g. `--config-file /path/to/**/*.toml`.
</Tip>

- `--run-postgres-migrations`: Run Postgres database migrations and exit.
- `--run-clickhouse-migrations`: Run ClickHouse database migrations and exit.

There are many ways to deploy the TensorZero Gateway. Here are a few examples:
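For example, you can apply migrations as a separate step before starting the gateway, e.g. in a deploy script or init container. This is a sketch that assumes your `.env` file provides `TENSORZERO_POSTGRES_URL`:

```shell
# Run Postgres migrations and exit (assumes .env provides TENSORZERO_POSTGRES_URL)
docker run --env-file .env tensorzero/gateway --run-postgres-migrations

# Then start the gateway normally
docker run --env-file .env -p 3000:3000 tensorzero/gateway --default-config
```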
<AccordionGroup>

<Accordion title="Run with Docker">

You can easily run the TensorZero Gateway locally using Docker.
If you don't have custom configuration, you can use:
```bash
docker run \
  --env-file .env \
  -p 3000:3000 \
  tensorzero/gateway \
  --default-config
```
If you have custom configuration, you can use:
```bash
docker run \
  -v "./config:/app/config" \
  --env-file .env \
  -p 3000:3000 \
  tensorzero/gateway \
  --config-file config/tensorzero.toml
```
Alternatively, you can run the TensorZero Gateway using Docker Compose:
```yaml
services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    env_file:
      - ${ENV_FILE:-.env}
    ports:
      - "3000:3000"
    restart: unless-stopped
    extra_hosts:
      - "host.docker.internal:host-gateway"
    healthcheck:
      test: wget --spider --tries 1 http://localhost:3000/status
      interval: 15s
      timeout: 1s
      retries: 2
```
Make sure to create a `.env` file with the relevant environment variables (e.g. `TENSORZERO_POSTGRES_URL` and model provider API keys).
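As a minimal sketch, a `.env` file might look like the following. The values are placeholders, and you only need the variables for the providers and backends you actually use:

```
TENSORZERO_POSTGRES_URL=postgres://user:password@host:5432/tensorzero
OPENAI_API_KEY=your-openai-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key
```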
</Accordion>

<Accordion title="Deploy on Kubernetes with Helm">

We provide a reference Helm chart in our GitHub repository. You can use it to run TensorZero in Kubernetes.
The chart is available on ArtifactHub.
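As a hypothetical example, installing the chart follows the usual Helm workflow. The repository URL and chart name below are assumptions; consult ArtifactHub for the actual values:

```shell
# Hypothetical repo URL and chart name; check ArtifactHub for the real ones
helm repo add tensorzero https://example.com/charts
helm install tensorzero-gateway tensorzero/gateway -f values.yaml
```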
</Accordion>

<Accordion title="Build from source">

You can build the TensorZero Gateway from source and run it directly on your host machine using Cargo.
```bash
cargo run --profile performance --bin gateway -- --config-file path/to/your/tensorzero.toml
```
<Tip>
See the optimizing latency and throughput guide to learn how to configure the gateway for high-performance deployments.
</Tip>

</Accordion>

</AccordionGroup>

The TensorZero Gateway accepts the following environment variables for provider credentials. Unless you specify an alternative credential location in your configuration file, these environment variables are required for the providers used in any variant that can be sampled. If required credentials are missing, the gateway will fail on startup.
Unless customized in your configuration file, the following credentials are used by default:
| Provider | Environment Variable(s) |
| --- | --- |
| Anthropic | `ANTHROPIC_API_KEY` |
| AWS Bedrock | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` (see details) |
| AWS SageMaker | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY` (see details) |
| Azure OpenAI | `AZURE_API_KEY` |
| Fireworks | `FIREWORKS_API_KEY` |
| GCP Vertex AI Anthropic | `GCP_VERTEX_CREDENTIALS_PATH` (see details) |
| GCP Vertex AI Gemini | `GCP_VERTEX_CREDENTIALS_PATH` (see details) |
| Google AI Studio Gemini | `GOOGLE_AI_STUDIO_GEMINI_API_KEY` |
| Groq | `GROQ_API_KEY` |
| Hyperbolic | `HYPERBOLIC_API_KEY` |
| Mistral | `MISTRAL_API_KEY` |
| OpenAI | `OPENAI_API_KEY` |
| OpenRouter | `OPENROUTER_API_KEY` |
| Together | `TOGETHER_API_KEY` |
| xAI | `XAI_API_KEY` |
Optionally, you can use a configuration file to customize the behavior of the gateway. See Configuration Reference for more details.
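As a minimal sketch, a `tensorzero.toml` might look like the following. The function and variant names here are hypothetical, and the Configuration Reference is the authoritative source for the schema:

```toml
# Hypothetical function and variant names; see the Configuration Reference
[functions.my_function]
type = "chat"

[functions.my_function.variants.my_variant]
type = "chat_completion"
model = "openai::gpt-4o-mini"
```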
<Accordion title="Disable pseudonymous usage analytics">

TensorZero collects pseudonymous usage analytics to help our team improve the product.
The collected data includes aggregated metrics about TensorZero itself, but does NOT include your application's data. To be explicit: TensorZero does NOT share any inference input or output. TensorZero also does NOT share the name of any function, variant, metric, or similar application-specific identifiers.
See `howdy.rs` in the GitHub repository for exactly what usage data is collected and shared with TensorZero.
To disable usage analytics, set the following configuration in the `tensorzero.toml` file:
```toml
[gateway]
disable_pseudonymous_usage_analytics = true
```

Alternatively, you can set the environment variable `TENSORZERO_DISABLE_PSEUDONYMOUS_USAGE_ANALYTICS=1`.

</Accordion>
Optionally, the TensorZero Gateway can collect inference and feedback data for observability, optimization, evaluations, and experimentation. TensorZero supports both Postgres and ClickHouse as observability backends. Postgres is the simpler choice; we recommend ClickHouse if you're handling >100 inferences per second.
If you don't provide either of the environment variables below, observability will be disabled. We recommend setting up observability early to monitor your LLM application and collect data for future optimization, but this can be done incrementally as needed.
After deploying Postgres, you need to configure the `TENSORZERO_POSTGRES_URL` environment variable with the connection details.
After deploying ClickHouse, you need to configure the `TENSORZERO_CLICKHOUSE_URL` environment variable with the connection details.
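For example, the connection URLs typically look like the following. The hostnames, ports, and credentials here are placeholders:

```shell
# Placeholders: substitute your own host, port, user, password, and database
export TENSORZERO_POSTGRES_URL="postgres://user:password@localhost:5432/tensorzero"
export TENSORZERO_CLICKHOUSE_URL="http://user:password@localhost:8123/tensorzero"
```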
By default, the gateway binds to 0.0.0.0:3000.
You can customize the bind address and port in the configuration (`gateway.bind_address`), with the `--bind-address` CLI flag, or with the `TENSORZERO_GATEWAY_BIND_ADDRESS` environment variable.
Only one of these methods can be used at a time.
```toml
[gateway]
bind_address = "0.0.0.0:3000"
```

```bash
gateway --bind-address 0.0.0.0:3000 --config-file tensorzero.toml
```

```bash
TENSORZERO_GATEWAY_BIND_ADDRESS=0.0.0.0:3000 gateway --config-file tensorzero.toml
```
Optionally, you can provide the following command line argument to customize the gateway's logging format:

- `--log-format`: Set the logging format to either `pretty` (default) or `json`.

The TensorZero Gateway exposes endpoints for status and health checks.
The `/status` endpoint checks that the gateway is running successfully.

```json
{ "status": "ok" }
```
The `/health` endpoint additionally checks that the gateway can communicate with dependencies like Postgres or ClickHouse (if enabled).

```json
{ "gateway": "ok", "postgres": "ok" }
```
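For example, a smoke test or readiness probe can hit these endpoints with `curl`, assuming the gateway is listening on `localhost:3000`:

```shell
# Exit non-zero if the gateway isn't up
curl -fsS http://localhost:3000/status

# Additionally verify connectivity to Postgres/ClickHouse, if observability is enabled
curl -fsS http://localhost:3000/health
```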