docs/operations/manage-credentials.mdx
This guide explains how to manage credentials (API keys) in the TensorZero Gateway.
Typically, the TensorZero Gateway will look for credentials like API keys using standard environment variables. The gateway will load credentials from the environment variables on startup, and your application doesn't need to have access to the credentials.
That said, you can customize this behavior by setting alternative credential locations for each provider. For example, you can provide credentials dynamically at inference time, or set alternative static credentials for each provider (e.g. to use multiple API keys for the same provider).
By default, the TensorZero Gateway will look for credentials in the following environment variables:
| Model Provider | Default Credential |
|---|---|
| Anthropic | `ANTHROPIC_API_KEY` |
| AWS Bedrock | Uses AWS SDK credentials |
| AWS SageMaker | Uses AWS SDK credentials |
| Azure | `AZURE_API_KEY` |
| DeepSeek | `DEEPSEEK_API_KEY` |
| Fireworks | `FIREWORKS_API_KEY` |
| GCP Vertex AI (Anthropic) | `GCP_VERTEX_CREDENTIALS_PATH` |
| GCP Vertex AI (Gemini) | `GCP_VERTEX_CREDENTIALS_PATH` |
| Google AI Studio (Gemini) | `GOOGLE_API_KEY` |
| Groq | `GROQ_API_KEY` |
| Hyperbolic | `HYPERBOLIC_API_KEY` |
| Mistral | `MISTRAL_API_KEY` |
| OpenAI | `OPENAI_API_KEY` |
| OpenAI-Compatible | `OPENAI_API_KEY` |
| OpenRouter | `OPENROUTER_API_KEY` |
| SGLang | `SGLANG_API_KEY` |
| Text Generation Inference (TGI) | None |
| Together | `TOGETHER_API_KEY` |
| vLLM | None |
| xAI | `XAI_API_KEY` |
You can customize the source of credentials for each provider.
See the Configuration Reference (e.g. `api_key_location`) for more information on the different ways to configure credentials for each provider.
Also see the relevant provider guides for more information on how to configure credentials for each provider.
You can set alternative static credentials for each provider.
For example, let's say we want an OpenAI provider to read its API key from a different environment variable.
We can customize the variable name by setting `api_key_location` to `env::MY_OTHER_OPENAI_API_KEY`:
```toml
[models.gpt_4o_mini.providers.my_other_openai]
type = "openai"
api_key_location = "env::MY_OTHER_OPENAI_API_KEY"
# ...
```
At startup, the TensorZero Gateway will look for the `MY_OTHER_OPENAI_API_KEY` environment variable and use that value for the API key.
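Conceptually, `env::` resolution can be sketched as follows. This is an illustrative sketch, not the gateway's actual implementation (the gateway is written in Rust): the named variable is read from the environment at startup, and a missing variable is treated as a configuration error up front rather than a per-request failure.

```python
import os


def resolve_env_credential(location: str) -> str:
    """Resolve an `env::VARIABLE_NAME` credential location (illustrative sketch)."""
    assert location.startswith("env::"), "expected an env:: credential location"
    variable = location.removeprefix("env::")
    try:
        return os.environ[variable]
    except KeyError:
        # Fail fast at startup if the variable is missing.
        raise RuntimeError(f"Missing environment variable: {variable}")


# Example:
os.environ["MY_OTHER_OPENAI_API_KEY"] = "sk-example"
print(resolve_env_credential("env::MY_OTHER_OPENAI_API_KEY"))  # sk-example
```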
You can load balance between different API keys for the same provider by defining multiple variants and models.
For example, the configuration below will split the traffic between two different OpenAI API keys, `OPENAI_API_KEY_1` and `OPENAI_API_KEY_2`.
```toml
[models.gpt_4o_mini_1]
routing = ["openai"]

[models.gpt_4o_mini_1.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"
api_key_location = "env::OPENAI_API_KEY_1"

[models.gpt_4o_mini_2]
routing = ["openai"]

[models.gpt_4o_mini_2.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"
api_key_location = "env::OPENAI_API_KEY_2"

[functions.generate_haiku]
type = "chat"

[functions.generate_haiku.variants.gpt_4o_mini_1]
type = "chat_completion"
model = "gpt_4o_mini_1"

[functions.generate_haiku.variants.gpt_4o_mini_2]
type = "chat_completion"
model = "gpt_4o_mini_2"
```
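With neither variant given an explicit weight, the gateway samples between the two variants, so over many requests the traffic (and therefore API key usage) splits roughly evenly. The sketch below only illustrates the statistical effect of uniform sampling between two variants; it is not the gateway's actual sampling code.

```python
import random
from collections import Counter

# Two variants, each pinned to a model that uses a different API key.
variants = ["gpt_4o_mini_1", "gpt_4o_mini_2"]

random.seed(0)  # for a reproducible illustration
counts = Counter(random.choice(variants) for _ in range(10_000))
share = counts["gpt_4o_mini_1"] / 10_000
print(f"variant 1 share of traffic: {share:.1%}")  # roughly 50%
```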
You can use the same principle to set up fallbacks between different API keys for the same provider. See Retries & Fallbacks for more information on how to configure retries and fallbacks.
You can provide API keys dynamically at inference time.
To do this, you can use the `dynamic::` prefix in the relevant credential field in the provider configuration.
For example, let's say we want to provide dynamic API keys for the OpenAI provider.
```toml
[models.user_gpt_4o_mini]
routing = ["openai"]

[models.user_gpt_4o_mini.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"
api_key_location = "dynamic::customer_openai_api_key"
```
At inference time, you can provide the API key in the `tensorzero::credentials` field.
```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")

response = client.chat.completions.create(
    model="tensorzero::function_name::generate_haiku",
    messages=[
        {
            "role": "user",
            "content": "Write a haiku about TensorZero.",
        }
    ],
    extra_body={
        "tensorzero::credentials": {
            "customer_openai_api_key": "sk-..."
        }
    },
)

print(response)
```
You can configure fallback credentials that will be used automatically if the primary credential fails.
This is particularly useful for calling functions and models that require dynamic credentials from the TensorZero UI (by falling back to static credentials).
To configure a fallback, use an object with `default` and `fallback` fields instead of a simple string:
```toml
[models.gpt_4o_mini]
routing = ["openai"]

[models.gpt_4o_mini.providers.openai]
type = "openai"
model_name = "gpt-4o-mini"
api_key_location = { default = "dynamic::customer_openai_api_key", fallback = "env::OPENAI_API_KEY" }
```
At inference time, the gateway will first try to use the dynamic credential. If that fails, it will automatically fall back to the environment variable.
Most model providers have default credential locations.
For example, OpenAI's `api_key_location` defaults to `env::OPENAI_API_KEY`.
These credentials apply to the default function and shorthand models (e.g. calling the model `openai::gpt-5`).
You can override the default location for a particular provider type using `[provider_types.YOUR_PROVIDER_TYPE.defaults]`.
For example, we can override the default location for the OpenAI provider type to require a dynamic API key:
```toml
[provider_types.openai.defaults]
api_key_location = "dynamic::customer_openai_api_key"
# ...
```
Unless otherwise specified, every model provider of type `openai` will require the `customer_openai_api_key` credential.
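The resulting precedence can be sketched as: an explicit `api_key_location` on a model provider wins, then any `[provider_types.*.defaults]` override, then the built-in default. The snippet below is only an illustration of that precedence order, not the gateway's code.

```python
def effective_api_key_location(provider_setting, provider_type_default, builtin_default):
    """Pick the effective credential location by precedence (illustrative sketch)."""
    for candidate in (provider_setting, provider_type_default, builtin_default):
        if candidate is not None:
            return candidate


# With the [provider_types.openai.defaults] override above and no
# per-provider setting, the dynamic credential is required:
print(effective_api_key_location(
    None,                                  # no explicit setting on the provider
    "dynamic::customer_openai_api_key",    # provider_types override
    "env::OPENAI_API_KEY",                 # built-in default
))  # dynamic::customer_openai_api_key
```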
See the Configuration Reference for more details.
If you have multiple TensorZero deployments (e.g. one per team), you can centralize credential management using gateway relay.
With gateway relay, an LLM inference request can be routed through multiple independent TensorZero Gateway deployments before reaching a model provider. This enables you to enforce organization-wide controls without restricting how teams build their LLM features.
See Centralize auth, rate limits, and more for details.