docs/integrations/model-providers/google-ai-studio-gemini.mdx
This guide shows how to set up a minimal deployment to use the TensorZero Gateway with Google AI Studio (Gemini API).
You can use the shorthand `google_ai_studio_gemini::model_name` to use a Google AI Studio (Gemini API) model with TensorZero, unless you need advanced features like fallbacks or custom credentials.
You can use Google AI Studio (Gemini API) models in your TensorZero variants by setting the `model` field to `google_ai_studio_gemini::model_name`.
For example:
```toml
[functions.my_function_name.variants.my_variant_name]
type = "chat_completion"
model = "google_ai_studio_gemini::gemini-2.5-flash-lite"
```
Additionally, you can set the `model` parameter in the OpenAI-compatible inference endpoint to use a specific Google AI Studio (Gemini API) model, without having to configure a function and variant in TensorZero.
```bash
curl -X POST http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::model_name::google_ai_studio_gemini::gemini-2.5-flash-lite",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of Japan?"
      }
    ]
  }'
```
In more complex scenarios (e.g. fallbacks, custom credentials), you can configure your own model and Google AI Studio (Gemini API) provider in TensorZero.
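For instance, here is a sketch of what a fallback setup could look like: the providers listed in `routing` are tried in order, and the second provider entry (named `backup` here for illustration) could use a separate credential.

```toml
[models.gemini_2_5_flash_lite]
# Providers listed in `routing` are tried in order until one succeeds.
routing = ["primary", "backup"]

[models.gemini_2_5_flash_lite.providers.primary]
type = "google_ai_studio_gemini"
model_name = "gemini-2.5-flash-lite"

[models.gemini_2_5_flash_lite.providers.backup]
type = "google_ai_studio_gemini"
model_name = "gemini-2.5-flash-lite"
# Illustrative: a separate API key for the fallback provider.
api_key_location = "env::GOOGLE_AI_STUDIO_API_KEY_BACKUP"
```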
For this minimal setup, you'll need just two files in your project directory:

- `config/`
  - `tensorzero.toml`
- `docker-compose.yml`
You can also find the complete code for this example on GitHub.
For production deployments, see our Deployment Guide.
Create a minimal configuration file that defines a model and a simple chat function:
```toml
[models.gemini_2_5_flash_lite]
routing = ["google_ai_studio_gemini"]

[models.gemini_2_5_flash_lite.providers.google_ai_studio_gemini]
type = "google_ai_studio_gemini"
model_name = "gemini-2.5-flash-lite"

[functions.my_function_name]
type = "chat"

[functions.my_function_name.variants.my_variant_name]
type = "chat_completion"
model = "gemini_2_5_flash_lite"
```
See the list of models available on Google AI Studio (Gemini API).
You must set the `GOOGLE_AI_STUDIO_API_KEY` environment variable before running the gateway.
You can customize the credential location by setting `api_key_location` to `env::YOUR_ENVIRONMENT_VARIABLE` or `dynamic::ARGUMENT_NAME`.
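For example, here is a minimal sketch of a provider block with a custom credential location (the environment variable and argument names below are illustrative):

```toml
[models.gemini_2_5_flash_lite.providers.google_ai_studio_gemini]
type = "google_ai_studio_gemini"
model_name = "gemini-2.5-flash-lite"
# Illustrative: read the API key from a custom environment variable...
api_key_location = "env::MY_GEMINI_API_KEY"
# ...or require callers to supply it at inference time:
# api_key_location = "dynamic::gemini_api_key"
```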
See the Credential Management guide and Configuration Reference for more information.
Create a minimal Docker Compose configuration:
```yaml
# This is a simplified example for learning purposes. Do not use this in production.
# For production-ready deployments, see: https://www.tensorzero.com/docs/deployment/tensorzero-gateway

services:
  gateway:
    image: tensorzero/gateway
    volumes:
      - ./config:/app/config:ro
    command: --config-file /app/config/tensorzero.toml
    environment:
      GOOGLE_AI_STUDIO_API_KEY: ${GOOGLE_AI_STUDIO_API_KEY:?Environment variable GOOGLE_AI_STUDIO_API_KEY must be set.}
    ports:
      - "3000:3000"
    extra_hosts:
      - "host.docker.internal:host-gateway"
```
You can start the gateway with `docker compose up`.
Make an inference request to the gateway:
```bash
curl -X POST http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::function_name::my_function_name",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of Japan?"
      }
    ]
  }'
```
You can generate embeddings from Google AI Studio using the OpenAI-compatible provider. For example:
```toml
[embedding_models.gemini-embedding-001]
routing = ["google_ai_studio"]

[embedding_models.gemini-embedding-001.providers.google_ai_studio]
type = "openai"
api_base = "https://generativelanguage.googleapis.com/v1beta/openai"
api_key_location = "env::GOOGLE_AI_STUDIO_API_KEY"
model_name = "gemini-embedding-001"
```
Gemini supports two thinking parameters:

- `reasoning_effort` maps to `thinkingConfig.thinkingLevel`
- `thinking_budget_tokens` maps to `thinkingConfig.thinkingBudget` (legacy)
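As a rough sketch, you might set one of these on a chat completion variant like so (the placement and values below are assumptions; see the Configuration Reference for the exact schema):

```toml
[functions.my_function_name.variants.my_variant_name]
type = "chat_completion"
model = "google_ai_studio_gemini::gemini-2.5-flash-lite"
# Assumed placement: maps to thinkingConfig.thinkingLevel on Gemini.
reasoning_effort = "low"
# Legacy alternative, maps to thinkingConfig.thinkingBudget:
# thinking_budget_tokens = 1024
```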