# How to call any LLM

This page shows how to:

  • Call any LLM with the same API. TensorZero unifies every major LLM API (e.g. OpenAI) and inference server (e.g. Ollama).
  • Get started with a few lines of code. Later, you can optionally add observability, automatic fallbacks, A/B testing, and much more.
  • Use any programming language. You can use TensorZero with any OpenAI SDK (Python, Node, Go, etc.), or any plain HTTP client, as sketched below.
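For example, since the gateway exposes an OpenAI-compatible HTTP API, even a bare `curl` call works once the gateway is running (see the deployment step below). A minimal sketch, assuming the gateway listens on `localhost:3000` as in this guide:

```bash
# Call the gateway's OpenAI-compatible chat completions endpoint directly.
curl http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::model_name::openai::gpt-5-mini",
    "messages": [{"role": "user", "content": "Tell me a fun fact."}]
  }'
```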
<Tip>

You can find a complete runnable example of this guide on GitHub.

</Tip> <Tabs> <Tab title="Python">

You can point the OpenAI Python SDK to a TensorZero Gateway to call any LLM with a unified API.

<Steps> <Step title="Set up the credentials for your LLM provider">

For example, if you're using OpenAI, you can set the `OPENAI_API_KEY` environment variable to your API key.

```bash
export OPENAI_API_KEY="sk-..."
```
<Tip>

See the Integrations page to learn how to set up credentials for other LLM providers.

</Tip> </Step> <Step title="Install the OpenAI Python SDK">

You can install the OpenAI SDK with a Python package manager like pip.

```bash
pip install openai
```
</Step> <Step title="Deploy the TensorZero Gateway">

Let's deploy the TensorZero Gateway using Docker. For simplicity, we'll use the gateway without observability or custom configuration.

```bash
docker run \
  -e OPENAI_API_KEY \
  -p 3000:3000 \
  tensorzero/gateway \
  --default-config
```
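To sanity-check that the container is up before wiring up a client, you can probe the gateway. A quick sketch, assuming it exposes a `/health` endpoint on the same port (confirm on the deployment page for your version):

```bash
# Assumes a /health endpoint; confirm on the deployment page for your version.
curl http://localhost:3000/health
```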
<Tip>

See the Deploy the TensorZero Gateway page for more details.

</Tip> </Step> <Step title="Initialize the OpenAI client">

Let's initialize the OpenAI SDK and point it to the gateway we just launched.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:3000/openai/v1", api_key="not-used")
```
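Here, `api_key` is a placeholder: the OpenAI SDK requires a value, but the gateway authenticates to your LLM provider with the credentials you configured (in this guide, the `OPENAI_API_KEY` passed to Docker).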
</Step> <Step title="Call the LLM">
```python
response = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-5-mini",
    # or: model="tensorzero::model_name::anthropic::claude-sonnet-4-20250514"
    # or: Google, AWS, Azure, xAI, vLLM, Ollama, and many more
    messages=[
        {
            "role": "user",
            "content": "Tell me a fun fact.",
        }
    ],
)
```
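The same call also works with OpenAI-style streaming. A minimal sketch that reuses the `client` from the previous step, assuming streaming is available for your chosen model and provider (the sample response below corresponds to the non-streaming call above):

```python
# Stream tokens as they arrive instead of waiting for the full completion.
stream = client.chat.completions.create(
    model="tensorzero::model_name::openai::gpt-5-mini",
    messages=[{"role": "user", "content": "Tell me a fun fact."}],
    stream=True,
)

for chunk in stream:
    # Each chunk carries an incremental delta; some chunks may be empty.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```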
<Accordion title="Sample Response">
```python
ChatCompletion(
    id='0198d33f-24f6-7cc3-9dd0-62ba627b27db',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='Sure! Did you know that octopuses have three hearts? Two pump blood to the gills, while the third pumps it to the rest of the body. And, when an octopus swims, the heart that delivers blood to the body actually **stops beating**—which is why they prefer to crawl rather than swim!',
                refusal=None,
                role='assistant',
                annotations=None,
                audio=None,
                function_call=None,
                tool_calls=[]
            )
        )
    ],
    created=1755890789,
    model='tensorzero::model_name::openai::gpt-5-mini',
    object='chat.completion',
    service_tier=None,
    system_fingerprint='',
    usage=CompletionUsage(
        completion_tokens=67,
        prompt_tokens=13,
        total_tokens=80,
        completion_tokens_details=None,
        prompt_tokens_details=None
    ),
    episode_id='0198d33f-24f6-7cc3-9dd0-62cd7028c3d7'
)
```
</Accordion> <Tip>

See the Inference (OpenAI) API Reference for more details on the request and response formats.

</Tip> </Step> </Steps> </Tab> <Tab title="Node">

You can point the OpenAI Node SDK to a TensorZero Gateway to call any LLM with a unified API.

<Steps> <Step title="Set up the credentials for your LLM provider">

For example, if you're using OpenAI, you can set the `OPENAI_API_KEY` environment variable to your API key.

```bash
export OPENAI_API_KEY="sk-..."
```
<Tip>

See the Integrations page to learn how to set up credentials for other LLM providers.

</Tip> </Step> <Step title="Install the OpenAI Node SDK">

You can install the OpenAI SDK with a package manager like npm.

```bash
npm i openai
```
</Step> <Step title="Deploy the TensorZero Gateway">

Let's deploy the TensorZero Gateway using Docker. For simplicity, we'll use the gateway without observability or custom configuration.

```bash
docker run \
  -e OPENAI_API_KEY \
  -p 3000:3000 \
  tensorzero/gateway \
  --default-config
```
<Tip>

See the Deploy the TensorZero Gateway page for more details.

</Tip> </Step> <Step title="Initialize the OpenAI client">

Let's initialize the OpenAI SDK and point it to the gateway we just launched.

```ts
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/openai/v1",
});
```
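No `apiKey` is passed here because the Node SDK falls back to the `OPENAI_API_KEY` environment variable we exported earlier; as with Python, the gateway itself holds the provider credentials.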
</Step> <Step title="Call the LLM">
```ts
const response = await client.chat.completions.create({
  model: "tensorzero::model_name::openai::gpt-5-mini",
  // or: model: "tensorzero::model_name::anthropic::claude-sonnet-4-20250514",
  // or: Google, AWS, Azure, xAI, vLLM, Ollama, and many more
  messages: [
    {
      role: "user",
      content: "Tell me a fun fact.",
    },
  ],
});
```
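The same call also works with OpenAI-style streaming. A minimal sketch that reuses the `client` from the previous step, assuming streaming is available for your chosen model and provider (the sample response below corresponds to the non-streaming call above):

```ts
// Stream tokens as they arrive instead of waiting for the full completion.
const stream = await client.chat.completions.create({
  model: "tensorzero::model_name::openai::gpt-5-mini",
  messages: [{ role: "user", content: "Tell me a fun fact." }],
  stream: true,
});

for await (const chunk of stream) {
  // Each chunk carries an incremental delta; guard against empty chunks.
  const delta = chunk.choices[0]?.delta?.content;
  if (delta) process.stdout.write(delta);
}
```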
<Accordion title="Sample Response">
```ts
{
  id: '0198d345-4bd5-79a2-a235-ebaea8c16d91',
  episode_id: '0198d345-4bd5-79a2-a235-ebbf6eb49cb8',
  choices: [
    {
      index: 0,
      finish_reason: 'stop',
      message: {
        content: 'Sure! Did you know that honey never spoils? Archaeologists have found pots of honey in ancient Egyptian tombs that are over 3,000 years old—and still perfectly edible!',
        tool_calls: [],
        role: 'assistant'
      }
    }
  ],
  created: 1755891192,
  model: 'tensorzero::model_name::openai::gpt-5-mini',
  system_fingerprint: '',
  service_tier: null,
  object: 'chat.completion',
  usage: { prompt_tokens: 13, completion_tokens: 37, total_tokens: 50 }
}
```
</Accordion> <Tip>

See the Inference (OpenAI) API Reference for more details on the request and response formats.

</Tip> </Step> </Steps> </Tab> </Tabs>

See Configure models and providers to set up multiple providers with routing and fallbacks, and Configure functions and variants to manage your LLM logic with experimentation and observability.
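As a preview of what that configuration looks like, here is an illustrative sketch of one model backed by two providers, tried in order. The model and provider names are placeholders; see Configure models and providers for the exact schema.

```toml
# Illustrative sketch only; see the configuration docs for the real schema.
[models.my_model]
routing = ["openai", "anthropic"]  # try OpenAI first, then fall back to Anthropic

[models.my_model.providers.openai]
type = "openai"
model_name = "gpt-5-mini"

[models.my_model.providers.anthropic]
type = "anthropic"
model_name = "claude-sonnet-4-20250514"
```

To use a configuration like this, you would point the gateway at your config file instead of running it with `--default-config`; the deployment page covers the details.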