# Call LLMs with Image & File Inputs

This directory (`examples/docs/guides/gateway/call-llms-with-image-and-file-inputs`) contains the code for the **Call LLMs with image & file inputs** guide.
## Setup

Set the `OPENAI_API_KEY` environment variable:

```bash
export OPENAI_API_KEY="sk-..." # Replace with your OpenAI API key
```
Launch the services with Docker Compose:

```bash
docker compose up
```
> [!TIP]
> You can use any S3-compatible object storage service (e.g. AWS S3, Google Cloud Storage, Cloudflare R2). We use a local MinIO instance in this example for convenience.
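The gateway learns where to store and fetch these objects through its object storage configuration. As a rough, hypothetical sketch of pointing it at the local MinIO instance (the table and field names below are assumptions, not copied from this example; check this example's configuration files and the TensorZero configuration reference for the exact schema):

```toml
# Hypothetical sketch: gateway object storage settings for a local MinIO instance.
# Field names are assumptions; consult the TensorZero configuration reference.
[object_storage]
type = "s3_compatible"
endpoint = "http://minio:9000"  # MinIO service from the Docker Compose setup
bucket_name = "tensorzero"
```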
## Python

a. Install the Python dependencies. We recommend using `uv`:

```bash
uv sync
```

b. Run the example:

```bash
uv run openai_sdk.py
```
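For reference, the multimodal request assembles one `text` content part and one `image_url` part per image, where each URL may be a remote HTTP(S) URL or a base64-encoded data URL. A minimal, stdlib-only sketch of that payload shape (the `build_user_message` helper is hypothetical, not part of `openai_sdk.py`, and the data URL is truncated here for brevity):

```python
import json

# Hypothetical helper (not part of openai_sdk.py): assemble the multimodal
# "content" list in the chat-completions message format.
def build_user_message(prompt: str, image_urls: list[str]) -> dict:
    content = [{"type": "text", "text": prompt}]
    for url in image_urls:
        # Each image is an "image_url" content part; the URL may be a remote
        # HTTP(S) URL or a base64-encoded data URL.
        content.append({"type": "image_url", "image_url": {"url": url}})
    return {"role": "user", "content": content}

message = build_user_message(
    "Do the images share any common features?",
    [
        "https://raw.githubusercontent.com/tensorzero/tensorzero/eac2a230d4a4db1ea09e9c876e45bdb23a300364/tensorzero-core/tests/e2e/providers/ferris.png",
        "data:image/png;base64,iVBORw0K...",  # truncated for brevity
    ],
)
print(json.dumps(message, indent=2))
```

You would pass `[message]` as the `messages` argument to the OpenAI SDK's `client.chat.completions.create` with `model="tensorzero::model_name::openai::gpt-4o-mini"` and the SDK's `base_url` pointed at the gateway's OpenAI-compatible endpoint (`http://localhost:3000/openai/v1`).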
## Node

a. Install the Node dependencies:

```bash
pnpm install
```

b. Run the example:

```bash
pnpm tsx openai_sdk.ts
```
## HTTP

Run the following command to make a multimodal inference request to the TensorZero Gateway. The first image is a remote image of Ferris the crab, and the second image is a one-pixel orange image encoded as a base64 string.
```bash
curl -X POST http://localhost:3000/openai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tensorzero::model_name::openai::gpt-4o-mini",
    "messages": [
      {
        "role": "user",
        "content": [
          {
            "type": "text",
            "text": "Do the images share any common features?"
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "https://raw.githubusercontent.com/tensorzero/tensorzero/eac2a230d4a4db1ea09e9c876e45bdb23a300364/tensorzero-core/tests/e2e/providers/ferris.png"
            }
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAA1JREFUGFdj+O/P8B8ABe0CTsv8mHgAAAAASUVORK5CYII="
            }
          }
        ]
      }
    ]
  }'
```
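To send a local image instead of the inline base64 string above, you can build the data URL yourself. A minimal, stdlib-only sketch (the placeholder bytes below stand in for your own file's contents; in practice you would read them from disk, e.g. with `pathlib.Path("image.png").read_bytes()`):

```python
import base64

# Stand-in bytes for a local image file (not a valid PNG; replace with
# your own file's contents read from disk).
png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8

# Base64-encode the raw bytes and wrap them in a data URL, which can be
# used as the "url" value of an "image_url" content part.
encoded = base64.b64encode(png_bytes).decode("ascii")
data_url = f"data:image/png;base64,{encoded}"
print(data_url[:40])
```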