The azure provider enables you to use Azure OpenAI Service models with Promptfoo. It shares configuration settings with the OpenAI provider.
There are three ways to authenticate with Azure OpenAI:
Set the AZURE_API_KEY environment variable and configure your deployment:
providers:
- id: azure:chat:deploymentNameHere
config:
apiHost: 'xxxxxxxx.openai.azure.com'
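For example, set the key in your shell before running the eval (placeholder value):

```sh
export AZURE_API_KEY="your-api-key"
```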
Use an Azure Entra ID (formerly Azure AD) Service Principal instead of an API key. This is the recommended approach for production environments, CI/CD pipelines, and any scenario where you want to avoid managing API keys directly.
You'll need three values from your Service Principal's app registration in the Azure Portal: the application (client) ID, a client secret value, and the directory (tenant) ID.
Set them as environment variables:
export AZURE_CLIENT_ID="your-application-client-id"
export AZURE_CLIENT_SECRET="your-client-secret-value"
export AZURE_TENANT_ID="your-directory-tenant-id"
Or set them in the provider config (see full example below):
- `azureClientId`
- `azureClientSecret`
- `azureTenantId`

Optionally, you can also set:
- `AZURE_AUTHORITY_HOST` / `azureAuthorityHost` (defaults to `https://login.microsoftonline.com`)
- `AZURE_TOKEN_SCOPE` / `azureTokenScope` (defaults to `https://cognitiveservices.azure.com/.default`)

Then configure your deployment:
providers:
- id: azure:chat:deploymentNameHere
config:
apiHost: 'xxxxxxxx.openai.azure.com'
:::tip
The Service Principal must have the Cognitive Services OpenAI User role (or equivalent) assigned on your Azure OpenAI resource. You can assign this in the Azure Portal under your resource's Access control (IAM) blade.
:::
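If you prefer the CLI to the Portal, the equivalent role assignment looks roughly like this (all IDs and names are placeholders):

```sh
az role assignment create \
  --assignee "<service-principal-client-id>" \
  --role "Cognitive Services OpenAI User" \
  --scope "/subscriptions/<subscription-id>/resourceGroups/<rg>/providers/Microsoft.CognitiveServices/accounts/<resource-name>"
```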
Authenticate with the Azure CLI using `az login` before running promptfoo. This is the fallback option: it's used when neither an API key nor client credentials are provided.
Optionally, you can also set:
- `AZURE_TOKEN_SCOPE` / `azureTokenScope` (defaults to `https://cognitiveservices.azure.com/.default`)

Then configure your deployment:
providers:
- id: azure:chat:deploymentNameHere
config:
apiHost: 'xxxxxxxx.openai.azure.com'
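Before running the eval, confirm you have an active CLI session:

```sh
az login
az account show # verify the subscription your session is bound to
```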
The following provider types are available:

- `azure:chat:<deployment name>` - For chat endpoints (e.g., gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5, gpt-4o)
- `azure:completion:<deployment name>` - For completion endpoints (e.g., gpt-35-turbo-instruct)
- `azure:embedding:<deployment name>` - For embedding models (e.g., text-embedding-3-small, text-embedding-3-large)
- `azure:responses:<deployment name>` - For the Responses API (e.g., gpt-4.1, gpt-5.1)
- `azure:assistant:<assistant id>` - For Azure OpenAI Assistants (using the Azure OpenAI API)
- `azure:foundry-agent:<agent name or id>` - For Azure AI Foundry Agents (using the Azure AI Projects SDK)
- `azure:video:<deployment name>` - For video generation (Sora)

Vision-capable GPT-5, GPT-4o, and GPT-4.1 deployments use the standard `azure:chat:` provider type.
Azure deployment availability changes frequently and varies by region. Check the Azure OpenAI model availability page for the current list of supported models and regions before creating new deployments.
Azure provides access to OpenAI models as well as third-party models through Azure AI Foundry (Microsoft Foundry).
| Category | Models |
|---|---|
| GPT-5 Series | gpt-5.4, gpt-5.4-pro, gpt-5.4-mini, gpt-5.4-nano, gpt-5, gpt-5-pro, gpt-5-mini, gpt-5-nano, gpt-5.1, gpt-5.1-chat, gpt-5.1-codex |
| GPT-4.1 Series | gpt-4.1, gpt-4.1-mini, gpt-4.1-nano |
| GPT-4o Series | gpt-4o, gpt-4o-mini, gpt-4o-realtime |
| Reasoning Models | o1, o1-mini, o1-pro, o3, o3-mini, o3-pro, o4-mini |
| Specialized | computer-use-preview, gpt-image-1, codex-mini-latest |
| Deep Research | o3-deep-research, o4-mini-deep-research |
| Embeddings | text-embedding-3-small, text-embedding-3-large, text-embedding-ada-002 |
Azure AI Foundry provides access to models from multiple providers:
| Provider | Models |
|---|---|
| Anthropic Claude | claude-opus-4-7, claude-opus-4-6-20260205, claude-sonnet-4-6, claude-opus-4-5-20251101, claude-sonnet-4-5-20250929, claude-haiku-4-5-20251001, claude-3-5-sonnet-20241022, claude-3-5-haiku-20241022 — see Using Claude Models for deployment and config details |
| Meta Llama | Llama-4-Scout-17B-16E-Instruct, Llama-4-Maverick-17B-128E-Instruct-FP8, Llama-3.3-70B-Instruct, Meta-Llama-3.1-405B-Instruct, Meta-Llama-3.1-70B-Instruct, Meta-Llama-3.1-8B-Instruct |
| DeepSeek | DeepSeek-R1 (reasoning), DeepSeek-V3, DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-32B |
| Mistral | Mistral-Large-2411, Pixtral-Large-2411, Ministral-3B-2410, Mistral-Nemo-2407 |
| Cohere | Cohere-command-a-03-2025, command-r-plus-08-2024, command-r-08-2024 |
| Microsoft Phi | Phi-4, Phi-4-mini-instruct, Phi-4-reasoning, Phi-4-mini-reasoning |
| xAI Grok | grok-3, grok-3-mini, grok-3-reasoning, grok-3-mini-reasoning, grok-2-vision-1212 |
| AI21 | AI21-Jamba-1.5-Large, AI21-Jamba-1.5-Mini |
| Core42 | JAIS-70b-chat, Falcon3-7B-Instruct |
For the complete list of 200+ models with pricing, see the Azure model catalog.
The Azure OpenAI Responses API is a stateful API that combines the best capabilities of the Chat Completions and Assistants APIs in a single unified interface. It provides advanced features like MCP servers, code interpreter, and background tasks.
To use the Azure Responses API with promptfoo, use the azure:responses provider type:
providers:
# Using the azure:responses alias (recommended)
# Note: deployment name must match your Azure deployment, not the model name
- id: azure:responses:my-gpt-4-1-deployment
config:
temperature: 0.7
instructions: 'You are a helpful assistant.'
response_format: file://./response-schema.json
# For newer v1 API, use 'preview' (default)
# For legacy API, use specific version like '2025-04-01-preview'
apiVersion: 'preview'
# Or using openai:responses with Azure configuration (legacy method)
- id: openai:responses:gpt-4.1
config:
apiHost: 'your-resource.openai.azure.com'
apiKey: '{{ env.AZURE_API_KEY }}' # or set OPENAI_API_KEY env var
temperature: 0.7
instructions: 'You are a helpful assistant.'
The Responses API supports Azure deployments backed by current Azure OpenAI responses-capable models. Common examples include:
- gpt-5.4, gpt-5.4-pro, gpt-5.4-mini, gpt-5.4-nano, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.1
- gpt-4o, gpt-4o-mini, gpt-4.1, gpt-4.1-mini, gpt-4.1-nano
- o1, o1-mini, o1-pro, o3, o3-mini, o3-pro, o4-mini
- computer-use-preview, gpt-image-1, codex-mini-latest
- o3-deep-research, o4-mini-deep-research

Use your Azure deployment name in promptfoo, even if it differs from the underlying model ID.
Load complex JSON schemas from external files for better organization:
providers:
- id: openai:responses:gpt-4.1
config:
apiHost: 'your-resource.openai.azure.com'
response_format: file://./schemas/response-schema.json
Example response-schema.json:
{
"type": "json_schema",
"name": "structured_output",
"schema": {
"type": "object",
"properties": {
"result": { "type": "string" },
"confidence": { "type": "number" }
},
"required": ["result", "confidence"],
"additionalProperties": false
}
}
You can also use nested file references for the schema itself:
{
"type": "json_schema",
"name": "structured_output",
"schema": "file://./schemas/output-schema.json"
}
Variable rendering is supported in file paths:
config:
response_format: file://./schemas/{{ schema_name }}.json
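For instance, each test case could pick its own schema file (the schema names below are hypothetical):

```yaml
providers:
  - id: openai:responses:gpt-4.1
    config:
      apiHost: 'your-resource.openai.azure.com'
      response_format: file://./schemas/{{ schema_name }}.json

tests:
  - vars:
      schema_name: summary # resolves to ./schemas/summary.json
  - vars:
      schema_name: detailed # resolves to ./schemas/detailed.json
```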
Instructions: Provide system-level instructions to guide model behavior:
config:
instructions: 'You are a helpful assistant specializing in technical documentation.'
Background Tasks: Enable asynchronous processing for long-running tasks:
config:
background: true
store: true
Chaining Responses: Chain multiple responses together for multi-turn conversations:
config:
previous_response_id: '{{previous_id}}'
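For example, a test could supply an ID captured from an earlier response (the `resp_abc123` value below is a placeholder):

```yaml
tests:
  - vars:
      previous_id: resp_abc123 # placeholder ID from a prior response
```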
MCP Servers: Connect to remote MCP servers for extended tool capabilities:
config:
tools:
- type: mcp
server_label: github
server_url: https://example.com/mcp-server
require_approval: never
headers:
Authorization: 'Bearer {{ env.MCP_API_KEY }}'
Code Interpreter: Enable code execution capabilities:
config:
tools:
- type: code_interpreter
container:
type: auto
Web Search: Enable web search capabilities:
config:
tools:
- type: web_search_preview
Image Generation: Use image generation with supported models:
config:
tools:
- type: image_generation
partial_images: 2 # For streaming partial images
Here's a comprehensive example using multiple Azure Responses API features:
# promptfooconfig.yaml
description: Azure Responses API evaluation
providers:
# Using the new azure:responses alias (recommended)
- id: azure:responses:gpt-4.1-deployment
label: azure-gpt-4.1
config:
temperature: 0.7
max_output_tokens: 2000
instructions: 'You are a helpful AI assistant.'
response_format: file://./response-format.json
tools:
- type: code_interpreter
container:
type: auto
- type: web_search_preview
metadata:
session: 'eval-001'
user: 'test-user'
store: true
# Reasoning model example
- id: azure:responses:o3-mini-deployment
label: azure-reasoning
config:
reasoning_effort: medium
max_completion_tokens: 4000
prompts:
- 'Analyze this data and provide insights: {{data}}'
- 'Write a Python function to solve: {{problem}}'
tests:
- vars:
data: 'Sales increased by 25% in Q3 compared to Q2'
assert:
- type: contains
value: 'growth'
- type: contains
value: '25%'
- vars:
problem: 'Calculate fibonacci sequence up to n terms'
assert:
- type: javascript
value: 'output.includes("def fibonacci") || output.includes("function fibonacci")'
- type: contains
value: 'recursive'
Streaming: Enable streaming for real-time output:
config:
stream: true
Parallel Tool Calls: Allow multiple tool calls in parallel:
config:
parallel_tool_calls: true
max_tool_calls: 5
Truncation: Configure how input is truncated when it exceeds limits:
config:
truncation: auto # or 'disabled'
Webhook URL: Set a webhook for async notifications:
config:
webhook_url: 'https://your-webhook.com/callback'
Known limitations:

- `purpose: user_data` requires a workaround (use `purpose: assistants`)
- `store: true`

The Azure OpenAI provider supports the following environment variables:
| Environment Variable | Config Key | Description | Required |
|---|---|---|---|
| `AZURE_API_KEY` | `apiKey` | Your Azure OpenAI API key | No\* |
| `AZURE_API_HOST` | `apiHost` | API host | No |
| `AZURE_API_BASE_URL` | `apiBaseUrl` | API base URL | No |
| `AZURE_BASE_URL` | `apiBaseUrl` | Alternative API base URL | No |
| `AZURE_DEPLOYMENT_NAME` | - | Default deployment name | Yes |
| `AZURE_CLIENT_ID` | `azureClientId` | Azure AD application client ID | No\* |
| `AZURE_CLIENT_SECRET` | `azureClientSecret` | Azure AD application client secret | No\* |
| `AZURE_TENANT_ID` | `azureTenantId` | Azure AD tenant ID | No\* |
| `AZURE_AUTHORITY_HOST` | `azureAuthorityHost` | Azure AD authority host | No |
| `AZURE_TOKEN_SCOPE` | `azureTokenScope` | Azure AD token scope | No |

\* Either `AZURE_API_KEY` or the combination of `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, and `AZURE_TENANT_ID` must be provided.
Note: For API URLs, you only need to set one of AZURE_API_HOST, AZURE_API_BASE_URL, or AZURE_BASE_URL. If multiple are set, the provider will use them in that order of preference.
If AZURE_DEPLOYMENT_NAME is set, it will be automatically used as the default deployment when no other provider is configured. This makes Azure OpenAI the default provider when:
- No OpenAI provider is configured (`OPENAI_API_KEY` is not set)
- `AZURE_DEPLOYMENT_NAME` is set

For example, if you have these environment variables set:
AZURE_DEPLOYMENT_NAME=gpt-4o
AZURE_API_KEY=your-api-key
AZURE_API_HOST=your-host.openai.azure.com
Or these client credential environment variables:
AZURE_DEPLOYMENT_NAME=gpt-4o
AZURE_CLIENT_ID=your-client-id
AZURE_CLIENT_SECRET=your-client-secret
AZURE_TENANT_ID=your-tenant-id
AZURE_API_HOST=your-host.openai.azure.com
Then Azure OpenAI will be used as the default provider for all operations, including model-graded assertions.
Embedding models are distinct from text-generation models, so to set a default embedding provider you must specify `AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME`.
Set this environment variable to the deployment name of your embedding model:
AZURE_OPENAI_EMBEDDING_DEPLOYMENT_NAME=text-embedding-3-small
This deployment will automatically be used whenever embeddings are required, such as for similarity comparisons or dataset generation. You can also override the embedding provider in your configuration:
defaultTest:
options:
provider:
embedding:
id: azure:embedding:text-embedding-3-small-deployment
config:
apiHost: 'your-resource.openai.azure.com'
Note that any moderation tasks will still use the OpenAI API.
The YAML configuration can override environment variables and set additional parameters:
providers:
- id: azure:chat:deploymentNameHere
config:
apiHost: 'xxxxxxxx.openai.azure.com'
# Authentication (Option 1: API Key)
apiKey: 'your-api-key'
# Authentication (Option 2: Client Credentials)
azureClientId: 'your-azure-client-id'
azureClientSecret: 'your-azure-client-secret'
azureTenantId: 'your-azure-tenant-id'
azureAuthorityHost: 'https://login.microsoftonline.com' # Optional
azureTokenScope: 'https://cognitiveservices.azure.com/.default' # Optional
# OpenAI parameters
temperature: 0.5
max_tokens: 1024
:::tip
All other OpenAI provider environment variables and configuration properties are supported.
:::
If you want to authenticate with a Service Principal (SPN) instead of an API key, follow these steps.
1. Assign the Cognitive Services OpenAI User role (or Cognitive Services Contributor) to the Service Principal on your Azure OpenAI resource. Go to your resource's Access control (IAM) > Add role assignment.
2. Install the `@azure/identity` package — promptfoo uses it to obtain tokens from Azure Entra ID:

```sh
npm install @azure/identity
```
You can provide the Service Principal credentials via environment variables or directly in the YAML config.
Using environment variables (recommended for CI/CD and production):
export AZURE_CLIENT_ID="00000000-0000-0000-0000-000000000000" # Application (client) ID
export AZURE_CLIENT_SECRET="your-client-secret-value" # Client secret
export AZURE_TENANT_ID="00000000-0000-0000-0000-000000000000" # Directory (tenant) ID
providers:
- id: azure:chat:my-gpt-4o-deployment
config:
apiHost: 'your-resource.openai.azure.com'
Using inline config (useful for local testing):
providers:
- id: azure:chat:my-gpt-4o-deployment
config:
apiHost: 'your-resource.openai.azure.com'
azureClientId: '00000000-0000-0000-0000-000000000000'
azureClientSecret: 'your-client-secret-value'
azureTenantId: '00000000-0000-0000-0000-000000000000'
azureAuthorityHost: 'https://login.microsoftonline.com' # Optional
azureTokenScope: 'https://cognitiveservices.azure.com/.default' # Optional
When client credentials are provided, promptfoo uses the @azure/identity library to create a ClientSecretCredential and requests an access token scoped to Azure Cognitive Services (https://cognitiveservices.azure.com/.default). The token is then sent as a Bearer token in the Authorization header instead of an API key.
If neither an API key nor client credentials are provided, promptfoo falls back to AzureCliCredential (i.e., your az login session) — see Option 3.
The azureAuthorityHost defaults to https://login.microsoftonline.com if not specified. The azureTokenScope defaults to https://cognitiveservices.azure.com/.default, the scope required to authenticate with Azure Cognitive Services. You typically don't need to change these unless you're working with a sovereign cloud (e.g., Azure Government or Azure China).
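For reference, the token flow is roughly equivalent to this sketch with `@azure/identity` (defaults shown explicitly; this is not promptfoo's exact code):

```javascript
import { ClientSecretCredential } from '@azure/identity';

const credential = new ClientSecretCredential(
  process.env.AZURE_TENANT_ID, // azureTenantId
  process.env.AZURE_CLIENT_ID, // azureClientId
  process.env.AZURE_CLIENT_SECRET, // azureClientSecret
  { authorityHost: 'https://login.microsoftonline.com' }, // azureAuthorityHost
);

// Request a token for the Cognitive Services scope (azureTokenScope)
const { token } = await credential.getToken('https://cognitiveservices.azure.com/.default');

// promptfoo then sends: Authorization: Bearer <token>
```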
Model-graded assertions such as factuality or llm-rubric use gpt-5 by default. When AZURE_DEPLOYMENT_NAME is set (and OPENAI_API_KEY is not), promptfoo automatically uses the specified Azure deployment for grading. You can also explicitly override the grader as shown below.
The easiest way to do this for all your test cases is to add the defaultTest property to your config:
defaultTest:
options:
provider:
id: azure:chat:gpt-4o-deployment
config:
apiHost: 'xxxxxxx.openai.azure.com'
However, you can also do this for individual assertions:
# ...
assert:
- type: llm-rubric
value: Do not mention that you are an AI or chat assistant
provider:
id: azure:chat:xxxx
config:
apiHost: 'xxxxxxx.openai.azure.com'
Or individual tests:
# ...
tests:
- vars:
# ...
options:
provider:
id: azure:chat:xxxx
config:
apiHost: 'xxxxxxx.openai.azure.com'
assert:
- type: llm-rubric
value: Do not mention that you are an AI or chat assistant
When you have tests that use both text-based assertions (like llm-rubric, answer-relevance) and embedding-based assertions (like similar), you can configure different Azure deployments for each type using the provider type map pattern:
defaultTest:
options:
provider:
# Text provider for llm-rubric, answer-relevance, factuality, etc.
text:
id: azure:chat:o4-mini-deployment
config:
apiHost: 'text-models.openai.azure.com'
# Embedding provider for similarity assertions
embedding:
id: azure:embedding:text-embedding-3-large
config:
apiHost: 'embedding-models.openai.azure.com'
The similar assertion type requires an embedding model such as text-embedding-3-large or text-embedding-3-small. Be sure to specify a deployment with an embedding model, not a chat model, when overriding the grader.
For example, override the embedding deployment in your config:
defaultTest:
options:
provider:
embedding:
id: azure:embedding:text-embedding-3-small-deployment
config:
apiHost: 'your-resource.openai.azure.com'
You may also specify data_sources to integrate with the Azure AI Search API.
providers:
- id: azure:chat:deploymentNameHere
config:
apiHost: 'xxxxxxxx.openai.azure.com'
deployment_id: 'abc123'
data_sources:
- type: azure_search
parameters:
endpoint: https://xxxxxxxx.search.windows.net
index_name: index123
authentication:
type: api_key
key: ''
:::note
For legacy Azure OpenAI API versions before 2024-02-15-preview, you can instead specify `deployment_id` and `dataSources` (note the camelCase) to integrate with the Azure AI Search API:
providers:
- id: azure:chat:deploymentNameHere
config:
apiHost: 'xxxxxxxx.openai.azure.com'
deployment_id: 'abc123'
dataSources:
- type: AzureCognitiveSearch
parameters:
endpoint: '...'
key: '...'
indexName: '...'
:::
These properties can be set under the provider config key:
| Name | Description |
|---|---|
| apiHost | API host (e.g., yourresource.openai.azure.com) |
| apiBaseUrl | Base URL of the API (used instead of host) |
| apiKey | API key for authentication |
| apiVersion | API version. Use 2024-10-21 or newer for vision support |
| Name | Description |
|---|---|
| azureClientId | Azure identity client ID |
| azureClientSecret | Azure identity client secret |
| azureTenantId | Azure identity tenant ID |
| azureAuthorityHost | Azure identity authority host |
| azureTokenScope | Azure identity token scope |
| deployment_id | Azure cognitive services deployment ID |
| dataSources | Azure cognitive services parameter for specifying data sources |
| Name | Description |
|---|---|
| o1 | Set to true if your Azure deployment uses an o1 model. (Deprecated, use isReasoningModel instead) |
| isReasoningModel | Set to true if your Azure deployment uses a reasoning model (o1, o3, o3-mini, o4-mini). Required for reasoning models |
| max_completion_tokens | Maximum tokens to generate for reasoning models. Only used when isReasoningModel is true |
| reasoning_effort | Controls reasoning depth: 'low', 'medium', or 'high'. Only used when isReasoningModel is true |
| temperature | Controls randomness (0-2). Not supported for reasoning models |
| max_tokens | Maximum tokens to generate. Not supported for reasoning models |
| top_p | Controls nucleus sampling (0-1) |
| frequency_penalty | Penalizes repeated tokens (-2 to 2) |
| presence_penalty | Penalizes new tokens based on presence (-2 to 2) |
| omitDefaults | Omits hardcoded defaults unless values are explicitly set via config or environment variables. Supported by azure:chat and azure:responses. |
| best_of | Generates multiple outputs and returns the best |
| functions | Array of functions available for the model to call |
| function_call | Controls how the model calls functions |
| response_format | Specifies output format (e.g., { type: "json_object" }) |
| stop | Array of sequences where the model will stop generating |
| passthrough | Additional parameters to send with the request |
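For example, `passthrough` can forward request fields that promptfoo doesn't model directly (the `logit_bias` value below is purely illustrative):

```yaml
providers:
  - id: azure:chat:deploymentNameHere
    config:
      apiHost: 'xxxxxxxx.openai.azure.com'
      passthrough:
        logit_bias: { '50256': -100 } # illustrative extra request field
```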
Azure OpenAI supports reasoning models like o1, o3, o3-mini, and o4-mini. These models behave differently from standard models and have specific requirements:

- Use `max_completion_tokens` instead of `max_tokens`
- Don't set `temperature` (it's ignored)
- Optionally set the `reasoning_effort` parameter (`'low'`, `'medium'`, or `'high'`)

Since Azure allows custom deployment names that don't necessarily reflect the underlying model type, you must explicitly set the `isReasoningModel` flag to `true` in your configuration when using reasoning models. This works with both chat and completion endpoints:
# For chat endpoints
providers:
- id: azure:chat:my-o4-mini-deployment
config:
apiHost: 'xxxxxxxx.openai.azure.com'
# Set this flag to true for reasoning models (o1, o3, o3-mini, o4-mini)
isReasoningModel: true
# Use max_completion_tokens instead of max_tokens
max_completion_tokens: 25000
# Optional: Set reasoning effort (default is 'medium' unless omitDefaults is true)
reasoning_effort: 'medium'
# For completion endpoints
providers:
- id: azure:completion:my-o3-deployment
config:
apiHost: 'xxxxxxxx.openai.azure.com'
isReasoningModel: true
max_completion_tokens: 25000
reasoning_effort: 'high'
> Note: The `o1` flag is still supported for backward compatibility, but `isReasoningModel` is preferred as it more clearly indicates its purpose.
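For reference, a config still using the deprecated flag might look like this (equivalent to setting `isReasoningModel: true`):

```yaml
providers:
  - id: azure:chat:my-o1-deployment
    config:
      apiHost: 'xxxxxxxx.openai.azure.com'
      o1: true # deprecated; prefer isReasoningModel: true
      max_completion_tokens: 25000
```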
You can use variables in your configuration to dynamically adjust the reasoning effort based on your test cases:
# Configure different reasoning efforts based on test variables
prompts:
- 'Solve this complex math problem: {{problem}}'
providers:
- id: azure:chat:my-o4-mini-deployment
config:
apiHost: 'xxxxxxxx.openai.azure.com'
isReasoningModel: true
max_completion_tokens: 25000
# This will be populated from the test case variables
reasoning_effort: '{{effort_level}}'
tests:
- vars:
problem: 'What is the integral of x²?'
effort_level: 'low'
- vars:
problem: 'Prove the Riemann hypothesis'
effort_level: 'high'
If you encounter this error when using reasoning models:
API response error: unsupported_parameter Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead.
This means you're using a reasoning model without setting the isReasoningModel flag. Update your config as shown above.
Azure OpenAI supports vision-capable models like GPT-5.1, GPT-4o, and GPT-4.1 for image analysis.
providers:
- id: azure:chat:gpt-4o
config:
apiHost: 'your-resource-name.openai.azure.com'
apiVersion: '2024-10-21' # or newer for vision support
Vision models require a specific message format. Images can be provided as:
- HTTPS URLs
- Local `file://` paths (automatically converted to base64)
- Base64 data URLs: `data:image/jpeg;base64,YOUR_DATA`

prompts:
- |
[
{
"role": "user",
"content": [
{
"type": "text",
"text": "What do you see in this image?"
},
{
"type": "image_url",
"image_url": {
"url": "{{image_url}}"
}
}
]
}
]
tests:
- vars:
image_url: https://example.com/image.jpg # URL
- vars:
image_url: file://assets/image.jpg # Local file (auto base64)
- vars:
image_url: data:image/jpeg;base64,/9j/4A... # Base64
See the Azure OpenAI example for a complete working example with image analysis. Use promptfooconfig.vision.yaml for vision-specific features.
Azure AI Foundry exposes Claude through two endpoint families. Pick the one that matches how you want to manage the model.
Per Anthropic's own Foundry integration, every Claude deployment publishes a native Messages endpoint at https://<resource>.services.ai.azure.com/anthropic/v1/messages. Point promptfoo's anthropic:messages provider at that base URL and you get the full Anthropic provider feature set — adaptive thinking, xhigh effort, automatic Opus 4.7 temperature suppression, and consistent pricing across Anthropic/Bedrock/Vertex:
providers:
- id: anthropic:messages:claude-opus-4-7
config:
apiBaseUrl: 'https://<resource>.services.ai.azure.com/anthropic'
apiKey: '{{env.AZURE_FOUNDRY_API_KEY}}'
max_tokens: 1024
Promptfoo appends /v1/messages to the base URL automatically, so set apiBaseUrl to the https://…/anthropic prefix shown above.
The same deployment also accepts OpenAI-style chat completion requests. Use this if you want a single provider type across Azure Claude and Azure OpenAI deployments:
providers:
- id: azure:chat:claude-opus-4-7
config:
apiHost: 'your-deployment.services.ai.azure.com'
apiVersion: '2025-04-01-preview'
max_tokens: 4096
Opus 4.7 deployments automatically omit temperature from the request body on this path too.
Available Claude deployments on Azure AI Foundry:
| Model | Description |
|---|---|
| `claude-opus-4-7` | Claude Opus 4.7 |
| `claude-opus-4-6-20260205` | Claude Opus 4.6 |
| `claude-sonnet-4-6` | Claude Sonnet 4.6 |
| `claude-opus-4-5-20251101` | Claude Opus 4.5 |
| `claude-sonnet-4-5-20250929` | Claude Sonnet 4.5 |
| `claude-haiku-4-5-20251001` | Claude Haiku 4.5 |
| `claude-3-5-sonnet-20241022` | Claude 3.5 Sonnet |
| `claude-3-5-haiku-20241022` | Claude 3.5 Haiku |
Here's a complete example:

description: Azure Claude evaluation
providers:
- id: anthropic:messages:claude-opus-4-7
label: claude-opus-4-7
config:
apiBaseUrl: 'https://<resource>.services.ai.azure.com/anthropic'
apiKey: '{{env.AZURE_FOUNDRY_API_KEY}}'
max_tokens: 1024
effort: xhigh
prompts:
- 'Explain {{concept}} in simple terms.'
tests:
- vars:
concept: quantum computing
assert:
- type: contains-any
value: ['qubit', 'superposition']
Azure AI Foundry provides access to Meta's Llama models, including Llama 4:
providers:
- id: azure:chat:Llama-4-Maverick-17B-128E-Instruct-FP8
config:
apiHost: 'your-deployment.services.ai.azure.com'
apiVersion: '2025-04-01-preview'
max_tokens: 4096
Available Llama models include:
Llama-4-Maverick-17B-128E-Instruct-FP8 - Llama 4 Maverick (128 experts)Llama-4-Scout-17B-16E-Instruct - Llama 4 Scout (16 experts)Llama-3.3-70B-Instruct - Llama 3.3 70BMeta-Llama-3.1-405B-Instruct - Llama 3.1 405BMeta-Llama-3.1-70B-Instruct - Llama 3.1 70BMeta-Llama-3.1-8B-Instruct - Llama 3.1 8BAzure AI supports DeepSeek models such as DeepSeek-R1. Like other reasoning models, these require specific configuration:
- Set `isReasoningModel: true`
- Use `max_completion_tokens` instead of `max_tokens`

providers:
- id: azure:chat:DeepSeek-R1
config:
apiHost: 'your-deployment-name.services.ai.azure.com'
apiVersion: '2025-04-01-preview'
isReasoningModel: true
max_completion_tokens: 2048
reasoning_effort: 'medium' # Options: low, medium, high
For model-graded assertions, you can configure your defaultTest to use the same provider:
defaultTest:
options:
provider:
id: azure:chat:DeepSeek-R1
config:
apiHost: 'your-deployment-name.services.ai.azure.com'
apiVersion: '2025-04-01-preview'
isReasoningModel: true
max_completion_tokens: 2048
Adjust reasoning_effort to control response quality vs. speed: low for faster responses, medium for balanced performance (default), or high for more thorough reasoning on complex tasks.
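One way to measure that trade-off is to run the same deployment at two effort levels side by side and compare results:

```yaml
providers:
  - id: azure:chat:DeepSeek-R1
    label: deepseek-low-effort
    config:
      apiHost: 'your-deployment-name.services.ai.azure.com'
      apiVersion: '2025-04-01-preview'
      isReasoningModel: true
      max_completion_tokens: 2048
      reasoning_effort: 'low'
  - id: azure:chat:DeepSeek-R1
    label: deepseek-high-effort
    config:
      apiHost: 'your-deployment-name.services.ai.azure.com'
      apiVersion: '2025-04-01-preview'
      isReasoningModel: true
      max_completion_tokens: 2048
      reasoning_effort: 'high'
```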
To evaluate an OpenAI assistant on Azure:
First, install the `@azure/openai-assistants` package:

```sh
npm i @azure/openai-assistants
```

Then point the provider at your assistant:
providers:
- id: azure:assistant:asst_E4GyOBYKlnAzMi19SZF2Sn8I
config:
apiHost: yourdeploymentname.openai.azure.com
Replace the assistant ID and deployment name with your actual values.
Azure OpenAI Assistants support tool calling. Define tool schemas via tools and provide callback implementations via functionToolCallbacks to handle invocations.
providers:
- id: azure:assistant:your_assistant_id
config:
apiHost: your-resource-name.openai.azure.com
# Load function tool definition
tools: file://tools/weather-function.json
# Define function callback inline
functionToolCallbacks:
# Use an external file
get_weather: file://callbacks/weather.js:getWeather
# Or use an inline function
get_weather: |
async function(args) {
try {
const parsedArgs = JSON.parse(args);
const location = parsedArgs.location;
const unit = parsedArgs.unit || 'celsius';
// Function implementation...
return JSON.stringify({
location,
temperature: 22,
unit,
condition: 'sunny'
});
} catch (error) {
return JSON.stringify({ error: String(error) });
}
}
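The external file referenced above might look like this sketch (a hypothetical `callbacks/weather.js`; the `file://...:getWeather` syntax expects a named export):

```javascript
// callbacks/weather.js (hypothetical)
async function getWeather(args) {
  try {
    const { location, unit = 'celsius' } = JSON.parse(args);
    // Replace this stub with a real weather lookup
    return JSON.stringify({ location, temperature: 22, unit, condition: 'sunny' });
  } catch (error) {
    return JSON.stringify({ error: String(error) });
  }
}

module.exports = { getWeather };
```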
Azure OpenAI Assistants support vector stores for enhanced file search capabilities. To use a vector store:
providers:
- id: azure:assistant:your_assistant_id
config:
apiHost: your-resource-name.openai.azure.com
# Add tools for file search
tools:
- type: file_search
# Configure vector store IDs
tool_resources:
file_search:
vector_store_ids:
- 'your_vector_store_id'
# Optional parameters
temperature: 1
top_p: 1
apiVersion: '2025-04-01-preview'
Key requirements:
- Add a tool with `type: file_search`
- Provide the `tool_resources.file_search.vector_store_ids` array with your vector store IDs
- Set a recent `apiVersion` (recommended: 2025-04-01-preview or later)

Here's an example of a simple full assistant eval:
prompts:
- 'Write a tweet about {{topic}}'
providers:
- id: azure:assistant:your_assistant_id
config:
apiHost: your-resource-name.openai.azure.com
tests:
- vars:
topic: bananas
For complete working examples of Azure OpenAI Assistants with various tool configurations, check out the Azure Assistant example directory.
See the guide on How to evaluate OpenAI assistants for more information on how to compare different models, instructions, and more.
Azure AI Foundry Agents let promptfoo run an existing Foundry agent through the Azure AI Projects SDK (@azure/ai-projects) and the v2 agent runtime. Promptfoo resolves the agent from your Azure AI Foundry project, then calls the Responses API with an agent_reference.
| Feature | Azure Assistant | Azure Foundry Agent |
|---|---|---|
| API Type | Direct HTTP calls to Azure OpenAI API | Azure AI Projects SDK (@azure/ai-projects) + Responses API agent runtime |
| Authentication | API key or Azure credentials | DefaultAzureCredential (Azure CLI, environment variables, managed identity) |
| Endpoint | Azure OpenAI endpoint (*.openai.azure.com) | Azure AI Project URL (*.services.ai.azure.com/api/projects/*) |
| Provider Format | azure:assistant:<assistant_id> | azure:foundry-agent:<agent-name-or-id> |
| Execution Model | Threads/messages/runs | responses.create(..., { body: { agent: { name, type: "agent_reference" } } }) |
Install the required packages:

```sh
npm install @azure/ai-projects @azure/identity
```
Authenticate using one of these methods:
- Azure CLI: run `az login`
- Service principal environment variables (`AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`, `AZURE_TENANT_ID`)
- Managed identity

Set your Azure AI Project URL:
export AZURE_AI_PROJECT_URL="https://your-project.services.ai.azure.com/api/projects/your-project-id"
Alternatively, you can provide the projectUrl in your configuration file.
The provider uses the azure:foundry-agent:<agent-name-or-id> format. Agent names are preferred. Legacy IDs still work as a fallback lookup if the agent exists in the project.
providers:
- id: azure:foundry-agent:my-foundry-agent
config:
projectUrl: 'https://your-project.services.ai.azure.com/api/projects/your-project-id'
temperature: 0.7
max_tokens: 150
instructions: 'You are a helpful assistant that provides clear and concise answers.'
This provider references an existing Foundry agent. Some settings can still be sent per request through the Responses API, while agent-definition settings need to be configured on the agent itself.
Supported per-request settings:
| Parameter | Description |
|---|---|
| `projectUrl` | Azure AI Project URL (required; can also use the `AZURE_AI_PROJECT_URL` env var) |
| `instructions` | Additional per-request instructions |
| `temperature` | Controls randomness |
| `top_p` | Nucleus sampling parameter |
| `max_tokens` | Mapped to `max_output_tokens` for the Responses API |
| `max_completion_tokens` | Also mapped to `max_output_tokens` |
| `response_format` | Output format (`json_object` or `json_schema`) |
| `tools` | Tool definitions loaded into the request |
| `tool_choice` | Tool selection strategy |
| `functionToolCallbacks` | Callback implementations for `function_call` outputs |
| `modelName` | Optional per-request model override |
| `reasoning_effort` | Sent as `reasoning.effort` |
| `verbosity` | Passed through to the Responses text config |
| `metadata` | Request metadata |
| `passthrough` | Additional raw Responses API fields |
| `maxPollTimeMs` | Maximum time to keep resolving callback loops before timing out (default: 300000) |
Ignored per-request settings:
- `tool_resources`
- `frequency_penalty`
- `presence_penalty`
- `seed`
- `stop`
- `timeoutMs`
- `retryOptions`

Configure those on the Foundry agent definition itself instead of on the eval request.
Promptfoo can handle Responses API function_call outputs for Foundry agents. If every requested function has a configured callback, promptfoo executes the callbacks locally and sends function_call_output items back with previous_response_id.
You can define callbacks at the provider level or override them per prompt:
providers:
- id: azure:foundry-agent:my-foundry-agent
config:
projectUrl: 'https://your-project.services.ai.azure.com/api/projects/your-project-id'
tools: file://tools/weather-function.json
functionToolCallbacks:
get_current_weather: file://callbacks/weather.js:getCurrentWeather
get_forecast: |
async function(args) {
try {
const parsedArgs = JSON.parse(args);
const location = parsedArgs.location;
const days = parsedArgs.days || 7;
// Your implementation here
return JSON.stringify({
location,
forecast: [
{ day: 'Monday', temperature: 72, condition: 'sunny' },
{ day: 'Tuesday', temperature: 68, condition: 'cloudy' }
]
});
} catch (error) {
return JSON.stringify({ error: String(error) });
}
}
The function callbacks receive two parameters:
- `args`: JSON-encoded function arguments
- `context`: `{ threadId, runId, assistantId, provider }`

If a callback is missing, promptfoo returns the unresolved function call in the model output instead of trying to fake a tool result.
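A callback that uses both parameters might look like this sketch (illustrative only):

```javascript
// Hypothetical callback: args is a JSON string, context describes the run
async function getForecast(args, context) {
  const { location, days = 7 } = JSON.parse(args);
  console.log(`Resolving tool call for run ${context.runId} via ${context.provider}`);
  return JSON.stringify({ location, days, forecast: 'sunny' });
}

module.exports = { getForecast };
```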
Foundry agent tools such as file search and vector stores should be configured on the agent in Azure AI Foundry. The v2 runtime does not apply tool_resources from the eval request.
providers:
- id: azure:foundry-agent:my-foundry-agent
config:
projectUrl: 'https://your-project.services.ai.azure.com/api/projects/your-project-id'
tools:
- type: file_search
temperature: 1
top_p: 1
In that example, the request tells the runtime that file search is available, but the actual vector store bindings still need to live on the Foundry agent definition.
| Variable | Description |
|---|---|
| `AZURE_AI_PROJECT_URL` | Your Azure AI Project URL (can be overridden in config) |
| `AZURE_CLIENT_ID` | Azure service principal client ID (for service principal auth) |
| `AZURE_CLIENT_SECRET` | Azure service principal secret (for service principal auth) |
| `AZURE_TENANT_ID` | Azure tenant ID (for service principal auth) |
Here's a complete example configuration:
description: 'Azure Foundry Agent evaluation'
providers:
- id: azure:foundry-agent:my-foundry-agent
config:
projectUrl: 'https://my-project.services.ai.azure.com/api/projects/my-project-id'
temperature: 0.7
max_tokens: 150
instructions: 'You are a helpful assistant that provides clear and concise answers.'
prompts:
- '{{question}}'
tests:
- vars:
question: 'What is the capital of France?'
assert:
- type: contains
value: 'Paris'
- vars:
question: 'Explain what photosynthesis is in simple terms.'
assert:
- type: contains
value: 'plants'
- type: contains
value: 'sunlight'
The Azure Foundry Agent provider includes error handling for common failure modes; long-running callback loops time out after `maxPollTimeMs` (default: 300000 ms).

The provider supports caching to improve performance and reduce API calls. Caching is enabled by default. To configure it explicitly:
evaluateOptions:
cache: true
providers:
- id: azure:foundry-agent:my-foundry-agent
config:
projectUrl: 'https://your-project.services.ai.azure.com/api/projects/your-project-id'
Use Azure Foundry Agents when:

- Your agent lives in an Azure AI Foundry project and runs through the Azure AI Projects SDK
- You authenticate with Azure credentials (`DefaultAzureCredential`) rather than an API key

Use standard Azure Assistants when:

- Your assistant is hosted on an Azure OpenAI endpoint (`*.openai.azure.com`)
- You want direct Azure OpenAI API calls with API key authentication
For complete working examples, check out the Azure Foundry Agent example directory.
Azure AI Foundry provides access to OpenAI's Sora video generation model for text-to-video and image-to-video generation.
You'll need a Sora deployment in a supported region (e.g., `eastus2` or `swedencentral`).

providers:
- id: azure:video:sora
config:
apiBaseUrl: https://your-resource.cognitiveservices.azure.com
# Authentication (choose one):
apiKey: ${AZURE_API_KEY} # Or use AZURE_API_KEY env var
# Or use Entra ID (DefaultAzureCredential)
# Video parameters
width: 1280 # 480, 720, 854, 1080, 1280, 1920
height: 720 # 480, 720, 1080
n_seconds: 5 # 5, 10, 15, 20
# Polling
poll_interval_ms: 10000
max_poll_time_ms: 600000
| Size | Aspect Ratio |
|---|---|
| 480x480 | 1:1 (Square) |
| 720x720 | 1:1 (Square) |
| 1080x1080 | 1:1 (Square) |
| 854x480 | 16:9 (Landscape) |
| 1280x720 | 16:9 (Landscape) |
| 1920x1080 | 16:9 (Landscape) |
providers:
- azure:video:sora
prompts:
- 'A serene Japanese garden with koi fish swimming in a pond'
tests:
- vars: {}
assert:
- type: is-video
| Variable | Description |
|---|---|
| `AZURE_API_KEY` | Azure API key |
| `AZURE_API_BASE_URL` | Resource endpoint URL |
| `AZURE_CLIENT_ID` | Entra ID client ID (for service principal auth) |
| `AZURE_CLIENT_SECRET` | Entra ID client secret (for service principal auth) |
| `AZURE_TENANT_ID` | Entra ID tenant ID (for service principal auth) |