xAI (Grok)

The xai provider supports xAI's Grok models through an OpenAI-compatible API, covering text, vision, image generation, video generation, and voice workflows.

Setup

To use xAI's API, set the XAI_API_KEY environment variable or specify via apiKey in the configuration file.

sh
export XAI_API_KEY=your_api_key_here

When xAI is the selected fallback provider family, Promptfoo can use xAI defaults for grading, suggestions, synthesis, and web search. xAI does not currently expose a public embeddings or moderation API, so those defaults fall back to OpenAI when xAI is selected. Explicit provider IDs in your config still take precedence.
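For example, you can keep xAI as your main provider while explicitly pinning an OpenAI grader for model-graded assertions. The sketch below uses the `defaultTest.options.provider` override; the model IDs are illustrative, so adjust them to your setup:

```yaml
providers:
  - xai:grok-4.3

defaultTest:
  options:
    # An explicit grader takes precedence over the xAI defaults
    provider: openai:gpt-4o-mini
```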

Supported Models

The xAI provider includes support for the following model formats. xAI's public model catalog currently recommends grok-4.3 for general chat and coding workloads; consult the catalog when choosing a new default for a long-lived integration.

Grok 4.3 Models

  • xai:grok-4.3 - General-purpose reasoning model
  • xai:grok-4.3-latest - Alias for the Grok 4.3 family

Grok 4.20 Models

  • xai:grok-4.20-0309-reasoning - Reasoning model
  • xai:grok-4.20 - Alias for the Grok 4.20 reasoning family
  • xai:grok-4.20-reasoning - Alias for the Grok 4.20 reasoning family
  • xai:grok-4.20-reasoning-latest - Alias for the Grok 4.20 reasoning family
  • xai:grok-4.20-0309-non-reasoning - Non-reasoning variant
  • xai:grok-4.20-non-reasoning - Alias for the Grok 4.20 non-reasoning family
  • xai:grok-4.20-non-reasoning-latest - Alias for the Grok 4.20 non-reasoning family
  • xai:grok-4.20-multi-agent-0309 - Multi-agent variant
  • xai:grok-4.20-multi-agent - Alias for the Grok 4.20 multi-agent family
  • xai:grok-4.20-multi-agent-latest - Alias for the Grok 4.20 multi-agent family

Grok 4.1 Fast Models

  • xai:grok-4-1-fast-reasoning - Frontier model optimized for agentic tool calling with reasoning (2M context)
  • xai:grok-4-1-fast-non-reasoning - Fast variant for instant responses without reasoning (2M context)
  • xai:grok-4-1-fast - Alias for grok-4-1-fast-reasoning
  • xai:grok-4-1-fast-reasoning-latest - Alias for grok-4-1-fast-reasoning
  • xai:grok-4-1-fast-non-reasoning-latest - Alias for grok-4-1-fast-non-reasoning

Grok Code Fast Models

  • xai:grok-code-fast-1 - Speedy and economical reasoning model optimized for agentic coding (256K context)
  • xai:grok-code-fast - Alias for grok-code-fast-1
  • xai:grok-code-fast-1-0825 - Specific version of the code-fast model (256K context)

Grok-4 Fast Models

  • xai:grok-4-fast-reasoning - Fast reasoning model with 2M context window
  • xai:grok-4-fast-non-reasoning - Fast non-reasoning model for instant responses (2M context)
  • xai:grok-4-fast - Alias for grok-4-fast-reasoning
  • xai:grok-4-fast-reasoning-latest - Alias for grok-4-fast-reasoning
  • xai:grok-4-fast-non-reasoning-latest - Alias for grok-4-fast-non-reasoning

Grok-4 Models

  • xai:grok-4-0709 - Flagship reasoning model (256K context)
  • xai:grok-4 - Alias for latest Grok-4 model
  • xai:grok-4-latest - Alias for latest Grok-4 model

Grok-3 Models

  • xai:grok-3-beta - Latest flagship model for enterprise tasks (131K context)
  • xai:grok-3-fast-beta - Fastest flagship model (131K context)
  • xai:grok-3-mini-beta - Smaller model for basic tasks, supports reasoning effort (32K context)
  • xai:grok-3-mini-fast-beta - Faster mini model, supports reasoning effort (32K context)
  • xai:grok-3 - Alias for grok-3-beta
  • xai:grok-3-latest - Alias for grok-3-beta
  • xai:grok-3-fast - Alias for grok-3-fast-beta
  • xai:grok-3-fast-latest - Alias for grok-3-fast-beta
  • xai:grok-3-mini - Alias for grok-3-mini-beta
  • xai:grok-3-mini-latest - Alias for grok-3-mini-beta
  • xai:grok-3-mini-fast - Alias for grok-3-mini-fast-beta
  • xai:grok-3-mini-fast-latest - Alias for grok-3-mini-fast-beta

Grok-2 and Earlier Models

  • xai:grok-2-latest - Latest Grok-2 model (131K context)
  • xai:grok-2-vision-latest - Latest Grok-2 vision model (32K context)
  • xai:grok-2-vision-1212
  • xai:grok-2-1212
  • xai:grok-beta - Beta version (131K context)
  • xai:grok-vision-beta - Vision beta version (8K context)


Configuration

The provider supports all OpenAI provider configuration options plus Grok-specific options. Example usage:

yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:grok-3-mini-beta
    config:
      temperature: 0.7
      reasoning_effort: 'high' # Only for grok-3-mini models
      apiKey: your_api_key_here # Alternative to XAI_API_KEY

Reasoning Support

Multiple Grok models support reasoning capabilities:

Grok 4.3: General-purpose reasoning model recommended by xAI's public model catalog. It reasons automatically and does not support reasoning_effort.

Grok Code Fast Models: The grok-code-fast-1 family are reasoning models optimized for agentic coding workflows. They support:

  • Function calling and tool usage
  • Web search via search_parameters
  • Fast inference with built-in reasoning

Grok 4.3 Specific Behavior

Grok 4.3 is the best starting point for general text workflows:

  • Responses API recommended: Use xai:responses:grok-4.3 for server-side tools, multi-turn state, and newer xAI capabilities
  • Automatic reasoning: No reasoning_effort parameter is required or supported
  • Unsupported parameters: Same restrictions as other Grok 4-family reasoning models (presence_penalty, frequency_penalty, stop, reasoning_effort)
yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:grok-4.3
    config:
      temperature: 0.7
      max_completion_tokens: 4096

Grok-3 Models: The grok-3-mini-beta and grok-3-mini-fast-beta models support reasoning through the reasoning_effort parameter:

  • reasoning_effort: "low" - Minimal thinking time, using fewer tokens for quick responses
  • reasoning_effort: "high" - Maximum thinking time, leveraging more tokens for complex problems

:::info

For Grok-3, reasoning is only available for the mini variants. The standard grok-3-beta and grok-3-fast-beta models do not support reasoning.

:::
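To compare the two settings side by side, you can register the same mini model twice with different reasoning_effort values. This is a sketch; the label values are arbitrary:

```yaml
providers:
  - id: xai:grok-3-mini-beta
    label: grok-3-mini-low
    config:
      reasoning_effort: 'low' # Quick responses, fewer reasoning tokens
  - id: xai:grok-3-mini-beta
    label: grok-3-mini-high
    config:
      reasoning_effort: 'high' # More thinking time for complex problems
```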

Grok 4.1 Fast Specific Behavior

Grok 4.1 Fast is xAI's frontier model specifically optimized for agentic tool calling:

  • Two variants: grok-4-1-fast-reasoning for maximum intelligence, grok-4-1-fast-non-reasoning for instant responses
  • Massive context window: 2,000,000 tokens for handling complex multi-turn agent interactions
  • Optimized for tool calling: Trained specifically for high-performance agentic tool calling via RL in simulated environments
  • Low latency and cost: $0.20/1M input tokens, $0.50/1M output tokens with fast inference
  • Unsupported parameters: Same restrictions as Grok-4 (no presence_penalty, frequency_penalty, stop, reasoning_effort)
yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:grok-4-1-fast-reasoning
    config:
      temperature: 0.7
      max_completion_tokens: 4096

Grok-4 Fast Specific Behavior

Grok-4 Fast models offer the same capabilities as Grok-4 but with faster inference and lower cost:

  • Two variants: grok-4-fast-reasoning for reasoning tasks, grok-4-fast-non-reasoning for instant responses
  • 2M context window: Same large context as Grok 4.1 Fast
  • Same parameter restrictions as Grok-4: No presence_penalty, frequency_penalty, stop, or reasoning_effort
yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:grok-4-fast-reasoning
    config:
      temperature: 0.7
      max_completion_tokens: 4096

Grok-4 Specific Behavior

Grok-4 introduces significant changes compared to previous Grok models:

  • Always uses reasoning: Grok-4 is a reasoning model that always operates at maximum reasoning capacity
  • No reasoning_effort parameter: Unlike Grok-3 mini models, Grok-4 does not support the reasoning_effort parameter
  • Unsupported parameters: The following parameters are not supported and will be automatically filtered out:
    • presencePenalty / presence_penalty
    • frequencyPenalty / frequency_penalty
    • stop
  • Larger context window: 256,000 tokens compared to 131,072 for Grok-3 models
  • Uses max_completion_tokens: As a reasoning model, Grok-4 uses max_completion_tokens instead of max_tokens
yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:grok-4
    config:
      temperature: 0.7
      max_completion_tokens: 4096

Grok Code Fast Specific Behavior

The Grok Code Fast models are optimized for agentic coding workflows and offer several key features:

  • Built for Speed: Designed to be highly responsive for agentic coding tools where multiple tool calls are common
  • Economical Pricing: At $0.20/1M input tokens and $1.50/1M output tokens, significantly more affordable than flagship models
  • Reasoning Capabilities: Built-in reasoning for code analysis, debugging, and problem-solving
  • Tool Integration: Excellent support for function calling, tool usage, and web search
  • Coding Expertise: Particularly adept at TypeScript, Python, Java, Rust, C++, and Go
yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:grok-code-fast-1
    # or use the alias:
    # - id: xai:grok-code-fast
    config:
      temperature: 0.1 # Lower temperature often preferred for coding tasks
      max_completion_tokens: 4096
      search_parameters:
        mode: auto # Enable web search for coding assistance

Region Support

You can specify a region to use a region-specific API endpoint:

yaml
providers:
  - id: xai:grok-4.3
    config:
      region: eu-west-1 # Will use https://eu-west-1.api.x.ai/v1

This is equivalent to setting base_url="https://eu-west-1.api.x.ai/v1" in the Python client. The same region option is also accepted by the xAI image, video, Responses, and realtime voice providers.

xAI's public regional docs say the global endpoint automatically routes requests and gives access to every model available to your team. The current public model catalog shows xAI's language, Grok Imagine image, and Grok Imagine video models in both us-east-1 and eu-west-1. Regional endpoints are useful for data-residency requirements, but xAI warns that not every model is guaranteed in every region over time; for the latest region-by-region availability, use the xAI Console or the model pages on xAI's site.

Live Search (Beta)

:::warning

xAI's current documentation recommends the Responses API for server-side tools. Promptfoo still passes legacy search_parameters through for older configs, but new search configs should use the Agent Tools API.

:::

Legacy configs can still pass a search_parameters object. The mode field controls how search is used:

  • off – Disable search
  • auto – Model decides when to search (default)
  • on – Always perform live search

Additional fields like sources, from_date, to_date, and return_citations may also be provided.

yaml
providers:
  - id: xai:grok-3-beta
    config:
      search_parameters:
        mode: auto
        return_citations: true
        sources:
          - type: web

For a full list of options see the xAI documentation.
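If you need to restrict search results to a date range, the legacy fields listed above can be combined in one config. This is a sketch; the dates are illustrative:

```yaml
providers:
  - id: xai:grok-3-beta
    config:
      search_parameters:
        mode: on # Always perform live search
        from_date: '2025-01-01' # ISO8601
        to_date: '2025-06-30'
        return_citations: true
```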

Agent Tools API (Responses API)

Use the xai:responses:<model> provider to access xAI's Agent Tools API, which enables autonomous server-side tool execution for web search, X search, code execution, collections search, and remote MCP tools.

yaml
providers:
  - id: xai:responses:grok-4.3
    config:
      temperature: 0.7
      max_output_tokens: 4096
      tools:
        - type: web_search
        - type: x_search

Available Agent Tools

| Tool | Description |
| --- | --- |
| web_search | Search the web and browse pages |
| x_search | Search X posts, users, and threads |
| code_execution / code_interpreter | Execute Python code in a sandbox |
| collections_search / file_search | Search uploaded knowledge bases |
| mcp | Connect to remote MCP servers |

Web Search Tool

yaml
tools:
  - type: web_search
    filters:
      allowed_domains:
        - example.com
        - news.com
      # OR excluded_domains (cannot use both)
    enable_image_understanding: true

X Search Tool

yaml
tools:
  - type: x_search
    from_date: '2025-01-01' # ISO8601 format
    to_date: '2025-11-27'
    allowed_x_handles:
      - elonmusk
    enable_image_understanding: true
    enable_video_understanding: true

Code Interpreter Tool

yaml
tools:
  - type: code_interpreter
    container:
      pip_packages:
        - numpy
        - pandas

Complete Example

yaml
providers:
  - id: xai:responses:grok-4.3
    config:
      temperature: 0.7
      tools:
        - type: web_search
          enable_image_understanding: true
        - type: x_search
          from_date: '2025-01-01'
        - type: code_interpreter
          container:
            pip_packages:
              - numpy
      tool_choice: auto # auto, required, or none
      parallel_tool_calls: true

tests:
  - vars:
      question: What's the latest AI news? Search the web and X.
    assert:
      - type: contains
        value: AI

Responses API Configuration

| Parameter | Type | Description |
| --- | --- | --- |
| temperature | number | Sampling temperature (0-2) |
| max_output_tokens | number | Maximum tokens to generate |
| max_tool_calls | number | Maximum tool calls for one request |
| top_p | number | Nucleus sampling parameter |
| tools | array | Agent tools to enable |
| tool_choice | string | Tool selection mode: auto, required, none |
| parallel_tool_calls | boolean | Allow parallel tool execution |
| stream | boolean | Request streamed response deltas |
| instructions | string | System-level instructions |
| previous_response_id | string | For multi-turn conversations |
| store | boolean | Store response for later retrieval |
| include | array | Additional response data to return |
| reasoning | object | Multi-agent configuration where supported |
| response_format | object | JSON schema for structured output |
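As a sketch of how several of these parameters combine in one provider entry (the instructions text and cap values here are illustrative, not recommendations):

```yaml
providers:
  - id: xai:responses:grok-4.3
    config:
      instructions: 'Answer concisely and cite sources.'
      max_output_tokens: 1024
      max_tool_calls: 5
      tools:
        - type: web_search
      store: true # Keep the response so a later turn can reference it via previous_response_id
```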

Supported Models

The Responses API works with current Grok models, including:

  • grok-4.3
  • grok-4.20-reasoning
  • grok-4.20-non-reasoning
  • grok-4.20-multi-agent
  • grok-4-1-fast-reasoning (recommended for agentic workflows)
  • grok-4-1-fast-non-reasoning
  • grok-4-fast-reasoning
  • grok-4-fast-non-reasoning
  • grok-4

Migration from Live Search

If you're using Live Search via search_parameters, migrate to the Responses API:

Before (Live Search - deprecated):

yaml
providers:
  - id: xai:grok-4-1-fast-reasoning
    config:
      search_parameters:
        mode: auto
        sources:
          - type: web
          - type: x

After (Responses API):

yaml
providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: web_search
        - type: x_search

Deferred Chat Completions

:::info Not Yet Supported

xAI offers Deferred Chat Completions for long-running requests that can be retrieved asynchronously via a request_id. This feature is not yet supported in promptfoo. For async workflows, use the xAI Python SDK directly.

:::

Function Calling

xAI supports standard OpenAI-compatible function calling for client-side tool execution:

yaml
providers:
  - id: xai:grok-4-1-fast-reasoning
    config:
      tools:
        - type: function
          function:
            name: get_weather
            description: Get the current weather for a location
            parameters:
              type: object
              properties:
                location:
                  type: string
                  description: City and state
              required:
                - location

Structured Outputs

xAI supports structured outputs via JSON schema:

yaml
providers:
  - id: xai:grok-4
    config:
      response_format:
        type: json_schema
        json_schema:
          name: analysis_result
          strict: true
          schema:
            type: object
            properties:
              summary:
                type: string
              confidence:
                type: number
            required:
              - summary
              - confidence
            additionalProperties: false

You can also load schemas from external files:

yaml
config:
  response_format: file://./schemas/analysis-schema.json

Nested file references and variable rendering are supported (see OpenAI documentation for details).
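Mirroring the inline example above, the referenced file would look roughly like this. This is a sketch; depending on your setup, the file may need to contain the full response_format object as shown here:

```json
{
  "type": "json_schema",
  "json_schema": {
    "name": "analysis_result",
    "strict": true,
    "schema": {
      "type": "object",
      "properties": {
        "summary": { "type": "string" },
        "confidence": { "type": "number" }
      },
      "required": ["summary", "confidence"],
      "additionalProperties": false
    }
  }
}
```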

Vision Support

For models with vision capabilities, you can include images in your prompts using the same format as OpenAI. Create a prompt.yaml file:

yaml
- role: user
  content:
    - type: image_url
      image_url:
        url: '{{image_url}}'
        detail: 'high'
    - type: text
      text: '{{question}}'

Then reference it in your promptfoo config:

yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
prompts:
  - file://prompt.yaml

providers:
  - id: xai:grok-2-vision-latest

tests:
  - vars:
      image_url: 'https://example.com/image.jpg'
      question: "What's in this image?"

Embeddings

xAI does not currently expose a public embeddings API. Use the OpenAI provider (or another embedding provider) for similarity assertions.
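For example, a similar assertion can pin an embedding provider explicitly at the assertion level. This is a sketch; the embedding model ID and threshold are illustrative:

```yaml
providers:
  - xai:grok-4.3

tests:
  - vars:
      question: 'Summarize the plot of Hamlet.'
    assert:
      - type: similar
        value: "A Danish prince seeks revenge for his father's murder."
        threshold: 0.8
        provider: openai:embedding:text-embedding-3-small
```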

Image Generation

xAI also supports image generation through Grok Imagine:

yaml
providers:
  - xai:image:grok-imagine-image

Current Grok Imagine image model IDs include:

  • xai:image:grok-imagine-image
  • xai:image:grok-imagine-image-pro

Example configuration for image generation:

yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
prompts:
  - 'A {{style}} painting of {{subject}}'

providers:
  - id: xai:image:grok-imagine-image
    config:
      n: 1 # Number of images to generate (1-10)
      response_format: 'url' # 'url' or 'b64_json'
      aspect_ratio: '16:9'
      resolution: '2k'

tests:
  - vars:
      style: 'impressionist'
      subject: 'sunset over mountains'

Image Editing

Use the same provider with image, images, or mask inputs to call xAI's image-editing endpoint:

yaml
providers:
  - id: xai:image:grok-imagine-image
    config:
      image:
        url: 'https://example.com/source.png'
      mask:
        url: 'https://example.com/mask.png'
      quality: 'high'

prompts:
  - 'Render this as a pencil sketch with detailed shading'

Video Generation

xAI supports video generation through the Grok Imagine API using the xai:video:grok-imagine-video provider:

yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
prompts:
  - 'Generate a video of: {{scene}}'

providers:
  - id: xai:video:grok-imagine-video
    config:
      duration: 5 # 1-15 seconds
      aspect_ratio: '16:9'
      resolution: '720p'

tests:
  - vars:
      scene: a cat playing with yarn
    assert:
      - type: cost
        threshold: 1.0

Configuration Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| duration | number | 8 | Video length in seconds (1-15) |
| aspect_ratio | string | 16:9 | Aspect ratio: 16:9, 4:3, 1:1, 9:16, 3:4, 3:2, 2:3 |
| resolution | string | 720p | Output resolution: 720p, 480p |
| reference_images | array | - | Reference images for reference-to-video mode |
| poll_interval_ms | number | 10000 | Polling interval in milliseconds |
| max_poll_time_ms | number | 600000 | Maximum wait time (10 minutes) |

Image-to-Video

Animate a static image by providing an image URL:

yaml
providers:
  - id: xai:video:grok-imagine-video
    config:
      image:
        url: 'https://example.com/image.jpg'
      duration: 5

Video Editing

Edit an existing video with text instructions:

yaml
providers:
  - id: xai:video:grok-imagine-video
    config:
      video:
        url: 'https://example.com/source-video.mp4'

prompts:
  - 'Make the colors more vibrant and add slow motion'

:::note

Video editing skips duration, aspect ratio, and resolution validation since these are determined by the source video.

:::

Reference-to-Video

Guide generation with up to seven reference images:

yaml
providers:
  - id: xai:video:grok-imagine-video
    config:
      reference_images:
        - url: 'https://example.com/person.jpg'
        - url: 'https://example.com/shirt.jpg'
      duration: 10

Reference-to-video requires a non-empty prompt, cannot be combined with image or video, and is limited to 10 seconds.

Pricing

Promptfoo uses the exact usage.cost_in_usd_ticks value returned by xAI when available, and falls back to the legacy estimate only when the API omits usage.

Voice Agent API

The xAI Voice Agent API enables real-time voice conversations with Grok models via WebSocket. Use the xai:voice:<model> provider format.

yaml
providers:
  - xai:voice:grok-voice-think-fast-1.0

Configuration

yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: xai:voice:grok-voice-think-fast-1.0
    config:
      voice: 'Ara' # Ara, Rex, Sal, Eve, or Leo
      instructions: 'You are a helpful voice assistant.'
      modalities: ['text', 'audio']
      turn_detection:
        type: server_vad
        threshold: 0.85
        silence_duration_ms: 500
        prefix_padding_ms: 333
      websocketTimeout: 60000 # Connection timeout in ms
      tools:
        - type: web_search
        - type: x_search

Available Voices

| Voice | Description |
| --- | --- |
| Ara | Female voice |
| Rex | Male voice |
| Sal | Male voice |
| Eve | Female voice |
| Leo | Male voice |

Turn Detection

Use turn_detection to tune server-side voice activity detection:

| Option | Type | Description |
| --- | --- | --- |
| type | string | server_vad for automatic detection |
| threshold | number | Activation threshold from 0.1 to 0.9 |
| silence_duration_ms | number | Silence required before ending the turn |
| prefix_padding_ms | number | Audio kept before detected speech to avoid clipping |

Built-in Tools

The Voice API includes server-side tools that execute automatically:

| Tool | Description |
| --- | --- |
| web_search | Search the web for information |
| x_search | Search posts on X (Twitter) |
| file_search | Search uploaded files in vector stores |
yaml
tools:
  - type: web_search
  - type: x_search
    allowed_x_handles:
      - elonmusk
      - xai
  - type: file_search
    vector_store_ids:
      - vs-123
    max_num_results: 10

Custom Function Tools and Assertions

You can define custom function tools inline or load them from external files:

yaml
providers:
  - id: xai:voice:grok-voice-think-fast-1.0
    config:
      # Inline tool definition
      tools:
        - type: function
          name: set_volume
          description: Set the device volume level
          parameters:
            type: object
            properties:
              level:
                type: number
                description: Volume level from 0 to 100
            required:
              - level

      # Or load from external file (YAML or JSON)
      # tools: file://tools.yaml

tests:
  - vars:
      question: 'Set the volume to 50 percent'
    assert:
      # Check that the correct function was called with correct arguments
      - type: javascript
        value: |
          const calls = output.functionCalls || [];
          return calls.some(c => c.name === 'set_volume' && c.arguments?.level === 50);

      # Or use tool-call-f1 for function name matching
      - type: tool-call-f1
        value: ['set_volume']
        threshold: 1.0

External tools file example:

yaml
- type: function
  name: get_weather
  description: Get the current weather for a location
  parameters:
    type: object
    properties:
      location:
        type: string
    required:
      - location

- type: function
  name: set_reminder
  description: Set a reminder for the user
  parameters:
    type: object
    properties:
      message:
        type: string
      time:
        type: string
    required:
      - message
      - time

When function tools are used, the provider output includes a functionCalls array with:

  • name: The function name that was called
  • arguments: The parsed arguments object
  • result: The result returned by your function handler (if provided)

Custom Endpoint Configuration

You can configure a custom WebSocket endpoint for the Voice API, useful for proxies or regional endpoints:

yaml
providers:
  - id: xai:voice:grok-voice-think-fast-1.0
    config:
      # Option 1: Full base URL (transforms https:// to wss://)
      apiBaseUrl: 'https://my-proxy.example.com/v1'

      # Option 2: Host only (builds https://{host}/v1)
      # apiHost: 'my-proxy.example.com'

You can also use the XAI_API_BASE_URL environment variable:

sh
export XAI_API_BASE_URL=https://my-proxy.example.com/v1

URL transformation: The provider automatically converts HTTP URLs to WebSocket URLs (https:// becomes wss://, http:// becomes ws://) and appends /realtime to reach the Voice API endpoint.

Complete WebSocket URL Override

For advanced use cases like local testing, custom proxies, or endpoints requiring query parameters, you can provide a complete WebSocket URL that will be used exactly as specified without any transformation:

yaml
providers:
  - id: xai:voice:grok-voice-think-fast-1.0
    config:
      # Use this URL exactly as-is (no transformation applied)
      websocketUrl: 'wss://custom-endpoint.example.com/path?token=xyz&session=abc'

This is useful for:

  • Local development and testing with mock servers
  • Custom proxy configurations
  • Adding authentication tokens or session IDs as URL parameters
  • Using alternative WebSocket gateways or regional endpoints

Audio Configuration

Configure input/output audio formats:

yaml
config:
  audio:
    input:
      format:
        type: audio/pcm
        rate: 24000
    output:
      format:
        type: audio/pcm
        rate: 24000

Supported formats: audio/pcm, audio/pcmu, audio/pcma

Supported sample rates: 8000, 16000, 22050, 24000, 32000, 44100, 48000 Hz

Complete Example

yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
prompts:
  - file://input.json

providers:
  - id: xai:voice:grok-voice-think-fast-1.0
    config:
      voice: 'Ara'
      instructions: 'You are a helpful voice assistant.'
      modalities: ['text', 'audio']
      tools:
        - type: web_search

tests:
  - vars:
      question: 'What are the latest AI developments?'
    assert:
      - type: llm-rubric
        value: Provides information about recent AI news

Pricing

The Voice Agent API is billed at $0.05 per minute of connection time.

For more information on the available models and API usage, refer to the xAI documentation.

Examples

For examples demonstrating text generation, image creation, and web search, see the xai example.

bash
npx promptfoo@latest init --example xai/chat

For real-time voice conversations with Grok, see the xai-voice example.

bash
npx promptfoo@latest init --example xai/voice

Troubleshooting

502 Bad Gateway Errors

If you encounter 502 Bad Gateway errors when using the xAI provider, this typically indicates:

  • An invalid or missing API key
  • Server issues on x.ai's side

The xAI provider surfaces descriptive error messages to help you diagnose these issues.

Solution: Verify your XAI_API_KEY environment variable is set correctly. You can obtain an API key from https://x.ai/.

Controlling Retries

If you're experiencing timeouts or want to control retry behavior:

  • To disable retries for 5XX errors: PROMPTFOO_RETRY_5XX=false
  • To reduce retry delays: PROMPTFOO_REQUEST_BACKOFF_MS=1000 (in milliseconds)
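For example, to fail fast during local debugging, you might set both variables before running an eval. This is a sketch; adjust the values to your environment:

```sh
# Disable retries on 5XX errors and shorten the backoff for quicker feedback
export PROMPTFOO_RETRY_5XX=false
export PROMPTFOO_REQUEST_BACKOFF_MS=1000
npx promptfoo@latest eval
```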
