examples/xai/chat/README.md
This example demonstrates how to evaluate xAI's Grok models across their main capabilities: text generation with reasoning, image creation, and server-side search tools.
You can run this example with:
npx promptfoo@latest init --example xai/chat
cd xai/chat
This example requires the following environment variable:
XAI_API_KEY - Your xAI API key. You can obtain this from the xAI Console.
# Set your API key
export XAI_API_KEY=your_api_key_here
# Run the main evaluation
promptfoo eval
# View results in the web interface
promptfoo view
This example includes configurations to test different Grok capabilities:
promptfooconfig.yaml - Mathematical reasoning with Grok 4.3, Grok 4.20, Grok 4.1 Fast, Grok 4 Fast, Grok 4, and Grok 3 models
promptfooconfig.images.yaml - Artistic image creation using Grok's image models
promptfooconfig.search.yaml - Real-time web and X search using the Responses API
promptfooconfig.responses.yaml - Autonomous web and X search using Agent Tools
promptfooconfig.promptfoo-search.yaml - Responses API search with assertions example
# Text generation with mathematical reasoning
promptfoo eval -c promptfooconfig.yaml
# Image generation with artistic styles
promptfoo eval -c promptfooconfig.images.yaml
# Search tools with web and X sources
promptfoo eval -c promptfooconfig.search.yaml
# Agent Tools with Responses API (recommended)
promptfoo eval -c promptfooconfig.responses.yaml
# Search demo with assertions
promptfoo eval -c promptfooconfig.promptfoo-search.yaml
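For orientation, a text-generation configuration of the kind run above might look roughly like the following sketch. This is not the actual contents of promptfooconfig.yaml: the description, prompt wording, test problem, and assertion are illustrative assumptions, while the provider IDs are taken from the model list in this README.

```yaml
# Hypothetical sketch, not the shipped promptfooconfig.yaml
description: Grok math reasoning comparison (illustrative)

prompts:
  - 'Solve this problem step by step, then state the final answer: {{problem}}'

# Each provider answers the same prompt so results can be compared side by side
providers:
  - xai:grok-4.3
  - xai:grok-4-1-fast-reasoning

tests:
  - vars:
      problem: 'What is 17 * 23?'
    assert:
      # 17 * 23 = 391; passes if the answer appears anywhere in the output
      - type: contains
        value: '391'
```

Saved under a name of your choice, a file like this would run with `promptfoo eval -c <file>`, the same way as the configurations listed above.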
The recommended starting point for general text workflows:
xai:grok-4.3 - General-purpose reasoning model
xai:responses:grok-4.3 - Recommended form for server-side tools
xai:grok-4.20-reasoning - Reasoning model
xai:grok-4.20-non-reasoning - Non-reasoning model
xai:grok-4.20-multi-agent - Multi-agent variant
A frontier model optimized for agentic tool calling with a 2M context window:
xai:grok-4-1-fast-reasoning - Maximum intelligence with reasoning
xai:grok-4-1-fast-non-reasoning - Fast responses without reasoning
Fast reasoning models with 2M context:
xai:grok-4-fast-reasoning - Reasoning variant
xai:grok-4-fast-non-reasoning - Non-reasoning variant
Flagship reasoning model:
xai:grok-4 - Full reasoning capabilities
Enable autonomous tool execution via the Responses API:
providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: web_search
        - type: x_search
        - type: code_interpreter
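To see the Agent Tools setup in a complete evaluation, the provider block can be paired with a prompt and a test, as in this minimal sketch. The prompt text, the test expression, and the assertion are assumptions made for illustration, not part of the shipped example. Only the provider ID and the tool type come from this README.

```yaml
# Hypothetical sketch: prompt, test values, and assertion are assumptions
prompts:
  - 'Use the code interpreter to compute {{expression}} and report the exact result.'

providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: code_interpreter

tests:
  - vars:
      expression: '12345 * 6789'
    assert:
      # 12345 * 6789 = 83810205; strip thousands separators before matching
      - type: javascript
        value: "String(output).replace(/,/g, '').includes('83810205')"
```

The assertion removes commas first because a model may format the result as 83,810,205 instead of 83810205.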
Enable real-time search via the Responses API:
providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: web_search
        - type: x_search
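A search configuration with an assertion, in the spirit of the "search with assertions" example listed earlier, might be sketched as follows. The prompt, the topic, and the keyword check are hypothetical and are not copied from promptfooconfig.promptfoo-search.yaml.

```yaml
# Hypothetical sketch: prompt, topic, and assertion are assumptions
prompts:
  - 'Search the web for {{topic}} and summarize what you find in two sentences.'

providers:
  - id: xai:responses:grok-4.3
    config:
      tools:
        - type: web_search
        - type: x_search

tests:
  - vars:
      topic: the promptfoo open-source LLM evaluation tool
    assert:
      # Case-insensitive keyword check; search output varies between runs
      - type: icontains
        value: promptfoo
```

Because live search results change over time, a loose keyword assertion like this tends to be more stable than an exact-match check.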