# AI/ML API
AI/ML API provides access to 300+ AI models through a unified OpenAI-compatible interface, including state-of-the-art models from OpenAI, Anthropic, Google, Meta, and more.
AI/ML API's endpoints are compatible with OpenAI's API, which means all parameters available in the OpenAI provider work with AI/ML API.
To use AI/ML API, set the `AIML_API_KEY` environment variable or specify `apiKey` in the provider configuration.
Example of setting the environment variable:

```sh
export AIML_API_KEY=your_api_key_here
```
Get your API key at aimlapi.com.
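Because the endpoints are OpenAI-compatible, any OpenAI-style client can talk to them directly. The sketch below builds (but does not send) a chat completion request, assuming the base URL is `https://api.aimlapi.com/v1` — verify this against the AI/ML API documentation before relying on it:

```python
import json
import urllib.request

# Assumed base URL for the OpenAI-compatible endpoint; confirm in the
# AI/ML API docs before relying on it.
BASE_URL = "https://api.aimlapi.com/v1"

def build_chat_request(api_key: str, model: str, user_message: str) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("your_api_key_here", "deepseek-r1", "Hello!")
print(req.full_url)  # https://api.aimlapi.com/v1/chat/completions
```

promptfoo constructs these requests for you; the sketch only illustrates the wire format the provider expects.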
The provider supports three endpoint types:

- `aimlapi:chat:<model_name>`
- `aimlapi:completion:<model_name>`
- `aimlapi:embedding:<model_name>`

You can omit the type to default to chat mode:

- `aimlapi:<model_name>`
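To make the ID format concrete, here is a hypothetical parser (illustrative only, not promptfoo code) that splits a provider ID into its endpoint type and model name, defaulting to chat when the type is omitted:

```python
def parse_provider_id(provider_id: str):
    """Split an aimlapi provider ID into (type, model), defaulting to chat.

    Illustrative helper, not part of promptfoo's API.
    """
    parts = provider_id.split(":", 2)
    if parts[0] != "aimlapi":
        raise ValueError(f"not an aimlapi provider ID: {provider_id}")
    if len(parts) == 3 and parts[1] in ("chat", "completion", "embedding"):
        return parts[1], parts[2]
    # Type omitted: everything after the prefix is the model name.
    return "chat", provider_id.split(":", 1)[1]

print(parse_provider_id("aimlapi:chat:deepseek-r1"))  # ('chat', 'deepseek-r1')
print(parse_provider_id("aimlapi:openai/gpt-5"))      # ('chat', 'openai/gpt-5')
```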
Configure the provider in your promptfoo configuration file:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: aimlapi:chat:deepseek-r1
    config:
      temperature: 0.7
      max_tokens: 2000
      apiKey: ... # optional, overrides the environment variable
```
All standard OpenAI parameters are supported:

| Parameter | Description |
| --- | --- |
| `apiKey` | Your AI/ML API key |
| `temperature` | Controls randomness (0.0 to 2.0) |
| `max_tokens` | Maximum number of tokens to generate |
| `top_p` | Nucleus sampling parameter |
| `frequency_penalty` | Penalizes frequent tokens |
| `presence_penalty` | Penalizes new tokens based on presence |
| `stop` | Sequences where the API will stop generating |
| `stream` | Enable streaming responses |
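The parameters above end up in the OpenAI-style request body (except `apiKey`, which is sent as a Bearer token header). The builder below is a hypothetical sketch of that mapping — promptfoo does this internally:

```python
# Hypothetical sketch of how provider config maps onto the OpenAI-style
# request body; promptfoo handles this internally.
SUPPORTED_PARAMS = ("temperature", "max_tokens", "top_p", "frequency_penalty",
                    "presence_penalty", "stop", "stream")

def build_body(model: str, messages: list, config: dict) -> dict:
    body = {"model": model, "messages": messages}
    # apiKey goes into the Authorization header, never into the body.
    for key in SUPPORTED_PARAMS:
        if key in config:
            body[key] = config[key]
    return body

body = build_body("deepseek-r1",
                  [{"role": "user", "content": "hi"}],
                  {"temperature": 0.1, "max_tokens": 4000, "apiKey": "secret"})
print(sorted(body))  # ['max_tokens', 'messages', 'model', 'temperature']
```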
AI/ML API offers models from multiple providers. Here are some of the most popular models by category:

**Reasoning models**

- `deepseek-r1` - Advanced reasoning with chain-of-thought capabilities
- `openai/o3-mini` - Efficient reasoning model
- `openai/o4-mini` - Latest compact reasoning model
- `qwen/qwq-32b` - Alibaba's reasoning model

**Flagship chat models**

- `openai/gpt-5` - Latest GPT with 1M token context
- `gpt-5-mini` - 83% cheaper than GPT-4o with comparable performance
- `anthropic/claude-4-sonnet` - Balanced speed and capability
- `anthropic/claude-4-opus` - Claude 4 Opus model
- `google/gemini-2.5-pro-preview` - Google's versatile multimodal model
- `google/gemini-2.5-flash` - Ultra-fast streaming responses
- `x-ai/grok-3-beta` - xAI's most advanced model

**Open-source and specialized models**

- `deepseek-v3` - Powerful open-source alternative
- `meta-llama/llama-4-maverick` - Latest Llama model
- `qwen/qwen-max-2025-01-25` - Alibaba's efficient MoE model
- `mistral/codestral-2501` - Specialized for coding

**Embedding models**

- `text-embedding-3-large` - OpenAI's latest embedding model
- `voyage-large-2` - High-quality embeddings
- `bge-m3` - Multilingual embeddings

For a complete list of all 300+ available models, visit the AI/ML API Models page.
A basic comparison across models:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - aimlapi:chat:deepseek-r1
  - aimlapi:chat:gpt-5-mini
  - aimlapi:chat:claude-4-sonnet

prompts:
  - 'Explain {{concept}} in simple terms'

tests:
  - vars:
      concept: 'quantum computing'
    assert:
      - type: contains
        value: 'qubit'
```
Comparing models with per-provider configuration and labels:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  # Reasoning model with low temperature
  - id: aimlapi:chat:deepseek-r1
    label: 'DeepSeek R1 (Reasoning)'
    config:
      temperature: 0.1
      max_tokens: 4000

  # General purpose model
  - id: aimlapi:chat:openai/gpt-5
    label: 'GPT-5'
    config:
      temperature: 0.7
      max_tokens: 2000

  # Fast, cost-effective model
  - id: aimlapi:chat:google/gemini-2.5-flash
    label: 'Gemini 2.5 Flash'
    config:
      temperature: 0.5
      stream: true

prompts:
  - file://prompts/coding_task.txt

tests:
  - vars:
      task: 'implement a binary search tree in Python'
    assert:
      - type: python
        value: |
          # Verify the output is valid Python
          import ast
          try:
              ast.parse(output)
              return True
          except SyntaxError:
              return False
      - type: llm-rubric
        value: 'The code should include insert, search, and delete methods'
```
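The `python` assertion above relies on `ast.parse` raising `SyntaxError` on invalid source. Here is the same check as a standalone function you can try locally:

```python
import ast

def is_valid_python(source: str) -> bool:
    """Return True if the source parses as Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

print(is_valid_python("def insert(self, key): pass"))  # True
print(is_valid_python("def insert(self, key"))         # False
```

Note that this only checks syntax; the `llm-rubric` assertion covers whether the required methods are actually present.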
Embedding model configuration:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: aimlapi:embedding:text-embedding-3-large
    config:
      dimensions: 3072 # Optional: reduce embedding dimensions

prompts:
  - '{{text}}'

tests:
  - vars:
      text: 'The quick brown fox jumps over the lazy dog'
    assert:
      - type: is-valid-embedding
      - type: embedding-dimension
        value: 3072
```
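Outside promptfoo, embeddings like these are typically compared with cosine similarity. A minimal sketch with toy vectors (real `text-embedding-3-large` vectors would have 3072 dimensions):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Toy 4-dimensional vectors standing in for real 3072-dim embeddings.
a = [0.1, 0.3, 0.5, 0.7]
print(round(cosine_similarity(a, a), 6))  # 1.0
```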
Requesting structured JSON output:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: aimlapi:chat:gpt-5
    config:
      response_format: { type: 'json_object' }
      temperature: 0.0

prompts:
  - |
    Extract the following information from the text and return as JSON:
    - name
    - age
    - occupation

    Text: {{text}}

tests:
  - vars:
      text: 'John Smith is a 35-year-old software engineer'
    assert:
      - type: is-json
      - type: javascript
        value: |
          const data = JSON.parse(output);
          return data.name === 'John Smith' &&
            data.age === 35 &&
            data.occupation === 'software engineer';
```
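The same check as the JavaScript assertion, sketched in plain Python for debugging extractions locally (the sample string below is illustrative, not a real model response):

```python
import json

def validate_extraction(output: str) -> bool:
    """Mirror of the JavaScript assertion above, for local testing."""
    data = json.loads(output)
    return (data.get("name") == "John Smith"
            and data.get("age") == 35
            and data.get("occupation") == "software engineer")

# Illustrative sample of what the model is expected to return.
sample = '{"name": "John Smith", "age": 35, "occupation": "software engineer"}'
print(validate_extraction(sample))  # True
```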
Test your setup with working examples:

```sh
npx promptfoo@latest init --example provider-aiml-api
```
This includes tested configurations for comparing multiple models, evaluating reasoning capabilities, and measuring response quality.
For detailed pricing information, visit aimlapi.com/pricing.