Together AI provides access to open-source models through an API compatible with OpenAI's interface. Because the APIs are compatible, all parameters available in the OpenAI provider also work with Together AI.
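Compatibility means a standard OpenAI-style chat completion payload works unchanged against Together AI's endpoint. A minimal sketch using only the Python standard library (the request is constructed but not sent; the base URL is Together AI's public API endpoint):

```python
import json
import urllib.request

# OpenAI-style chat completion request aimed at Together AI's
# OpenAI-compatible route. Nothing is sent over the network here.
req = urllib.request.Request(
    "https://api.together.xyz/v1/chat/completions",
    data=json.dumps(
        {
            "model": "meta-llama/Llama-4-Scout-Instruct",
            "messages": [{"role": "user", "content": "Hello"}],
            "temperature": 0.7,
        }
    ).encode(),
    headers={
        "Authorization": "Bearer $TOGETHER_API_KEY",  # substitute a real key
        "Content-Type": "application/json",
    },
)
print(req.full_url)
```

Swapping the base URL is the only change relative to calling OpenAI directly; the payload shape, including `messages` and sampling parameters, is identical.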
Configure a Together AI model in your promptfoo configuration:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: togetherai:meta-llama/Llama-4-Scout-Instruct
    config:
      temperature: 0.7
```
The provider requires an API key stored in the `TOGETHER_API_KEY` environment variable.
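For example, in a POSIX shell:

```sh
# Substitute your real Together AI API key
export TOGETHER_API_KEY=your-api-key-here
```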
Any OpenAI-compatible generation parameter can be passed in `config`, for example a token limit:

```yaml
config:
  max_tokens: 4096
```
Together AI also supports OpenAI-style tool calling:

```yaml
config:
  tools:
    - type: function
      function:
        name: get_weather
        description: Get the current weather
        parameters:
          type: object
          properties:
            location:
              type: string
              description: City and state
```
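When the model elects to call a tool, the response follows the OpenAI `tool_calls` shape. A sketch of parsing such a response (the response body below is a hand-written illustration, not captured API output):

```python
import json

# Hand-written illustration of an OpenAI-style tool_calls response body;
# not real API output.
response_body = """
{
  "choices": [{
    "message": {
      "role": "assistant",
      "content": null,
      "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {
          "name": "get_weather",
          "arguments": "{\\"location\\": \\"Austin, TX\\"}"
        }
      }]
    }
  }]
}
"""
call = json.loads(response_body)["choices"][0]["message"]["tool_calls"][0]
# The arguments field arrives as a JSON-encoded string, not an object
args = json.loads(call["function"]["arguments"])
print(call["function"]["name"], args["location"])
```

Note the double decode: the outer response is JSON, and `function.arguments` is itself a JSON string that must be parsed separately.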
To request structured JSON output, set `response_format`:

```yaml
config:
  response_format: { type: 'json_object' }
```
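JSON mode pairs naturally with promptfoo's built-in `is-json` assertion, which checks that the output parses as valid JSON. A sketch (the prompt and test variable below are illustrative):

```yaml
prompts:
  - 'Return a JSON object describing {{city}}'
providers:
  - id: togetherai:deepseek-ai/DeepSeek-R1
    config:
      response_format: { type: 'json_object' }
tests:
  - vars:
      city: Paris
    assert:
      - type: is-json
```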
Together AI offers over 200 models. Here are some of the most popular models by category:
**Llama 4**

- `meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8` (524,288 context length, FP8)
- `meta-llama/Llama-4-Scout-17B-16E-Instruct` (327,680 context length, FP16)

**DeepSeek**

- `deepseek-ai/DeepSeek-R1` (128,000 context length, FP8)
- `deepseek-ai/DeepSeek-R1-Distill-Llama-70B` (131,072 context length, FP16)
- `deepseek-ai/DeepSeek-R1-Distill-Qwen-14B` (131,072 context length, FP16)
- `deepseek-ai/DeepSeek-V3` (16,384 context length, FP8)

**Llama 3**

- `meta-llama/Llama-3.3-70B-Instruct-Turbo` (131,072 context length, FP8)
- `meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo` (131,072 context length, FP8)
- `meta-llama/Meta-Llama-3.1-405B-Instruct-Turbo` (130,815 context length, FP8)
- `meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo` (131,072 context length, FP8)
- `meta-llama/Llama-3.2-3B-Instruct-Turbo` (131,072 context length, FP16)

**Mistral**

- `mistralai/Mixtral-8x7B-Instruct-v0.1` (32,768 context length, FP16)
- `mistralai/Mixtral-8x22B-Instruct-v0.1` (65,536 context length, FP16)
- `mistralai/Mistral-Small-24B-Instruct-2501` (32,768 context length, FP16)

**Qwen**

- `Qwen/Qwen2.5-72B-Instruct-Turbo` (32,768 context length, FP8)
- `Qwen/Qwen2.5-7B-Instruct-Turbo` (32,768 context length, FP8)
- `Qwen/Qwen2.5-Coder-32B-Instruct` (32,768 context length, FP16)
- `Qwen/QwQ-32B` (32,768 context length, FP16)

**Vision**

- `meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo` (131,072 context length, FP16)
- `Qwen/Qwen2.5-VL-72B-Instruct` (32,768 context length, FP8)
- `Qwen/Qwen2-VL-72B-Instruct` (32,768 context length, FP16)

Together AI offers free tiers with reduced rate limits:
- `meta-llama/Llama-3.3-70B-Instruct-Turbo-Free`
- `meta-llama/Llama-Vision-Free`
- `deepseek-ai/DeepSeek-R1-Distill-Llama-70B-Free`

For a complete list of all 200+ available models and their specifications, refer to the Together AI Models page.
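The free endpoints use the same provider syntax as the paid models; for instance:

```yaml
providers:
  - id: togetherai:meta-llama/Llama-3.3-70B-Instruct-Turbo-Free
```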
A complete example combining multiple models and features:

```yaml
# yaml-language-server: $schema=https://promptfoo.dev/config-schema.json
providers:
  - id: togetherai:meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8
    config:
      temperature: 0.7
      max_tokens: 4096
  - id: togetherai:deepseek-ai/DeepSeek-R1
    config:
      temperature: 0.0
      response_format: { type: 'json_object' }
      tools:
        - type: function
          function:
            name: get_weather
            description: Get weather information
            parameters:
              type: object
              properties:
                location: { type: 'string' }
                unit: { type: 'string', enum: ['celsius', 'fahrenheit'] }
```
For more information, refer to the Together AI documentation.