The DeepInfra provider gives access to state-of-the-art models through the DeepInfra API, including Llama 3, Mixtral, Qwen, and many other popular open-source models.
The DeepInfra provider is available via the @ai-sdk/deepinfra module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
  <Tab>
    <Snippet text="pnpm add @ai-sdk/deepinfra" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install @ai-sdk/deepinfra" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add @ai-sdk/deepinfra" dark />
  </Tab>
  <Tab>
    <Snippet text="bun add @ai-sdk/deepinfra" dark />
  </Tab>
</Tabs>

You can import the default provider instance deepinfra from @ai-sdk/deepinfra:
import { deepinfra } from '@ai-sdk/deepinfra';
If you need a customized setup, you can import createDeepInfra from @ai-sdk/deepinfra and create a provider instance with your settings:
import { createDeepInfra } from '@ai-sdk/deepinfra';
const deepinfra = createDeepInfra({
apiKey: process.env.DEEPINFRA_API_KEY ?? '',
});
You can use the following optional settings to customize the DeepInfra provider instance:
- **baseURL** _string_

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is https://api.deepinfra.com/v1.

  Note: Language models and embeddings use OpenAI-compatible endpoints at {baseURL}/openai,
  while image models use {baseURL}/inference.

- **apiKey** _string_

  API key that is being sent using the Authorization header. It defaults to
  the DEEPINFRA_API_KEY environment variable.

- **headers** _Record&lt;string,string&gt;_

  Custom headers to include in the requests.

- **fetch** _(input: RequestInfo, init?: RequestInit) => Promise&lt;Response&gt;_

  Custom fetch implementation.
  Defaults to the global fetch function.
  You can use it as a middleware to intercept requests,
  or to provide a custom fetch implementation for e.g. testing.
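To make the `baseURL` note concrete, here is a minimal sketch of how the two endpoint families relate to a given prefix. The helper name `resolveEndpointPrefixes` and the trailing-slash handling are assumptions made for illustration; they are not part of `@ai-sdk/deepinfra`, which builds its request URLs internally.

```typescript
// Hypothetical helper (not part of @ai-sdk/deepinfra): shows which URL
// prefixes requests would use for a given baseURL, following the note above.
const DEFAULT_BASE_URL = 'https://api.deepinfra.com/v1';

interface EndpointPrefixes {
  /** Language models and embeddings (OpenAI-compatible endpoints). */
  openaiCompatible: string;
  /** Image models. */
  inference: string;
}

function resolveEndpointPrefixes(
  baseURL: string = DEFAULT_BASE_URL,
): EndpointPrefixes {
  // Assumption: strip trailing slashes so the joined paths contain a single '/'.
  const prefix = baseURL.replace(/\/+$/, '');
  return {
    openaiCompatible: `${prefix}/openai`,
    inference: `${prefix}/inference`,
  };
}

console.log(resolveEndpointPrefixes());
console.log(resolveEndpointPrefixes('https://proxy.example.com/deepinfra/'));
```

When routing through a proxy, both paths would need to be forwarded, since a single provider instance can send requests to either one.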
You can create language models using a provider instance. The first argument is the model ID, for example:
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateText } from 'ai';
const { text } = await generateText({
model: deepinfra('meta-llama/Meta-Llama-3.1-70B-Instruct'),
prompt: 'Write a vegetarian lasagna recipe for 4 people.',
});
DeepInfra language models can also be used in the streamText function (see AI SDK Core).
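As a rough sketch of consuming a streamed response: the result of `streamText` exposes a `textStream` that can be read with `for await`. The helper below, `collectText`, is a hypothetical name and accepts any `AsyncIterable<string>`, so it is demonstrated with a stand-in generator rather than a live DeepInfra call.

```typescript
// Hypothetical helper: passes each chunk to a callback as it arrives
// and resolves with the concatenated text once the stream ends.
async function collectText(
  textStream: AsyncIterable<string>,
  onChunk: (chunk: string) => void = () => {},
): Promise<string> {
  let fullText = '';
  for await (const chunk of textStream) {
    onChunk(chunk);
    fullText += chunk;
  }
  return fullText;
}

// Stand-in for a model stream so the sketch runs without an API key.
async function* fakeTextStream(): AsyncGenerator<string> {
  yield 'Layer the ';
  yield 'pasta sheets.';
}

collectText(fakeTextStream(), chunk => console.log('chunk:', chunk)).then(
  text => console.log('full text:', text),
);
```

With a real model, the `textStream` property of the `streamText` result would take the place of `fakeTextStream()`.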
| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
|---|---|---|---|---|
| meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8 | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| meta-llama/Llama-4-Scout-17B-16E-Instruct | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| meta-llama/Llama-3.3-70B-Instruct-Turbo | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| meta-llama/Llama-3.3-70B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| meta-llama/Meta-Llama-3.1-405B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| meta-llama/Meta-Llama-3.1-70B-Instruct-Turbo | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| meta-llama/Meta-Llama-3.1-70B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| meta-llama/Meta-Llama-3.1-8B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| meta-llama/Llama-3.2-11B-Vision-Instruct | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| meta-llama/Llama-3.2-90B-Vision-Instruct | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| deepseek-ai/DeepSeek-V3 | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| deepseek-ai/DeepSeek-R1 | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| deepseek-ai/DeepSeek-R1-Distill-Llama-70B | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| deepseek-ai/DeepSeek-R1-Turbo | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| nvidia/Llama-3.1-Nemotron-70B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Cross size={18} /> |
| Qwen/Qwen2-7B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| Qwen/Qwen2.5-72B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| Qwen/Qwen2.5-Coder-32B-Instruct | <Cross size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| Qwen/QwQ-32B-Preview | <Cross size={18} /> | <Check size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| google/codegemma-7b-it | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| google/gemma-2-9b-it | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
| microsoft/WizardLM-2-8x22B | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> | <Cross size={18} /> |
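The table can also be encoded as data when an application needs to pick a model by capability. This sketch copies four rows from the table above; the `capabilities` map and the `modelsSupporting` helper are illustrative names, not something the SDK provides.

```typescript
type Capability =
  | 'imageInput'
  | 'objectGeneration'
  | 'toolUsage'
  | 'toolStreaming';

// A few rows copied from the table above (not an exhaustive list).
const capabilities: Record<string, Record<Capability, boolean>> = {
  'meta-llama/Llama-4-Scout-17B-16E-Instruct': {
    imageInput: true, objectGeneration: false, toolUsage: false, toolStreaming: false,
  },
  'meta-llama/Llama-3.3-70B-Instruct': {
    imageInput: false, objectGeneration: true, toolUsage: true, toolStreaming: true,
  },
  'meta-llama/Llama-3.2-11B-Vision-Instruct': {
    imageInput: true, objectGeneration: true, toolUsage: false, toolStreaming: false,
  },
  'mistralai/Mixtral-8x7B-Instruct-v0.1': {
    imageInput: false, objectGeneration: true, toolUsage: true, toolStreaming: false,
  },
};

// Returns the IDs of all recorded models that have every required capability.
function modelsSupporting(...required: Capability[]): string[] {
  return Object.entries(capabilities)
    .filter(([, caps]) => required.every(capability => caps[capability]))
    .map(([modelId]) => modelId);
}

console.log(modelsSupporting('toolUsage', 'toolStreaming'));
console.log(modelsSupporting('imageInput', 'objectGeneration'));
```

Keeping the map next to the code that selects a model makes it easier to notice when a chosen model lacks a feature such as tool streaming.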
You can create DeepInfra image models using the .image() factory method.
For more on image generation with the AI SDK see generateImage().
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
const { image } = await generateImage({
model: deepinfra.image('stabilityai/sd3.5'),
prompt: 'A futuristic cityscape at sunset',
aspectRatio: '16:9',
});
You can pass model-specific parameters using the providerOptions.deepinfra field:
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
const { image } = await generateImage({
model: deepinfra.image('stabilityai/sd3.5'),
prompt: 'A futuristic cityscape at sunset',
aspectRatio: '16:9',
providerOptions: {
deepinfra: {
num_inference_steps: 30, // Control the number of denoising steps (1-50)
},
},
});
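Because the comment above gives 1-50 as the range for `num_inference_steps`, it can be worth checking the value before sending a request. The `deepinfraImageOptions` helper below is a hypothetical convenience, and clamping out-of-range values (rather than throwing) is an assumed design choice.

```typescript
// Range taken from the comment in the example above.
const MIN_STEPS = 1;
const MAX_STEPS = 50;

// Hypothetical helper: builds a providerOptions object with a step count
// rounded to an integer and clamped into the 1-50 range.
function deepinfraImageOptions(steps: number): {
  deepinfra: { num_inference_steps: number };
} {
  if (!Number.isFinite(steps)) {
    throw new RangeError(`num_inference_steps must be finite, got ${steps}`);
  }
  const clamped = Math.min(MAX_STEPS, Math.max(MIN_STEPS, Math.round(steps)));
  return { deepinfra: { num_inference_steps: clamped } };
}

console.log(deepinfraImageOptions(30));
console.log(deepinfraImageOptions(80));
```

The returned object has the same shape as the `providerOptions` value in the example above, so it could be passed to `generateImage` directly.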
DeepInfra supports image editing through models like Qwen/Qwen-Image-Edit. Pass input images via prompt.images to transform or edit existing images.
Transform an existing image using text prompts:
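import { readFileSync } from 'node:fs';
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
const imageBuffer = readFileSync('./input-image.png');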
const { images } = await generateImage({
model: deepinfra.image('Qwen/Qwen-Image-Edit'),
prompt: {
text: 'Turn the cat into a golden retriever dog',
images: [imageBuffer],
},
size: '1024x1024',
});
Edit specific parts of an image using a mask. Transparent areas in the mask indicate where the image should be edited:
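import { readFileSync } from 'node:fs';
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
const image = readFileSync('./input-image.png');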
const mask = readFileSync('./mask.png');
const { images } = await generateImage({
model: deepinfra.image('Qwen/Qwen-Image-Edit'),
prompt: {
text: 'A sunlit indoor lounge area with a pool containing a flamingo',
images: [image],
mask: mask,
},
});
Combine multiple reference images into a single output:
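import { readFileSync } from 'node:fs';
import { deepinfra } from '@ai-sdk/deepinfra';
import { generateImage } from 'ai';
const cat = readFileSync('./cat.png');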
const dog = readFileSync('./dog.png');
const { images } = await generateImage({
model: deepinfra.image('Qwen/Qwen-Image-Edit'),
prompt: {
text: 'Create a scene with both animals together, playing as friends',
images: [cat, dog],
},
});
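The three editing variants above differ only in the shape of `prompt`. As a sketch, a small builder can make that shape explicit. `buildEditPrompt` and the `EditPrompt` type are hypothetical names, and image data is typed as `Uint8Array` on the assumption that the buffers returned by `readFileSync` (which are `Uint8Array` instances) are what gets passed.

```typescript
// Hypothetical type mirroring the prompt objects used in the examples above.
interface EditPrompt {
  text: string;
  images: Uint8Array[];
  mask?: Uint8Array;
}

function buildEditPrompt(
  text: string,
  images: Uint8Array[],
  mask?: Uint8Array,
): EditPrompt {
  if (images.length === 0) {
    throw new Error('Image editing needs at least one input image.');
  }
  // Only include `mask` when one is given, so plain edits keep the simpler shape.
  return mask ? { text, images, mask } : { text, images };
}

// Stand-in bytes so the sketch runs without image files (not a valid PNG).
const placeholder = new Uint8Array([0x89, 0x50, 0x4e, 0x47]);

console.log(
  Object.keys(buildEditPrompt('Turn the cat into a golden retriever dog', [placeholder])),
);
console.log(
  Object.keys(buildEditPrompt('Add a flamingo to the pool', [placeholder], placeholder)),
);
```

Passing two or more entries in `images` corresponds to the multi-image case, and supplying `mask` corresponds to the masked edit.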
For models supporting aspect ratios, the following ratios are typically supported:
1:1 (default), 16:9, 1:9, 3:2, 2:3, 4:5, 5:4, 9:16, 9:21
For models supporting size parameters, dimensions must typically be:
| Model | Dimensions Specification | Notes |
|---|---|---|
| stabilityai/sd3.5 | Aspect Ratio | Premium quality base model, 8B parameters |
| black-forest-labs/FLUX-1.1-pro | Size | Latest state-of-the-art model with superior prompt following |
| black-forest-labs/FLUX-1-schnell | Size | Fast generation in 1-4 steps |
| black-forest-labs/FLUX-1-dev | Size | Optimized for anatomical accuracy |
| black-forest-labs/FLUX-pro | Size | Flagship Flux model |
| black-forest-labs/FLUX.1-Kontext-dev | Size | Image editing and transformation model |
| black-forest-labs/FLUX.1-Kontext-pro | Size | Professional image editing and transformation |
| stabilityai/sd3.5-medium | Aspect Ratio | Balanced 2.5B parameter model |
| stabilityai/sdxl-turbo | Aspect Ratio | Optimized for fast generation |
For more details and pricing information, see the DeepInfra text-to-image models page.
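Since each model takes either an aspect ratio or a size, a caller that switches between models may want to choose the argument from the table. The following sketch encodes three rows of it; `dimensionSpec` and `dimensionArgs` are hypothetical names, and the example dimensions are arbitrary.

```typescript
type DimensionSpec = 'aspectRatio' | 'size';

// Rows copied from the table above (not exhaustive).
const dimensionSpec: Record<string, DimensionSpec> = {
  'stabilityai/sd3.5': 'aspectRatio',
  'stabilityai/sdxl-turbo': 'aspectRatio',
  'black-forest-labs/FLUX-1-schnell': 'size',
};

interface Dimensions {
  aspectRatio: string; // e.g. '16:9'
  size: string; // e.g. '1024x1024'
}

// Returns only the dimension argument that the given model is listed as using.
function dimensionArgs(
  modelId: string,
  dims: Dimensions,
): { aspectRatio?: string; size?: string } {
  const spec = dimensionSpec[modelId];
  if (spec === undefined) {
    throw new Error(`No dimension spec recorded for ${modelId}`);
  }
  return spec === 'aspectRatio'
    ? { aspectRatio: dims.aspectRatio }
    : { size: dims.size };
}

const wanted: Dimensions = { aspectRatio: '16:9', size: '1024x1024' };
console.log(dimensionArgs('stabilityai/sd3.5', wanted));
console.log(dimensionArgs('black-forest-labs/FLUX-1-schnell', wanted));
```

The result could then be spread into the options passed to `generateImage` alongside `model` and `prompt`.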
You can create DeepInfra embedding models using the .embeddingModel() factory method.
For more on embedding models with the AI SDK see embed().
import { deepinfra } from '@ai-sdk/deepinfra';
import { embed } from 'ai';
const { embedding } = await embed({
model: deepinfra.embeddingModel('BAAI/bge-large-en-v1.5'),
value: 'sunny day at the beach',
});
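A common next step after `embed` is comparing two embeddings. The function below is a plain implementation of cosine similarity, written out to show the arithmetic. The vectors are made-up three-dimensional stand-ins, whereas `BAAI/bge-large-en-v1.5` returns 1024 numbers per embedding, and returning 0 for zero-length vectors is an assumed convention.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption: treat an all-zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up vectors standing in for real embeddings.
const beach = [0.9, 0.1, 0.0];
const seaside = [0.8, 0.2, 0.1];
const invoice = [0.0, 0.1, 0.9];

console.log('beach vs seaside:', cosineSimilarity(beach, seaside).toFixed(3));
console.log('beach vs invoice:', cosineSimilarity(beach, invoice).toFixed(3));
```

The `ai` package also exports a `cosineSimilarity` helper, which is likely the better choice in application code.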
| Model | Dimensions | Max Tokens |
|---|---|---|
| BAAI/bge-base-en-v1.5 | 768 | 512 |
| BAAI/bge-large-en-v1.5 | 1024 | 512 |
| BAAI/bge-m3 | 1024 | 8192 |
| intfloat/e5-base-v2 | 768 | 512 |
| intfloat/e5-large-v2 | 1024 | 512 |
| intfloat/multilingual-e5-large | 1024 | 512 |
| sentence-transformers/all-MiniLM-L12-v2 | 384 | 256 |
| sentence-transformers/all-MiniLM-L6-v2 | 384 | 256 |
| sentence-transformers/all-mpnet-base-v2 | 768 | 384 |
| sentence-transformers/clip-ViT-B-32 | 512 | 77 |
| sentence-transformers/clip-ViT-B-32-multilingual-v1 | 512 | 77 |
| sentence-transformers/multi-qa-mpnet-base-dot-v1 | 768 | 512 |
| sentence-transformers/paraphrase-MiniLM-L6-v2 | 384 | 128 |
| shibing624/text2vec-base-chinese | 768 | 512 |
| thenlper/gte-base | 768 | 512 |
| thenlper/gte-large | 1024 | 512 |
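The dimensions column is useful for sanity checks, for instance before writing vectors to a store with a fixed-width column. This sketch copies three rows of the table; `expectedDimensions` and `assertDimensions` are hypothetical names.

```typescript
// Rows copied from the table above (not exhaustive).
const expectedDimensions: Record<string, number> = {
  'BAAI/bge-base-en-v1.5': 768,
  'BAAI/bge-large-en-v1.5': 1024,
  'sentence-transformers/all-MiniLM-L6-v2': 384,
};

// Throws if the embedding length differs from the dimension listed for the model.
function assertDimensions(modelId: string, embedding: number[]): void {
  const expected = expectedDimensions[modelId];
  if (expected === undefined) {
    throw new Error(`Unknown embedding model: ${modelId}`);
  }
  if (embedding.length !== expected) {
    throw new Error(
      `${modelId} should produce ${expected} dimensions, got ${embedding.length}`,
    );
  }
}

// Zero-filled stand-in for a real embedding.
assertDimensions('BAAI/bge-large-en-v1.5', new Array<number>(1024).fill(0));
console.log('dimension check passed');
```

A check like this can catch a mismatch early when the embedding model is changed but the storage schema is not.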