Fal AI provides a generative media platform for developers with lightning-fast inference capabilities. Their platform offers optimized performance for running diffusion models, with speeds up to 4x faster than alternatives.
The Fal provider is available via the `@ai-sdk/fal` module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
  <Tab>
    <Snippet text="pnpm add @ai-sdk/fal" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install @ai-sdk/fal" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add @ai-sdk/fal" dark />
  </Tab>
  <Tab>
    <Snippet text="bun add @ai-sdk/fal" dark />
  </Tab>
</Tabs>

You can import the default provider instance `fal` from `@ai-sdk/fal`:
```ts
import { fal } from '@ai-sdk/fal';
```
If you need a customized setup, you can import createFal and create a provider instance with your settings:
```ts
import { createFal } from '@ai-sdk/fal';

const fal = createFal({
  // optional, defaults to the FAL_API_KEY environment variable,
  // falling back to FAL_KEY
  apiKey: 'your-api-key',
  baseURL: 'custom-url', // optional
  headers: {
    /* custom headers */
  }, // optional
});
```
You can use the following optional settings to customize the Fal provider instance:
- **baseURL** `string`

  Use a different URL prefix for API calls, e.g. to use proxy servers.
  The default prefix is `https://fal.run`.

- **apiKey** `string`

  API key that is sent using the `Authorization` header.
  It defaults to the `FAL_API_KEY` environment variable, falling back to `FAL_KEY`.

- **headers** `Record<string,string>`

  Custom headers to include in the requests.

- **fetch** `(input: RequestInfo, init?: RequestInit) => Promise<Response>`

  Custom fetch implementation. You can use it as a middleware to intercept requests, or to provide a custom fetch implementation for e.g. testing.
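As a sketch of the `fetch` option, the wrapper below logs each request before delegating to the global `fetch`. The wrapper and its name `loggingFetch` are illustrative, not part of `@ai-sdk/fal`:

```typescript
// Minimal logging wrapper matching the provider's `fetch` option signature.
// `loggingFetch` is an illustrative name, not an SDK export.
const loggingFetch = async (
  input: RequestInfo,
  init?: RequestInit,
): Promise<Response> => {
  console.log('Fal request:', input.toString());
  const response = await fetch(input, init);
  console.log('Fal response status:', response.status);
  return response;
};

// Pass it when creating the provider:
// const fal = createFal({ fetch: loggingFetch });
```

The same pattern works for injecting a mock `fetch` in tests, since the provider calls whatever implementation you supply.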
You can create Fal image models using the `.image()` factory method.
For more on image generation with the AI SDK, see `generateImage()`.
```ts
import { fal } from '@ai-sdk/fal';
import { generateImage } from 'ai';
import fs from 'fs';

const { image, providerMetadata } = await generateImage({
  model: fal.image('fal-ai/flux/dev'),
  prompt: 'A serene mountain landscape at sunset',
});

const filename = `image-${Date.now()}.png`;
fs.writeFileSync(filename, image.uint8Array);
console.log(`Image saved to ${filename}`);
```
Fal image models may return additional information for the images and the request. Here are some examples of properties that may be set for each image:

```ts
providerMetadata.fal.images[0].nsfw; // boolean, image is flagged as not safe for work
providerMetadata.fal.images[0].width; // number, image width
providerMetadata.fal.images[0].height; // number, image height
providerMetadata.fal.images[0].contentType; // string, MIME type of the image
```
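Since these fields are optional per image, a small helper can make the metadata easier to consume. The `FalImageMeta` type and `safeImageIndices` helper below are illustrative names, not exports of `@ai-sdk/fal`:

```typescript
// Assumed shape of the per-image metadata fields shown above.
type FalImageMeta = {
  nsfw?: boolean;
  width?: number;
  height?: number;
  contentType?: string;
};

// Hypothetical helper: indices of images that are not flagged NSFW.
function safeImageIndices(images: FalImageMeta[]): number[] {
  return images.map((img, i) => (img.nsfw ? -1 : i)).filter(i => i !== -1);
}

// e.g. safeImageIndices([{ nsfw: false }, { nsfw: true }]) returns [0]
```

You would then index into the generated images with the returned positions before saving or displaying them.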
Fal offers many models optimized for different use cases. Here are a few popular examples. For a full list of models, see the Fal AI Search Page.
| Model                                          | Description                                                                                                                                        |
| ---------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
| `fal-ai/flux/dev`                              | FLUX.1 [dev] model for high-quality image generation                                                                                                |
| `fal-ai/flux-pro/kontext`                      | FLUX.1 Kontext [pro] handles both text and reference images as inputs, enabling targeted edits and complex transformations                          |
| `fal-ai/flux-pro/kontext/max`                  | FLUX.1 Kontext [max] with improved prompt adherence and typography generation                                                                       |
| `fal-ai/flux-lora`                             | Super fast endpoint for FLUX.1 with LoRA support                                                                                                    |
| `fal-ai/ideogram/character`                    | Generates consistent character appearances across multiple images, maintaining facial features, proportions, and distinctive traits                 |
| `fal-ai/qwen-image`                            | Qwen-Image foundation model with significant advances in complex text rendering and precise image editing                                           |
| `fal-ai/omnigen-v2`                            | Unified image generation model for image editing, personalized image generation, virtual try-on, multi-person generation, and more                  |
| `fal-ai/bytedance/dreamina/v3.1/text-to-image` | Dreamina showcases superior picture effects with improvements in aesthetics, precise and diverse styles, and rich details                            |
| `fal-ai/recraft/v3/text-to-image`              | SOTA in image generation with vector art and brand style capabilities                                                                               |
| `fal-ai/wan/v2.2-a14b/text-to-image`           | High-resolution, photorealistic images with fine-grained detail                                                                                     |
Supported aspect ratios and key feature sets vary by model; refer to the Fal AI documentation for the options each model accepts.
Transform existing images using text prompts:

```ts
await generateImage({
  model: fal.image('fal-ai/flux-pro/kontext/max'),
  prompt: {
    text: 'Put a donut next to the flour.',
    images: [
      'https://v3.fal.media/files/rabbit/rmgBxhwGYb2d3pl3x9sKf_output.png',
    ],
  },
});
```
Images can also be passed as a base64-encoded string, a `Uint8Array`, an `ArrayBuffer`, or a `Buffer`.
A mask can be passed as well:

```ts
await generateImage({
  model: fal.image('fal-ai/flux-pro/kontext/max'),
  prompt: {
    text: 'Put a donut next to the flour.',
    images: [imageBuffer],
    mask: maskBuffer,
  },
});
```
Fal image models support flexible provider options through the providerOptions.fal object. You can pass any parameters supported by the specific Fal model's API. Common options include:
- `image_url` (use `prompt.images` instead)
- `image_urls` array for models that support multiple images (e.g., `fal-ai/flux-2/edit`)

Refer to the Fal AI model documentation for model-specific parameters.
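For example, a pass-through options object might look like the sketch below. The parameter names `num_inference_steps` and `seed` are assumptions based on common Fal model APIs, so check the specific model's documentation before relying on them:

```typescript
// Parameter names here are illustrative; Fal forwards whatever you put
// under providerOptions.fal to the model's API unchanged.
const providerOptions = {
  fal: {
    num_inference_steps: 28, // assumed parameter name; model-specific
    seed: 42, // assumed parameter name; model-specific
  },
};

// Then pass it along with the call:
// await generateImage({
//   model: fal.image('fal-ai/flux/dev'),
//   prompt: 'A serene mountain landscape at sunset',
//   providerOptions,
// });
```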
Fal's platform offers several advanced capabilities beyond what is covered here; for more details, visit the Fal AI documentation.
You can create models that call the Fal transcription API using the `.transcription()` factory method.

The first argument is the model id without the `fal-ai/` prefix, e.g. `wizper`:

```ts
const model = fal.transcription('wizper');
```
You can also pass additional provider-specific options using the `providerOptions` argument. For example, supplying the `batchSize` option increases the number of audio chunks processed in parallel.
```ts
import { experimental_transcribe as transcribe } from 'ai';
import { fal, type FalTranscriptionModelOptions } from '@ai-sdk/fal';
import { readFile } from 'fs/promises';

const result = await transcribe({
  model: fal.transcription('wizper'),
  audio: await readFile('audio.mp3'),
  providerOptions: {
    fal: { batchSize: 10 } satisfies FalTranscriptionModelOptions,
  },
});
```
The following provider options are available:
- **language** `string`: Language of the audio file. Defaults to `'en'`. If set to `null`, the language is detected automatically. Accepts ISO language codes like `'en'`, `'fr'`, `'zh'`, etc. Optional.
- **diarize** `boolean`: Whether to diarize the audio file (identify different speakers). Defaults to `true`. Optional.
- **chunkLevel** `string`: Level of the chunks to return. Either `'segment'` or `'word'`. Defaults to `'segment'`. Optional.
- **version** `string`: Version of the model to use. All models are Whisper large variants. Defaults to `'3'`. Optional.
- **batchSize** `number`: Batch size for processing. Defaults to `64`. Optional.
- **numSpeakers** `number`: Number of speakers in the audio file. If not provided, the number of speakers is detected automatically. Optional.
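Putting several of the options above together, here is a sketch of a transcription `providerOptions` object; the option names come from the list above, while the values are illustrative:

```typescript
// Option names come from the provider options list; values are examples.
const transcriptionOptions = {
  fal: {
    language: 'fr', // ISO code; null would auto-detect
    diarize: true, // label different speakers
    chunkLevel: 'word', // 'segment' (default) or 'word'
    numSpeakers: 2, // skip auto-detection of speaker count
  },
};

// Then pass it to the transcription call:
// await transcribe({
//   model: fal.transcription('wizper'),
//   audio: await readFile('audio.mp3'),
//   providerOptions: transcriptionOptions,
// });
```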
| Model     | Transcription       | Duration            | Segments            | Language            |
| --------- | ------------------- | ------------------- | ------------------- | ------------------- |
| `whisper` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
| `wizper`  | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
You can create models that call Fal text-to-speech endpoints using the `.speech()` factory method.
```ts
import { experimental_generateSpeech as generateSpeech } from 'ai';
import { fal } from '@ai-sdk/fal';

const result = await generateSpeech({
  model: fal.speech('fal-ai/minimax/speech-02-hd'),
  text: 'Hello from the AI SDK!',
});
```
| Model                                     | Description                                                                                                                                                      |
| ----------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `fal-ai/minimax/voice-clone`              | Clone a voice from a sample audio and generate speech from text prompts                                                                                            |
| `fal-ai/minimax/voice-design`             | Design a personalized voice from a text description and generate speech from text prompts                                                                          |
| `fal-ai/dia-tts/voice-clone`              | Clone dialog voices from a sample audio and generate dialogs from text prompts                                                                                     |
| `fal-ai/minimax/speech-02-hd`             | Generate speech from text prompts with different voices                                                                                                            |
| `fal-ai/minimax/speech-02-turbo`          | Generate fast speech from text prompts with different voices                                                                                                       |
| `fal-ai/dia-tts`                          | Directly generates realistic dialogue from transcripts, with audio conditioning for emotion control; produces natural nonverbals like laughter and throat clearing |
| `resemble-ai/chatterboxhd/text-to-speech` | Generate expressive, natural speech with Resemble AI's Chatterbox; features unique emotion control, instant voice cloning from short audio, and built-in watermarking |
Pass provider-specific options via `providerOptions.fal` depending on the model:

- `voice_setting` object
  - `voice_id` (string): predefined voice ID
  - `speed` (number): 0.5–2.0
  - `vol` (number): 0–10
  - `pitch` (number): -12–12
  - `emotion` (enum): `happy` | `sad` | `angry` | `fearful` | `disgusted` | `surprised` | `neutral`
  - `english_normalization` (boolean)
- `audio_setting` object: audio configuration settings specific to the model
- `language_boost` enum: `Chinese` | `Chinese,Yue` | `English` | `Arabic` | `Russian` | `Spanish` | `French` | `Portuguese` | `German` | `Turkish` | `Dutch` | `Ukrainian` | `Vietnamese` | `Indonesian` | `Japanese` | `Italian` | `Korean` | `Thai` | `Polish` | `Romanian` | `Greek` | `Czech` | `Finnish` | `Hindi` | `auto`
- `pronunciation_dict` object: custom pronunciation dictionary for specific words
Model-specific parameters (e.g., audio_url, prompt, preview_text, ref_audio_url, ref_text) can be passed directly under providerOptions.fal and will be forwarded to the Fal API.
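As an illustration of the `voice_setting` options above, here is a sketch of a speech call's provider options. The option names come from the list above; the values are examples only, and the omitted `voice_id` stays a placeholder since valid voice IDs are model-specific:

```typescript
// Option names come from the list above; values are examples only.
const speechOptions = {
  fal: {
    voice_setting: {
      speed: 1.1, // within the 0.5–2.0 range
      pitch: 0, // within the -12–12 range
      emotion: 'happy',
    },
    language_boost: 'auto',
  },
};

// Then pass it to the speech call:
// await generateSpeech({
//   model: fal.speech('fal-ai/minimax/speech-02-hd'),
//   text: 'Hello from the AI SDK!',
//   providerOptions: speechOptions,
// });
```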