The Gladia provider contains transcription model support for the Gladia transcription API.
The Gladia provider is available in the @ai-sdk/gladia module. You can install it with:
<Tabs items={['pnpm', 'npm', 'yarn', 'bun']}>
  <Tab>
    <Snippet text="pnpm add @ai-sdk/gladia" dark />
  </Tab>
  <Tab>
    <Snippet text="npm install @ai-sdk/gladia" dark />
  </Tab>
  <Tab>
    <Snippet text="yarn add @ai-sdk/gladia" dark />
  </Tab>
  <Tab>
    <Snippet text="bun add @ai-sdk/gladia" dark />
  </Tab>
</Tabs>

You can import the default provider instance gladia from @ai-sdk/gladia:
import { gladia } from '@ai-sdk/gladia';
If you need a customized setup, you can import createGladia from @ai-sdk/gladia and create a provider instance with your settings:
import { createGladia } from '@ai-sdk/gladia';
const gladia = createGladia({
  // custom settings, e.g.
  fetch: customFetch,
});
You can use the following optional settings to customize the Gladia provider instance:
apiKey string
API key used to authenticate requests, sent using the x-gladia-key header.
It defaults to the GLADIA_API_KEY environment variable.
headers Record<string,string>
Custom headers to include in the requests.
fetch (input: RequestInfo, init?: RequestInit) => Promise<Response>
Custom fetch implementation.
Defaults to the global fetch function.
You can use it as a middleware to intercept requests,
or to provide a custom fetch implementation for e.g. testing.
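As a rough sketch of the middleware use described for the fetch setting, a custom fetch can wrap another fetch function and record each outgoing request before delegating. The helper name `withRequestLog`, the `requestLog` array, and the stub fetch are hypothetical and not part of @ai-sdk/gladia; only the `fetch` setting itself comes from the list above.

```typescript
// Hypothetical helper (not part of @ai-sdk/gladia): wraps a fetch-compatible
// function and records "<METHOD> <URL>" for every request before delegating.
type FetchLike = (
  input: RequestInfo | URL,
  init?: RequestInit,
) => Promise<Response>;

function withRequestLog(baseFetch: FetchLike, log: string[]): FetchLike {
  return (input, init) => {
    // Resolve the URL whether the caller passed a string, a URL, or a Request.
    const url =
      typeof input === 'string'
        ? input
        : input instanceof URL
          ? input.href
          : input.url;
    // Only init.method is inspected here; it falls back to GET when absent.
    log.push(`${init?.method ?? 'GET'} ${url}`);
    return baseFetch(input, init);
  };
}

// Stand-in base fetch so the sketch runs without network access.
const requestLog: string[] = [];
const stubFetch: FetchLike = async () => new Response('{}', { status: 200 });
const loggedFetch = withRequestLog(stubFetch, requestLog);
```

In an application, the wrapper would presumably be built around the global fetch and passed as the provider setting, for example `createGladia({ fetch: withRequestLog(fetch, requestLog) })`.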
You can create models that call the Gladia transcription API
using the .transcription() factory method.
const model = gladia.transcription();
You can also pass additional provider-specific options using the providerOptions argument. For example, setting the summarization option to true enables summarization of the transcription.
import { experimental_transcribe as transcribe } from 'ai';
import { gladia, type GladiaTranscriptionModelOptions } from '@ai-sdk/gladia';
import { readFile } from 'fs/promises';

const result = await transcribe({
  model: gladia.transcription(),
  audio: await readFile('audio.mp3'),
  providerOptions: {
    gladia: {
      summarization: true,
    } satisfies GladiaTranscriptionModelOptions,
  },
});
The following provider options are available:
contextPrompt string
Context passed to the transcription model to help improve accuracy. Optional.
customVocabulary boolean | any[]
Custom vocabulary to improve transcription accuracy. Optional.
customVocabularyConfig object
Configuration for custom vocabulary. Optional.
detectLanguage boolean
Whether to automatically detect the language. Optional.
enableCodeSwitching boolean
Enable code switching for multilingual audio. Optional.
codeSwitchingConfig object
Configuration for code switching. Optional.
language string
Specify the language of the audio. Optional.
callback boolean
Enable callback when transcription is complete. Optional.
callbackConfig object
Configuration for callback. Optional.
subtitles boolean
Generate subtitles from the transcription. Optional.
subtitlesConfig object
Configuration for subtitles. Optional.
diarization boolean
Enable speaker diarization. Optional.
diarizationConfig object
Configuration for diarization. Optional.
translation boolean
Enable translation of the transcription. Optional.
translationConfig object
Configuration for translation. Optional.
summarization boolean
Enable summarization of the transcription. Optional.
summarizationConfig object
Configuration for summarization. Optional.
moderation boolean
Enable content moderation. Optional.
namedEntityRecognition boolean
Enable named entity recognition. Optional.
chapterization boolean
Enable chapterization of the transcription. Optional.
nameConsistency boolean
Enable name consistency in the transcription. Optional.
customSpelling boolean
Enable custom spelling. Optional.
customSpellingConfig object
Configuration for custom spelling. Optional.
structuredDataExtraction boolean
Enable structured data extraction. Optional.
structuredDataExtractionConfig object
Configuration for structured data extraction. Optional.
sentimentAnalysis boolean
Enable sentiment analysis. Optional.
audioToLlm boolean
Enable audio to LLM processing. Optional.
audioToLlmConfig object
Configuration for audio to LLM. Optional.
customMetadata Record<string, any>
Custom metadata to include with the request. Optional.
sentences boolean
Enable sentence detection. Optional.
displayMode boolean
Enable display mode. Optional.
punctuationEnhanced boolean
Enable enhanced punctuation. Optional.
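To illustrate how several of these options might be combined, the sketch below builds an options object for either a known language or automatic detection. The `GladiaOptionsSubset` type and the `buildTranscriptionOptions` helper are hypothetical and only mirror a few of the names documented above; in real code the exported `GladiaTranscriptionModelOptions` type is the reference, and how the API treats `language` alongside `detectLanguage` is an assumption here.

```typescript
// Local, hypothetical mirror of a few options listed above; the package's own
// GladiaTranscriptionModelOptions type is the authoritative definition.
type GladiaOptionsSubset = {
  language?: string;
  detectLanguage?: boolean;
  enableCodeSwitching?: boolean;
  diarization?: boolean;
  sentences?: boolean;
  customMetadata?: Record<string, unknown>;
};

// Hypothetical helper: pin the language when it is known, otherwise ask for
// detection and code switching (assumed here to suit multilingual audio).
function buildTranscriptionOptions(
  knownLanguage?: string,
  metadata: Record<string, unknown> = {},
): GladiaOptionsSubset {
  const languageOptions: GladiaOptionsSubset = knownLanguage
    ? { language: knownLanguage }
    : { detectLanguage: true, enableCodeSwitching: true };

  return {
    ...languageOptions,
    diarization: true, // enable speaker diarization
    sentences: true, // enable sentence detection
    customMetadata: metadata,
  };
}

const englishOptions = buildTranscriptionOptions('en', { source: 'demo' });
const autoOptions = buildTranscriptionOptions();
```

Either object would then be supplied as `providerOptions: { gladia: englishOptions }` in the `transcribe` call.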
| Model | Transcription | Duration | Segments | Language |
|---|---|---|---|---|
| Default | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
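
The capabilities in the table correspond to fields on the transcription result. As a sketch, the helper below turns a list of segments into timestamped lines. The segment shape (`text`, `startSecond`, `endSecond`) is assumed to match what `transcribe` returns in `result.segments`, and `formatSegments` is a hypothetical helper, so the field names should be checked against the AI SDK version in use.

```typescript
// Assumed segment shape: text plus start and end offsets in seconds.
type Segment = { text: string; startSecond: number; endSecond: number };

// Hypothetical helper: produces one "[start - end] text" line per segment.
function formatSegments(segments: Segment[]): string[] {
  return segments.map(
    ({ text, startSecond, endSecond }) =>
      `[${startSecond.toFixed(2)}s - ${endSecond.toFixed(2)}s] ${text.trim()}`,
  );
}

// Sample data standing in for result.segments.
const sampleSegments: Segment[] = [
  { text: ' Hello there.', startSecond: 0, endSecond: 1.5 },
  { text: ' How are you?', startSecond: 1.5, endSecond: 3.25 },
];
const lines = formatSegments(sampleSegments);
```

With a real result, the call would be `formatSegments(result.segments)`, provided the assumed shape holds.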