packages/voice/README.md
Audio transcription providers for CopilotKit.
pnpm add @copilotkit/voice openai
import { CopilotRuntime, createCopilotEndpoint } from "@copilotkit/runtime";
import { TranscriptionServiceOpenAI } from "@copilotkit/voice";
import OpenAI from "openai";
const runtime = new CopilotRuntime({
  agents: { default: yourAgent },
  transcriptionService: new TranscriptionServiceOpenAI({
    openai: new OpenAI({ apiKey: process.env.OPENAI_API_KEY }),
  }),
});
Once configured, the chat UI shows a microphone button. Users can record audio; the recording is transcribed and the resulting text is inserted into the input field.
TranscriptionServiceOpenAI uses OpenAI Whisper for transcription.
new TranscriptionServiceOpenAI({
  openai: new OpenAI({ apiKey: "..." }), // required
  model: "whisper-1", // default
  language: "en", // optional, ISO-639-1 code
  prompt: "Technical discussion context", // optional, helps with domain terms
  temperature: 0, // optional, sampling temperature; lower values give more deterministic output
});
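The language option expects an ISO-639-1 code, while browsers and operating systems usually report BCP 47 locale tags such as "en-US". A minimal sketch of bridging the two is shown below. The helper name toIso6391 is hypothetical and not part of @copilotkit/voice; it assumes the two-letter primary subtag of the locale is the code you want to pass.

```typescript
// Hypothetical helper (not part of @copilotkit/voice): reduce a locale tag
// such as "en-US" or "pt_BR" to its two-letter primary language subtag.
function toIso6391(locale: string): string | undefined {
  // The primary subtag is everything before the first "-" or "_".
  const [primary = ""] = locale.trim().split(/[-_]/);
  const code = primary.toLowerCase();
  // Only two-letter subtags are ISO-639-1 codes; anything else is rejected.
  return /^[a-z]{2}$/.test(code) ? code : undefined;
}

// Build the option so that `language` is omitted when no code could be derived.
const detected = toIso6391("pt-BR");
const languageOption = detected ? { language: detected } : {};
```

The resulting object could be spread into the constructor options next to openai and model, so transcription falls back to the service's default language handling when the locale cannot be mapped.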
To add a custom provider, extend TranscriptionService from @copilotkit/runtime and implement transcribeFile:
import {
  TranscriptionService,
  TranscribeFileOptions,
} from "@copilotkit/runtime";

class MyTranscriptionService extends TranscriptionService {
  async transcribeFile(options: TranscribeFileOptions): Promise<string> {
    // options.audioFile, options.mimeType, options.size
    return "transcribed text";
  }
}
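A custom provider may want to reject unusable uploads before calling its backend. The sketch below shows one possible pre-check using only the mimeType and size fields mentioned above. The AudioMeta interface, the validateAudio function, and the 25 MB ceiling are all illustrative assumptions rather than part of @copilotkit/runtime, and size is assumed to be in bytes.

```typescript
// Local stand-in for the two option fields this check needs. The real
// TranscribeFileOptions type comes from "@copilotkit/runtime"; the field
// types here are assumptions.
interface AudioMeta {
  mimeType: string;
  size: number; // assumed to be a byte count
}

// Illustrative upload ceiling; adjust to whatever your backend accepts.
const MAX_BYTES = 25 * 1024 * 1024;

// Returns an error message for unusable input, or null when the file looks fine.
function validateAudio(meta: AudioMeta): string | null {
  if (!meta.mimeType.startsWith("audio/")) {
    return `Unsupported MIME type: ${meta.mimeType}`;
  }
  if (meta.size <= 0) {
    return "Audio file is empty";
  }
  if (meta.size > MAX_BYTES) {
    return `Audio file exceeds ${MAX_BYTES} bytes`;
  }
  return null;
}
```

Inside transcribeFile, a non-null result from validateAudio(options) could be thrown as an Error before any network request is made, which keeps backend failures separate from bad input.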