# voice.send()
The `send()` method streams audio data in real time to voice providers for continuous processing. It is essential for real-time speech-to-speech conversations, allowing you to send microphone input directly to the AI service.
## Usage Example

```typescript
import { OpenAIRealtimeVoice } from '@mastra/voice-openai-realtime'
import Speaker from '@mastra/node-speaker'
import { getMicrophoneStream } from '@mastra/node-audio'

const speaker = new Speaker({
  sampleRate: 24100, // Audio sample rate in Hz - standard for high-quality audio on MacBook Pro
  channels: 1, // Mono audio output (as opposed to stereo, which would be 2)
  bitDepth: 16, // Bit depth for audio quality - CD-quality standard (16-bit resolution)
})

// Initialize a real-time voice provider
const voice = new OpenAIRealtimeVoice({
  realtimeConfig: {
    model: 'gpt-5.1-realtime',
    apiKey: process.env.OPENAI_API_KEY,
  },
})

// Connect to the real-time service
await voice.connect()

// Set up event listeners for responses
voice.on('writing', ({ text, role }) => {
  console.log(`${role}: ${text}`)
})

voice.on('speaker', stream => {
  stream.pipe(speaker)
})

// Get a microphone stream (implementation depends on your environment)
const microphoneStream = getMicrophoneStream()

// Send audio data to the voice provider
await voice.send(microphoneStream)

// You can also send audio data as an Int16Array
const audioBuffer = getAudioBuffer() // Assume this returns Int16Array
await voice.send(audioBuffer)
```
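When working with larger `Int16Array` buffers, you may prefer to send the audio in fixed-size frames rather than as one large buffer. The sketch below is illustrative: the helper name `chunkInt16` and the frame size are assumptions, not part of the Mastra API.

```typescript
// Hypothetical helper: split a PCM buffer into fixed-size frames.
// Neither the name nor the frame size comes from the Mastra API.
function chunkInt16(samples: Int16Array, frameSize: number): Int16Array[] {
  const frames: Int16Array[] = []
  for (let offset = 0; offset < samples.length; offset += frameSize) {
    // subarray() returns a view over the same buffer, so no samples are copied
    frames.push(samples.subarray(offset, Math.min(offset + frameSize, samples.length)))
  }
  return frames
}

// Example usage: send a buffer in 480-sample frames (about 20 ms at 24 kHz)
// for (const frame of chunkInt16(audioBuffer, 480)) {
//   await voice.send(frame)
// }
```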
## Parameters

<PropertiesTable
  content={[
    {
      name: 'audioData',
      type: 'NodeJS.ReadableStream | Int16Array',
      description:
        'Audio data to send to the voice provider. Can be a readable stream (like a microphone stream) or an Int16Array of audio samples.',
      isOptional: false,
    },
  ]}
/>
## Return Value

Returns a `Promise<void>` that resolves when the audio data has been accepted by the voice provider.
## Notes

- Call `connect()` before using `send()` to establish the WebSocket connection
- Use `send()` to transmit user audio, then call `answer()` to trigger the AI response
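Some microphone sources (for example, the Web Audio API) deliver samples as `Float32Array` values in the range [-1, 1], while `send()` accepts 16-bit PCM as an `Int16Array`. Below is a minimal conversion sketch; the helper name `floatTo16BitPCM` is an assumption for illustration, not part of the Mastra API.

```typescript
// Convert Float32 samples in [-1, 1] to 16-bit signed PCM.
// The helper name is illustrative; it is not part of the Mastra API.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const output = new Int16Array(input.length)
  for (let i = 0; i < input.length; i++) {
    // Clamp to [-1, 1] to avoid integer overflow on out-of-range samples
    const s = Math.max(-1, Math.min(1, input[i]))
    // Scale negative samples by 0x8000 and positive samples by 0x7fff
    output[i] = s < 0 ? s * 0x8000 : s * 0x7fff
  }
  return output
}

// Example usage:
// const pcm = floatTo16BitPCM(float32Chunk)
// await voice.send(pcm)
```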