Reference: Cloudflare | Voice

docs/src/content/en/reference/voice/cloudflare.mdx

2025-12-18 · 4.2 KB

Cloudflare

The CloudflareVoice class in Mastra provides text-to-speech capabilities using Cloudflare Workers AI. This provider specializes in efficient, low-latency speech synthesis suitable for edge computing environments.

Usage example

```typescript
import { CloudflareVoice } from '@mastra/voice-cloudflare'

// Initialize with configuration
const voice = new CloudflareVoice({
  speechModel: {
    name: '@cf/meta/m2m100-1.2b',
    apiKey: 'your-cloudflare-api-token',
    accountId: 'your-cloudflare-account-id',
  },
  speaker: 'en-US-1', // Default voice
})

// Convert text to speech
const audioStream = await voice.speak('Hello, how can I help you?', {
  speaker: 'en-US-2', // Override default voice
})

// Get available voices
const speakers = await voice.getSpeakers()
console.log(speakers)
```

Configuration

Constructor options

<PropertiesTable content={[ { name: 'speechModel', type: 'CloudflareSpeechConfig', description: 'Configuration for text-to-speech synthesis.', isOptional: true, properties: [ { type: 'CloudflareSpeechConfig', parameters: [ { name: 'name', type: 'string', description: 'Model name to use for TTS.', isOptional: true, defaultValue: "'@cf/meta/m2m100-1.2b'", }, { name: 'apiKey', type: 'string', description: 'Cloudflare API token with Workers AI access. Falls back to CLOUDFLARE_API_TOKEN environment variable.', isOptional: true, }, { name: 'accountId', type: 'string', description: 'Cloudflare account ID. Falls back to CLOUDFLARE_ACCOUNT_ID environment variable.', isOptional: true, }, ], }, ], }, { name: 'speaker', type: 'string', description: 'Default voice ID for speech synthesis.', isOptional: true, defaultValue: "'en-US-1'", }, ]} />
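The fallback order described for `apiKey` and `accountId` (explicit option first, then environment variable) can be sketched as below. This is an illustration of the documented behavior, not the library's actual implementation: `resolveCredentials` and the `CloudflareCredentials` interface are hypothetical names, and throwing on a missing value is an assumption.

```typescript
// Hypothetical shape for the two credentials the constructor needs.
interface CloudflareCredentials {
  apiKey: string
  accountId: string
}

// Hypothetical helper mirroring the documented fallback:
// an explicit option wins, otherwise the environment variable is used.
function resolveCredentials(
  options: Partial<CloudflareCredentials> = {},
  env: Record<string, string | undefined> = process.env,
): CloudflareCredentials {
  const apiKey = options.apiKey ?? env.CLOUDFLARE_API_TOKEN
  const accountId = options.accountId ?? env.CLOUDFLARE_ACCOUNT_ID

  // Assumption: fail early rather than send an unauthenticated request.
  if (!apiKey) {
    throw new Error('Missing API token: pass apiKey or set CLOUDFLARE_API_TOKEN')
  }
  if (!accountId) {
    throw new Error('Missing account ID: pass accountId or set CLOUDFLARE_ACCOUNT_ID')
  }
  return { apiKey, accountId }
}
```

Resolving credentials in one place like this keeps secrets out of source files: deployed code can call the constructor with no `speechModel` credentials and rely on the environment.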

Methods

speak()

Converts text to speech using Cloudflare's text-to-speech service.

<PropertiesTable content={[ { name: 'input', type: 'string | NodeJS.ReadableStream', description: 'Text or text stream to convert to speech.', isOptional: false, }, { name: 'options', type: 'Options', description: 'Configuration options.', isOptional: true, properties: [ { type: 'Options', parameters: [ { name: 'speaker', type: 'string', description: 'Voice ID to use for speech synthesis.', isOptional: true, defaultValue: "Constructor's speaker value", }, { name: 'format', type: 'string', description: 'Output audio format.', isOptional: true, defaultValue: "'mp3'", }, ], }, ], }, ]} />

Returns: Promise<NodeJS.ReadableStream>
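Because `speak()` resolves to a Node.js readable stream, the audio arrives in chunks that the caller has to consume. A minimal sketch of buffering such a stream in memory, assuming a Node.js runtime: `collectAudio` is a hypothetical helper, and `fakeAudio` is a stand-in for a real `voice.speak()` result so the snippet runs without Cloudflare credentials.

```typescript
import { Readable } from 'node:stream'

// Hypothetical helper: read a stream (such as the one returned by
// voice.speak()) to the end and return its contents as one Buffer.
async function collectAudio(audio: NodeJS.ReadableStream): Promise<Buffer> {
  const chunks: Buffer[] = []
  for await (const chunk of audio) {
    // Streams may yield strings or Buffers; normalize to Buffer.
    chunks.push(typeof chunk === 'string' ? Buffer.from(chunk) : chunk)
  }
  return Buffer.concat(chunks)
}

// Stand-in for `await voice.speak('...')`; not real audio data.
const fakeAudio = Readable.from([Buffer.from('fake '), Buffer.from('audio')])

collectAudio(fakeAudio).then((buffer) => {
  console.log(`received ${buffer.length} bytes`)
})
```

The resulting buffer can then be written to disk (for example with `writeFile` from `node:fs/promises`) or returned in an HTTP response. For long inputs, piping the stream directly avoids holding the whole clip in memory.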

getSpeakers()

Returns an array of available voice options, where each entry contains:

<PropertiesTable content={[ { name: 'voiceId', type: 'string', description: "Unique identifier for the voice (e.g., 'en-US-1')", isOptional: false, }, { name: 'language', type: 'string', description: "Language code of the voice (e.g., 'en-US')", isOptional: false, }, ]} />
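Given the `voiceId` and `language` fields above, a caller can choose a voice by language before calling `speak()`. The sketch below is illustrative only: `Speaker` and `pickSpeaker` are hypothetical names, and `sampleSpeakers` is made-up data standing in for the result of `await voice.getSpeakers()`.

```typescript
// Shape of each entry, as documented for getSpeakers().
interface Speaker {
  voiceId: string
  language: string
}

// Hypothetical helper: return the first voice matching a language code
// (case-insensitive), or the fallback voice ID when none matches.
function pickSpeaker(speakers: Speaker[], language: string, fallback: string): string {
  const wanted = language.toLowerCase()
  const match = speakers.find((s) => s.language.toLowerCase() === wanted)
  return match ? match.voiceId : fallback
}

// Illustrative data only; real values come from `await voice.getSpeakers()`.
const sampleSpeakers: Speaker[] = [
  { voiceId: 'en-US-1', language: 'en-US' },
  { voiceId: 'en-US-2', language: 'en-US' },
]

console.log(pickSpeaker(sampleSpeakers, 'en-us', 'en-US-2')) // en-US-1
console.log(pickSpeaker(sampleSpeakers, 'fr-FR', 'en-US-2')) // en-US-2
```

The chosen ID can be passed as the `speaker` option to `speak()`, which otherwise uses the constructor's default.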

Notes

  • The API token and account ID can be provided via constructor options or the CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID environment variables
  • Cloudflare Workers AI is optimized for edge computing with low latency
  • This provider only supports text-to-speech (TTS) functionality, not speech-to-text (STT)
  • The service integrates well with other Cloudflare Workers products
  • For production use, ensure your Cloudflare account has the appropriate Workers AI subscription
  • Fewer voices are available than with some other providers, but performance at the edge is excellent

If you need speech-to-text capabilities in addition to text-to-speech, consider using one of these providers:

  • OpenAI - Provides both TTS and STT
  • Google - Provides both TTS and STT
  • Azure - Provides both TTS and STT