Cloudflare

The CloudflareVoice class in Mastra provides text-to-speech capabilities using Cloudflare Workers AI. This provider specializes in efficient, low-latency speech synthesis suitable for edge computing environments.

Usage example

```typescript
import { CloudflareVoice } from '@mastra/voice-cloudflare'

// Initialize with configuration
const voice = new CloudflareVoice({
  speechModel: {
    name: '@cf/meta/m2m100-1.2b',
    apiKey: 'your-cloudflare-api-token',
    accountId: 'your-cloudflare-account-id',
  },
  speaker: 'en-US-1', // Default voice
})

// Convert text to speech
const audioStream = await voice.speak('Hello, how can I help you?', {
  speaker: 'en-US-2', // Override default voice
})

// Get available voices
const speakers = await voice.getSpeakers()
console.log(speakers)
```

Configuration

Constructor options

Methods

`speak()`

Converts text to speech using Cloudflare's text-to-speech service.

Returns: Promise<NodeJS.ReadableStream>
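
Because `speak()` resolves to a Node.js readable stream, the audio can be piped to any stream destination. The following is a minimal sketch, assuming a Node.js runtime. `saveAudio` and `demo` are hypothetical helpers, not part of Mastra, and the stand-in stream replaces a real `voice.speak()` call so the snippet runs without Cloudflare credentials.

```typescript
import { createWriteStream } from 'node:fs'
import { readFile } from 'node:fs/promises'
import { tmpdir } from 'node:os'
import { join } from 'node:path'
import { Readable } from 'node:stream'
import { pipeline } from 'node:stream/promises'

// Hypothetical helper (not part of Mastra): writes a readable stream to disk
// and resolves once the file has been fully written.
export async function saveAudio(
  audio: NodeJS.ReadableStream,
  filePath: string,
): Promise<void> {
  // pipeline() handles backpressure and destroys both streams on error.
  await pipeline(audio, createWriteStream(filePath))
}

// Stand-in for `await voice.speak(...)`: two 4-byte chunks of placeholder data.
export async function demo(): Promise<number> {
  const fakeAudio = Readable.from([Buffer.from('RIFF'), Buffer.from('data')])
  const target = join(tmpdir(), 'mastra-voice-demo.bin')
  await saveAudio(fakeAudio, target)
  // Read the file back to confirm how many bytes were written.
  return (await readFile(target)).byteLength
}
```

With a real provider the call would look like `await saveAudio(await voice.speak('Hello'), 'hello.audio')`. The audio container format is not stated on this page, so the file extension here is a placeholder.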

`getSpeakers()`

Returns an array of available voice options, where each entry contains:

Notes

The API token and account ID can be provided via constructor options or the environment variables CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID
Cloudflare Workers AI is optimized for edge computing with low latency
This provider only supports text-to-speech (TTS) functionality, not speech-to-text (STT)
The service integrates well with other Cloudflare Workers products
For production use, ensure your Cloudflare account has the appropriate Workers AI subscription
Voice options are more limited than with some other providers, but performance at the edge is excellent
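
The credential lookup described in the first note can be sketched as a small resolver. This is an illustration only: `resolveCredentials` is a hypothetical helper, not the provider's internal code, and the precedence shown (explicit options win over environment variables) is an assumption rather than documented behavior.

```typescript
// Hypothetical shape mirroring the apiKey/accountId fields used in the usage example.
interface CloudflareCredentials {
  apiKey: string
  accountId: string
}

// Hypothetical helper (not part of Mastra): prefers explicit options and
// falls back to CLOUDFLARE_API_TOKEN / CLOUDFLARE_ACCOUNT_ID.
export function resolveCredentials(
  options: Partial<CloudflareCredentials> = {},
  env: Record<string, string | undefined> = process.env,
): CloudflareCredentials {
  const apiKey = options.apiKey ?? env.CLOUDFLARE_API_TOKEN
  const accountId = options.accountId ?? env.CLOUDFLARE_ACCOUNT_ID

  // Fail early with a clear message instead of sending an unauthenticated request.
  if (!apiKey || !accountId) {
    throw new Error(
      'Missing Cloudflare credentials: pass apiKey and accountId, or set ' +
        'CLOUDFLARE_API_TOKEN and CLOUDFLARE_ACCOUNT_ID',
    )
  }
  return { apiKey, accountId }
}
```

The resolved object could then be spread into the `speechModel` option of the constructor, which keeps tokens out of source control when the environment variables are set.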

If you need speech-to-text capabilities in addition to text-to-speech, consider using one of these providers:

OpenAI - Provides both TTS and STT
Google - Provides both TTS and STT
Azure - Provides both TTS and STT
