Text-to-Speech Plugin

packages/docs/plugin-registry/tts.md

The Text-to-Speech (TTS) plugin enables Eliza agents to synthesize speech from text, providing voice responses through ElevenLabs, OpenAI TTS, or Microsoft Edge TTS.

Package: @elizaos/plugin-tts

Overview

The TTS plugin registers a TEXT_TO_SPEECH model handler and actions that let agents generate audio from text. Generated audio can be played in Discord voice channels, sent as Telegram voice messages, saved to files, or streamed to the client.

Installation

bash
eliza plugins install @elizaos/plugin-tts

Enable via Features

json
{
  "features": {
    "tts": true
  }
}

Configuration

| Environment Variable | Required | Description |
| --- | --- | --- |
| TTS_AUTO_MODE | No | Enable automatic TTS for all responses |
| TTS_SUMMARIZE | No | Summarize long text before synthesis |
| TTS_MAX_LENGTH | No | Maximum text length for synthesis |
| TTS_DEFAULT_VOICE | No | Default voice profile |
| TTS_DEFAULT_PROVIDER | No | Default TTS provider (elevenlabs, openai, or edge-tts) |
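
As an illustration of how these variables might be consumed, here is a minimal sketch that turns an environment map into a typed config object. The `TtsEnvConfig` shape, the `loadTtsConfig` and `parseFlag` names, and the accepted flag spellings (`"true"` or `"1"`) are assumptions made for this example; the plugin's actual parsing rules are not documented here.

```typescript
// Hypothetical config shape; the plugin's real type is not shown in this document.
type TtsProvider = "elevenlabs" | "openai" | "edge-tts";

interface TtsEnvConfig {
  autoMode: boolean;
  summarize: boolean;
  maxLength?: number;
  defaultVoice?: string;
  defaultProvider?: TtsProvider;
}

const PROVIDERS: readonly TtsProvider[] = ["elevenlabs", "openai", "edge-tts"];

// Assumption: boolean flags are written as "true" or "1".
function parseFlag(value: string | undefined): boolean {
  if (value === undefined) return false;
  const normalized = value.trim().toLowerCase();
  return normalized === "true" || normalized === "1";
}

function loadTtsConfig(env: Record<string, string | undefined>): TtsEnvConfig {
  const config: TtsEnvConfig = {
    autoMode: parseFlag(env["TTS_AUTO_MODE"]),
    summarize: parseFlag(env["TTS_SUMMARIZE"]),
  };

  // Assumption: a non-numeric or non-positive TTS_MAX_LENGTH is ignored.
  const rawMax = env["TTS_MAX_LENGTH"];
  if (rawMax !== undefined) {
    const parsed = Number(rawMax);
    if (Number.isInteger(parsed) && parsed > 0) config.maxLength = parsed;
  }

  const voice = env["TTS_DEFAULT_VOICE"];
  if (voice) config.defaultVoice = voice;

  // Only the three provider names listed in the table are accepted.
  const provider = env["TTS_DEFAULT_PROVIDER"];
  if (provider !== undefined) {
    const match = PROVIDERS.find((p) => p === provider);
    if (match === undefined) {
      throw new Error(`Unknown TTS_DEFAULT_PROVIDER: ${provider}`);
    }
    config.defaultProvider = match;
  }

  return config;
}
```

In a Node.js process this could be called as `loadTtsConfig(process.env)`. Rejecting unknown provider names early is a design choice of the sketch, not documented plugin behavior.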

Providers

ElevenLabs

High-quality voice synthesis with voice cloning and emotion control.

Package: @elizaos/plugin-elevenlabs

| Environment Variable | Required | Description |
| --- | --- | --- |
| ELEVENLABS_API_KEY | Yes | ElevenLabs API key from elevenlabs.io |
| ELEVENLABS_VOICE_ID | No | Voice ID (default: Rachel) |
| ELEVENLABS_MODEL_ID | No | Model ID (default: eleven_turbo_v2_5) |

json
{
  "features": {
    "tts": {
      "enabled": true,
      "provider": "elevenlabs",
      "voiceId": "21m00Tcm4TlvDq8ikWAM",
      "modelId": "eleven_turbo_v2_5"
    }
  }
}

OpenAI TTS

json
{
  "features": {
    "tts": {
      "enabled": true,
      "provider": "openai",
      "voice": "alloy",
      "model": "tts-1"
    }
  }
}

Requires OPENAI_API_KEY.

Edge TTS (Free, No API Key)

Package: @elizaos/plugin-edge-tts

Microsoft Edge TTS is free and requires no API key. Synthesis is not local: the node-edge-tts library sends the text to Microsoft’s Edge TTS cloud service. Quality is lower than ElevenLabs but suitable for development.

Eliza default: When @elizaos/plugin-agent-orchestrator is loaded, Eliza automatically adds @elizaos/plugin-edge-tts so that swarm and PTY paths that call TEXT_TO_SPEECH have a handler. As a result, a default install with the orchestrator can make outbound calls to Microsoft whenever those code paths run TTS, even if you never enabled “TTS” in features.

Opt out of auto-load: set ELIZA_DISABLE_EDGE_TTS=1 in the environment or ~/.eliza/.env, or disable the plugin entry: plugins.entries["edge-tts"].enabled: false. See Environment variables (ELIZA_DISABLE_EDGE_TTS).

json
{
  "features": {
    "tts": {
      "enabled": true,
      "provider": "edge-tts",
      "voice": "en-US-AriaNeural"
    }
  }
}
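
The auto-load and opt-out rules described above can be summarized as a small decision function. This is a hypothetical sketch that only restates the documented rules; `shouldAutoLoadEdgeTts`, `ElizaConfigSketch`, and the assumption that only the exact value `"1"` disables auto-loading are not taken from Eliza's loader code.

```typescript
// Hypothetical types: only the plugins.entries["edge-tts"].enabled path
// comes from the documentation above.
interface PluginEntry {
  enabled?: boolean;
}

interface ElizaConfigSketch {
  plugins?: { entries?: Record<string, PluginEntry | undefined> };
}

function shouldAutoLoadEdgeTts(
  orchestratorLoaded: boolean,
  env: Record<string, string | undefined>,
  config: ElizaConfigSketch,
): boolean {
  // Auto-loading is only triggered by the agent orchestrator plugin.
  if (!orchestratorLoaded) return false;

  // Opt-out 1: environment variable (assumed to require the exact value "1").
  if (env["ELIZA_DISABLE_EDGE_TTS"] === "1") return false;

  // Opt-out 2: the plugin entry is explicitly disabled in config.
  const entry = config.plugins?.entries?.["edge-tts"];
  if (entry?.enabled === false) return false;

  return true;
}
```

Under these assumptions, a default install with the orchestrator and no opt-out returns `true`, which matches the warning about outbound calls to Microsoft.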

ElevenLabs Voice Options

| Voice ID | Name | Description |
| --- | --- | --- |
| 21m00Tcm4TlvDq8ikWAM | Rachel | Calm, professional female |
| AZnzlk1XvdvUeBnXmlld | Domi | Strong female |
| EXAVITQu4vr4xnSDxMaL | Bella | Soft female |
| ErXwobaYiN019PkySvjV | Antoni | Well-rounded male |
| MF3mGyEYCl7XYWbV9V6O | Elli | Emotional female |
| TxGEqnHWrfWFTfGW9XjX | Josh | Deep male |

Browse all voices at elevenlabs.io/voice-library.

ElevenLabs Models

| Model ID | Description |
| --- | --- |
| eleven_turbo_v2_5 | Fastest, lowest latency |
| eleven_turbo_v2 | Fast, good quality |
| eleven_multilingual_v2 | Multilingual support |
| eleven_monolingual_v1 | English only, high quality |
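
To show where the voice ID and model ID end up, the sketch below builds a request for the public ElevenLabs text-to-speech REST endpoint. It illustrates the underlying HTTP call only and is not a description of how @elizaos/plugin-elevenlabs is implemented; `buildElevenLabsRequest` and `HttpRequestSketch` are names invented for this example.

```typescript
// Plain description of an HTTP request; nothing is sent by this code.
interface HttpRequestSketch {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildElevenLabsRequest(
  text: string,
  apiKey: string,
  voiceId: string = "21m00Tcm4TlvDq8ikWAM", // Rachel, the documented default voice
  modelId: string = "eleven_turbo_v2_5", // the documented default model
): HttpRequestSketch {
  return {
    // The voice ID is part of the URL path.
    url: `https://api.elevenlabs.io/v1/text-to-speech/${encodeURIComponent(voiceId)}`,
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    // The model ID travels in the JSON body next to the text.
    body: JSON.stringify({ text, model_id: modelId }),
  };
}
```

The returned object can be passed to `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`; on success the response body contains the audio bytes.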

OpenAI TTS Options

Voices

| Voice | Description |
| --- | --- |
| alloy | Neutral |
| echo | Male |
| fable | British male |
| onyx | Deep male |
| nova | Female |
| shimmer | Soft female |

Models

| Model | Description |
| --- | --- |
| tts-1 | Faster, lower latency |
| tts-1-hd | Higher quality |
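
The voices and models listed above map directly onto the fields of the OpenAI speech endpoint. The following sketch encodes them as TypeScript union types so that a misspelled voice or model fails at compile time. It builds the request for the public OpenAI API and is not the plugin's own code; `buildOpenAiSpeechRequest` and the type names are assumptions for illustration.

```typescript
// The unions below list only the voices and models documented on this page.
type OpenAiVoice = "alloy" | "echo" | "fable" | "onyx" | "nova" | "shimmer";
type OpenAiTtsModel = "tts-1" | "tts-1-hd";

interface OpenAiSpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildOpenAiSpeechRequest(
  input: string,
  apiKey: string, // value of OPENAI_API_KEY
  voice: OpenAiVoice = "alloy",
  model: OpenAiTtsModel = "tts-1",
): OpenAiSpeechRequest {
  return {
    url: "https://api.openai.com/v1/audio/speech",
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // response_format is set explicitly; "mp3" is one of the formats the API accepts.
    body: JSON.stringify({ model, voice, input, response_format: "mp3" }),
  };
}
```

Choosing `tts-1` as the default follows the "faster, lower latency" description; switching to `tts-1-hd` trades latency for quality.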

Actions

| Action | Description |
| --- | --- |
| SPEAK | Convert text to speech and play/return audio |
| GENERATE_MEDIA | Generate an audio file from text using mediaType: "audio" |
| SET_VOICE | Change the active voice |

Usage Examples

After the plugin is loaded:

"Read this article to me"

"Say the following in a cheerful voice: Welcome to Eliza!"

"Generate an audio file from this text"

Voice Channel Integration

When combined with Discord or Telegram connectors, the TTS plugin enables voice channel support:

  • Discord: Agent joins voice channels and speaks responses
  • Telegram: Agent sends voice messages as .ogg files

Output Formats

| Format | Use Case |
| --- | --- |
| mp3 | Streaming, Discord, general |
| ogg_vorbis | Telegram voice messages |
| pcm | Low-latency streaming |
| wav | Archival, high quality |
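
One way to apply the table above is a lookup from delivery target to output format. This is a sketch under stated assumptions: the `DeliveryTarget` names and both functions are hypothetical, and the file extensions other than `.ogg` (mentioned for Telegram earlier) are conventional choices rather than documented plugin behavior.

```typescript
type OutputFormat = "mp3" | "ogg_vorbis" | "pcm" | "wav";

// Hypothetical target names, chosen to match the use cases in the table.
type DeliveryTarget = "discord" | "telegram" | "low-latency-stream" | "archive" | "general";

function pickOutputFormat(target: DeliveryTarget): OutputFormat {
  switch (target) {
    case "telegram":
      return "ogg_vorbis"; // Telegram voice messages
    case "low-latency-stream":
      return "pcm"; // low-latency streaming
    case "archive":
      return "wav"; // archival, high quality
    case "discord":
    case "general":
      return "mp3"; // streaming, Discord, general
  }
}

// Assumption: conventional extensions; only ".ogg" is stated in this document.
function fileExtension(format: OutputFormat): string {
  const extensions: Record<OutputFormat, string> = {
    mp3: ".mp3",
    ogg_vorbis: ".ogg",
    pcm: ".pcm",
    wav: ".wav",
  };
  return extensions[format];
}
```

For example, `fileExtension(pickOutputFormat("telegram"))` yields `".ogg"`, consistent with the Telegram voice message behavior described under Voice Channel Integration.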