import { Callout, Steps, Tabs, Cards } from "@/docs/components";

Voice Cloning

Create custom AI voices and generate speech with ElevenLabs integration.

<div className="status-badge status-beta">Beta</div>

Overview

Voice cloning on elizaOS Cloud enables you to:

  • Clone voices: Create AI replicas of any voice
  • Generate speech: Convert text to natural-sounding audio
  • Custom voices: Use cloned voices in your agents
  • Multi-language: Support for 29+ languages

Quick Start

Dashboard

Navigate to Dashboard → Voices for the visual interface.

API

<Tabs items={['Clone Voice', 'Generate Speech', 'List Voices']}> <Tabs.Tab>

bash
# Clone a voice from audio samples
curl -X POST "https://elizacloud.ai/api/elevenlabs/voices/clone" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "name=My Voice Clone" \
  -F "[email protected]" \
  -F "[email protected]"

</Tabs.Tab> <Tabs.Tab>

bash
# Generate speech
curl -X POST "https://elizacloud.ai/api/elevenlabs/tts" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hello, this is a test of voice synthesis.",
    "voice_id": "voice_abc123",
    "model_id": "eleven_multilingual_v2"
  }' \
  --output speech.mp3

</Tabs.Tab> <Tabs.Tab>

bash
# List your voices
curl -X GET "https://elizacloud.ai/api/elevenlabs/voices/user" \
  -H "Authorization: Bearer YOUR_API_KEY"

</Tabs.Tab> </Tabs>

Voice Cloning

<Steps>

### Prepare Audio Samples

Gather 1-3 minutes of clean audio from the target voice.

Requirements:

  • Clear speech without background noise
  • Single speaker only
  • High quality (WAV, MP3, or M4A; 44.1kHz or higher)

### Upload Samples

Upload audio files via the dashboard or the API.

### Create Clone

Submit the cloning request and wait for processing.

### Verify Quality

Test the cloned voice with sample text.

</Steps>
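
The upload and create steps can be sketched in JavaScript as follows. This is a minimal sketch, not a confirmed client: the multipart field name for the audio (`files` below), the helper names `buildCloneForm` and `cloneVoice`, and the assumption that the endpoint answers with JSON are all illustrative. It assumes a runtime with global `fetch`, `FormData`, and `Blob` (browsers, or Node.js 18+).

```javascript
// Sketch: build the multipart body for POST /api/elevenlabs/voices/clone.
// ASSUMPTION: the audio field is called "files"; it is a parameter so you
// can change it once you have confirmed the real name.
function buildCloneForm(name, samples, fileField = "files") {
  if (!name) throw new Error("A voice name is required");
  if (!samples || samples.length === 0) {
    throw new Error("At least one audio sample is required");
  }

  const form = new FormData();
  form.append("name", name);
  for (const { data, filename } of samples) {
    // Each sample is a Blob (or File) plus the filename to send with it
    form.append(fileField, data, filename);
  }
  return form;
}

// Sends the form. fetch sets the multipart Content-Type and boundary itself,
// so only the Authorization header is supplied here.
// ASSUMPTION: the endpoint responds with a JSON body.
async function cloneVoice(apiKey, form) {
  const response = await fetch(
    "https://elizacloud.ai/api/elevenlabs/voices/clone",
    {
      method: "POST",
      headers: { Authorization: `Bearer ${apiKey}` },
      body: form,
    }
  );
  if (!response.ok) {
    throw new Error(`Clone request failed: ${response.status}`);
  }
  return response.json();
}
```

Keeping the form construction separate from the network call makes it possible to check the payload before anything is uploaded.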

Sample Requirements

| Requirement | Recommendation |
| --- | --- |
| Duration | 1-3 minutes total |
| Format | WAV, MP3, M4A |
| Quality | 44.1kHz, 16-bit minimum |
| Content | Natural speech, varied intonation |
| Noise | Minimal background noise |

<Callout type="warning"> Using someone's voice without permission may violate their rights. Only clone voices you have rights to use. </Callout>
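
The measurable rows of the sample requirements can be checked locally before uploading. The sketch below is only an illustration: `checkSamples` and the shape of its input (`filename`, `seconds`, `sampleRateHz`, `bitDepth`) are hypothetical, and it assumes you have already measured those values with your own audio tooling. Content and noise quality still need a human ear.

```javascript
// Thresholds copied from the Sample Requirements table
const SAMPLE_LIMITS = {
  minTotalSeconds: 60,
  maxTotalSeconds: 180,
  formats: ["wav", "mp3", "m4a"],
  minSampleRateHz: 44100,
  minBitDepth: 16,
};

// `samples` is a hypothetical array of
// { filename, seconds, sampleRateHz, bitDepth } objects.
// Returns a list of human-readable problems; an empty list means no issues found.
function checkSamples(samples) {
  const problems = [];

  // Duration is a recommendation for the total across all samples
  const total = samples.reduce((sum, s) => sum + s.seconds, 0);
  if (total < SAMPLE_LIMITS.minTotalSeconds || total > SAMPLE_LIMITS.maxTotalSeconds) {
    problems.push(`total duration ${total}s is outside 1-3 minutes`);
  }

  for (const s of samples) {
    const ext = s.filename.split(".").pop().toLowerCase();
    if (!SAMPLE_LIMITS.formats.includes(ext)) {
      problems.push(`${s.filename}: format is not WAV, MP3, or M4A`);
    }
    if (s.sampleRateHz < SAMPLE_LIMITS.minSampleRateHz) {
      problems.push(`${s.filename}: sample rate is below 44.1kHz`);
    }
    if (s.bitDepth < SAMPLE_LIMITS.minBitDepth) {
      problems.push(`${s.filename}: bit depth is below 16-bit`);
    }
  }
  return problems;
}
```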

Text-to-Speech

Generate Speech

javascript
const response = await fetch("https://elizacloud.ai/api/elevenlabs/tts", {
  method: "POST",
  headers: {
    Authorization: "Bearer YOUR_API_KEY",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    text: "Welcome to elizaOS Cloud!",
    voice_id: "voice_abc123",
    model_id: "eleven_multilingual_v2",
    voice_settings: {
      stability: 0.5,
      similarity_boost: 0.75,
    },
  }),
});

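// Fail fast on HTTP errors instead of treating an error body as audio
if (!response.ok) {
  throw new Error(`TTS request failed: ${response.status}`);
}

// Read the audio and create an object URL for playback in the browser
const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);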

Voice Settings

| Setting | Range | Description |
| --- | --- | --- |
| `stability` | 0-1 | Higher = more consistent, lower = more expressive |
| `similarity_boost` | 0-1 | How closely to match the original voice |
| `style` | 0-1 | Style exaggeration (v2 models only) |
| `use_speaker_boost` | bool | Enhance speaker similarity |
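
Because the numeric settings are bounded to 0-1, it can help to normalize them before building the request body. The helper below is a hypothetical sketch (`normalizeVoiceSettings` is not part of any SDK). This page does not say whether the API rejects or clamps out-of-range values, so clamping on the client is a design choice here, not documented behavior.

```javascript
// Clamp numeric settings into the documented 0-1 range and coerce the
// boolean flag. Keys that were not provided are left out of the result.
function normalizeVoiceSettings(settings = {}) {
  const clamp01 = (value) => Math.min(1, Math.max(0, value));
  const result = {};

  for (const key of ["stability", "similarity_boost", "style"]) {
    if (settings[key] !== undefined) {
      const value = Number(settings[key]);
      if (Number.isNaN(value)) {
        throw new TypeError(`${key} must be a number`);
      }
      result[key] = clamp01(value);
    }
  }

  if (settings.use_speaker_boost !== undefined) {
    result.use_speaker_boost = Boolean(settings.use_speaker_boost);
  }
  return result;
}
```

The result can be passed as `voice_settings` in the request body of the text-to-speech call.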

Available Models

| Model | Languages | Quality | Speed |
| --- | --- | --- | --- |
| `eleven_multilingual_v2` | 29 | Highest | Medium |
| `eleven_monolingual_v1` | English | High | Fast |
| `eleven_turbo_v2` | English | Good | Fastest |
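
The trade-off in the model table can be expressed as a small selection helper. This is a sketch under stated assumptions: `pickModel`, its `language` and `prefer` options, and the numeric rankings are hypothetical, and it does not check whether a given language is actually among the 29 supported ones.

```javascript
// Rows copied from the Available Models table
const MODELS = [
  { id: "eleven_multilingual_v2", multilingual: true, quality: "Highest", speed: "Medium" },
  { id: "eleven_monolingual_v1", multilingual: false, quality: "High", speed: "Fast" },
  { id: "eleven_turbo_v2", multilingual: false, quality: "Good", speed: "Fastest" },
];

// Numeric ranks so the table's labels can be compared
const QUALITY_RANK = { Good: 1, High: 2, Highest: 3 };
const SPEED_RANK = { Medium: 1, Fast: 2, Fastest: 3 };

// `language` is a hypothetical ISO 639-1 code such as "en" or "es";
// `prefer` is either "quality" (default) or "speed".
function pickModel({ language = "en", prefer = "quality" } = {}) {
  // English-only models are candidates only for English text
  const candidates = MODELS.filter((m) => m.multilingual || language === "en");
  const rank =
    prefer === "speed" ? (m) => SPEED_RANK[m.speed] : (m) => QUALITY_RANK[m.quality];
  return candidates.reduce((best, m) => (rank(m) > rank(best) ? m : best)).id;
}
```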

Pre-built Voices

elizaOS Cloud provides pre-built voices:

bash
curl -X GET "https://elizacloud.ai/api/elevenlabs/voices" \
  -H "Authorization: Bearer YOUR_API_KEY"

Example response:

json
{
  "voices": [
    {
      "voice_id": "21m00Tcm4TlvDq8ikWAM",
      "name": "Rachel",
      "labels": { "accent": "american", "age": "young" },
      "preview_url": "https://..."
    },
    {
      "voice_id": "AZnzlk1XvdvUeBnXmlld",
      "name": "Domi",
      "labels": { "accent": "american", "age": "young" }
    }
  ]
}
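
Once the list is loaded, voices can be filtered by their `labels`. The helper below is a hypothetical sketch (`findVoices` is not an API function) and uses the example response shown here as its data.

```javascript
// Example data in the shape of the response above
const exampleResponse = {
  voices: [
    {
      voice_id: "21m00Tcm4TlvDq8ikWAM",
      name: "Rachel",
      labels: { accent: "american", age: "young" },
    },
    {
      voice_id: "AZnzlk1XvdvUeBnXmlld",
      name: "Domi",
      labels: { accent: "american", age: "young" },
    },
  ],
};

// Return the voices whose labels match every key/value pair in `wanted`.
// Voices without a `labels` object never match a non-empty filter.
function findVoices(voices, wanted) {
  return voices.filter((voice) =>
    Object.entries(wanted).every(
      ([key, value]) => voice.labels !== undefined && voice.labels[key] === value
    )
  );
}

// Collect the names of all voices labeled with an American accent
const americanNames = findVoices(exampleResponse.voices, { accent: "american" })
  .map((voice) => voice.name);
```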

Voice Management

Get Voice Details

bash
curl -X GET "https://elizacloud.ai/api/elevenlabs/voices/voice_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY"

Delete Voice

bash
curl -X DELETE "https://elizacloud.ai/api/elevenlabs/voices/voice_abc123" \
  -H "Authorization: Bearer YOUR_API_KEY"

Check Clone Status

bash
curl -X GET "https://elizacloud.ai/api/elevenlabs/voices/jobs" \
  -H "Authorization: Bearer YOUR_API_KEY"
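
For JavaScript callers, the three management calls above can share one request builder. This is a minimal sketch: `voiceRequest` and its action names (`details`, `remove`, `jobs`) are hypothetical, and only the URLs, methods, and header shown in the curl commands are taken from this page.

```javascript
const BASE_URL = "https://elizacloud.ai/api/elevenlabs";

// Map a management action to the URL and fetch options shown in the
// curl commands. Nothing is sent here; the caller passes the result to fetch.
function voiceRequest(action, apiKey, voiceId) {
  const routes = {
    details: () => ({ method: "GET", path: `/voices/${encodeURIComponent(voiceId)}` }),
    remove: () => ({ method: "DELETE", path: `/voices/${encodeURIComponent(voiceId)}` }),
    jobs: () => ({ method: "GET", path: "/voices/jobs" }),
  };

  if (!Object.hasOwn(routes, action)) {
    throw new Error(`Unknown action: ${action}`);
  }
  if (action !== "jobs" && !voiceId) {
    throw new Error("voiceId is required for this action");
  }

  const { method, path } = routes[action]();
  return {
    url: BASE_URL + path,
    init: { method, headers: { Authorization: `Bearer ${apiKey}` } },
  };
}
```

A call would then look like `const { url, init } = voiceRequest("details", apiKey, "voice_abc123")` followed by `fetch(url, init)`.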

Agent Integration

Use cloned voices with your agents:

json
{
  "name": "Voice Assistant",
  "bio": ["Helpful AI assistant with custom voice"],
  "settings": {
    "voice": {
      "provider": "elevenlabs",
      "voiceId": "voice_abc123",
      "model": "eleven_multilingual_v2"
    }
  }
}
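
If agent configurations are generated in code, the voice block can be merged in without disturbing other settings. The helper below is a hypothetical sketch (`withVoice` is not an elizaOS function). It only reproduces the `settings.voice` shape shown above.

```javascript
// Return a copy of `character` with the ElevenLabs voice settings applied.
// The input object is not mutated, and other keys under `settings` are kept.
function withVoice(character, voiceId, model = "eleven_multilingual_v2") {
  if (!voiceId) throw new Error("voiceId is required");
  return {
    ...character,
    settings: {
      ...character.settings,
      voice: { provider: "elevenlabs", voiceId, model },
    },
  };
}

// Build the same configuration as the JSON example above
const assistant = withVoice(
  { name: "Voice Assistant", bio: ["Helpful AI assistant with custom voice"] },
  "voice_abc123"
);
```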

Speech-to-Text

Convert audio to text:

bash
curl -X POST "https://elizacloud.ai/api/elevenlabs/stt" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "[email protected]"

Example response:

json
{
  "text": "This is the transcribed text from the audio.",
  "confidence": 0.95,
  "words": [
    { "word": "This", "start": 0.0, "end": 0.2, "confidence": 0.98 },
    { "word": "is", "start": 0.2, "end": 0.3, "confidence": 0.99 }
  ]
}
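
The per-word `confidence` values make it straightforward to flag parts of a transcript for review. The sketch below assumes the response shape shown above. `lowConfidenceWords` and its default threshold of 0.9 are hypothetical choices, not documented values.

```javascript
// Example data in the shape of the response above
const transcript = {
  text: "This is the transcribed text from the audio.",
  confidence: 0.95,
  words: [
    { word: "This", start: 0.0, end: 0.2, confidence: 0.98 },
    { word: "is", start: 0.2, end: 0.3, confidence: 0.99 },
  ],
};

// Return the words whose confidence is below `threshold`, with their
// timestamps, so they can be highlighted or re-checked by a person.
function lowConfidenceWords(result, threshold = 0.9) {
  return result.words
    .filter((w) => w.confidence < threshold)
    .map((w) => ({ word: w.word, start: w.start, end: w.end }));
}

// With the default threshold, neither example word is flagged
const flagged = lowConfidenceWords(transcript);
```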

Pricing

See Billing & Credits and the dashboard/API Explorer for current voice pricing. Voice cloning and TTS/STT billing can vary by provider configuration.

<Callout type="info"> Monitor your voice usage in the [billing dashboard](/dashboard/billing). </Callout>

Best Practices

  • Quality Samples: Use high-quality, noise-free audio (44.1kHz+, minimal background noise)
  • Natural Speech: Include varied intonation, pacing, and emotional range in samples
  • Sufficient Length: Provide 1-3 minutes of audio for best clone quality
  • Test Thoroughly: Verify clone quality with diverse text before production use

Next Steps

<Cards>
  <Cards.Card title="AI Agents" href="/docs/agents">
    Add voice to your agents
  </Cards.Card>
  <Cards.Card title="Video Generation" href="/docs/video-generation">
    Add voiceovers to videos
  </Cards.Card>
  <Cards.Card title="API Reference" href="/docs/api">
    Complete API documentation
  </Cards.Card>
</Cards>