readme/README_ben.md
CPU/GPU converter from e-books to audiobooks with chapters and metadata,
using advanced TTS engines and more.
Supports voice cloning and 1158 languages!
[!IMPORTANT] This tool is intended for use only with DRM-free, legally acquired e-books.
The authors are not responsible for any misuse of this software or for any legal consequences arising from it.
Use this tool responsibly and in accordance with all applicable laws.
New default voice demo
https://github.com/user-attachments/assets/750035dc-e355-46f1-9286-05c1d9e88cea
<details> <summary>More demos</summary>ASMR voice
https://github.com/user-attachments/assets/68eee9a1-6f71-4903-aacd-47397e47e422
Rainy day voice
https://github.com/user-attachments/assets/d25034d9-c77f-43a9-8f14-0d167172b080
Scarlett voice
https://github.com/user-attachments/assets/b12009ee-ec0d-45ce-a1ef-b3a52b9f8693
David Attenborough voice
https://github.com/user-attachments/assets/81c4baad-117e-4db5-ac86-efc2b7fea921
Example
</details>
Minimum requirements
Reverting to older versions
Supported TTS engines: XTTSv2, Bark, Fairseq, VITS, Tacotron2, Tortoise, GlowTTS, YourTTS
Converts multiple file formats: .epub, .mobi, .azw3, .fb2, .lrf, .rb, .snb, .tcr, .pdf, .txt, .rtf, .doc, .docx, .html, .odt, .azw, .tiff, .tif, .png, .jpg, .jpeg, .bmp, .zip
Text area for converting short text directly to audio
OCR scanning for files whose text pages are images
High-quality text-to-speech, from near real-time to near-real voice
Optional voice cloning using your own voice file
Supports 1158 languages (supported languages list)
Low-resource friendly: runs on 2 GB RAM / 1 GB VRAM (minimum)
Audiobook output formats: mono or stereo aac, flac, mp3, m4b, m4a, mp4, mov, ogg, wav, webm
SML tags supported: fine-grained control of breaks, pauses, voice switching and more (see below)
Optional custom model using your own trained model (XTTSv2, VITS, FAIRSEQ, PIPER, others on request)
Fine-tuned preset models trained by the E2A team
<i>(Contact us if you need additional fine-tuned models, or if you want to share your model in the official preset list)</i>
*<i> Modern TTS engines are very slow on CPU, so use a lower-quality TTS such as YourTTS or Tacotron2.</i>
| Arabic (ar) | Chinese (zh) | English (en) | Spanish (es) |
|---|---|---|---|
| French (fr) | German (de) | Italian (it) | Portuguese (pt) |
| Polish (pl) | Turkish (tr) | Russian (ru) | Dutch (nl) |
| Czech (cs) | Japanese (ja) | Hindi (hi) | Bengali (bn) |
| Hungarian (hu) | Korean (ko) | Vietnamese (vi) | Swedish (sv) |
| Persian (fa) | Yoruba (yo) | Swahili (sw) | Indonesian (id) |
| Slovak (sk) | Croatian (hr) | Tamil (ta) | Danish (da) |
.epub, .pdf, .mobi, .txt, .html, .rtf, .chm, .lit,
.pdb, .fb2, .odt, .cbr, .cbz, .prc, .lrf, .pml,
.snb, .cbc, .rb, .tcr
.epub or .mobi
.m4b, .m4a, .mp4, .webm, .mov, .mp3, .flac, .wav, .ogg, .aac
[break] - silence (random range 0.3-0.6 sec.)
[pause] - silence (random range 1.0-1.6 sec.)
[pause:N] - fixed pause (N sec.)
[voice:/path/to/voice/file]...[/voice] - switch voice from the default voice or the voice selected in the GUI/CLI
See our other repo dedicated to adding SML to your e-book automatically -> E2A-SML
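The tag syntax listed above can be illustrated with a small scanner. This is a minimal sketch, not the parser ebook2audiobook actually uses: the regular expression, the function name `find_sml_tags`, and the sample voice path are assumptions made for illustration, and N is assumed to be a plain number.

```python
import re
from typing import Optional

# Illustrative pattern for the four tag forms listed above (an assumption,
# not the project's real grammar). N is assumed to be a plain number.
SML_TAG = re.compile(r"\[(break|pause(?::(\d+(?:\.\d+)?))?|voice:([^\]]+)|/voice)\]")


def find_sml_tags(text: str) -> list[tuple[str, Optional[str]]]:
    """Return (tag, argument) pairs in the order they appear in text."""
    tags = []
    for match in SML_TAG.finditer(text):
        body = match.group(1)
        if body.startswith("pause"):
            tags.append(("pause", match.group(2)))   # None for a bare [pause]
        elif body.startswith("voice:"):
            tags.append(("voice", match.group(3)))   # path to the voice file
        else:
            tags.append((body, None))                # "break" or "/voice"
    return tags


# Hypothetical sample text and voice path.
sample = "Chapter one.[pause:2] [voice:/voices/narrator.wav]Hello.[/voice][break]The end."
for tag, arg in find_sml_tags(sample):
    print(tag, arg)
```

A scan like this can be handy for checking that every `[voice:...]` has a matching `[/voice]` before starting a long conversion.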
[!IMPORTANT] **Before posting an installation or bug issue, carefully search the open and closed issues tab
to make sure your issue does not already exist.**
[!NOTE] **The EPUB format has no standard structure for defining what is a chapter, paragraph, preface, etc.
You should therefore first manually remove any text you do not want to convert to audio.**
Clone the repo
git clone https://github.com/DrewThomasson/ebook2audiobook.git
cd ebook2audiobook
Install / run ebook2audiobook:
Linux/MacOS
./ebook2audiobook.command
<i>Note for MacOS users: homebrew is installed to install missing programs.</i>
Mac launcher
Double-click Mac Ebook2Audiobook Launcher.command
Windows
ebook2audiobook.cmd
Or
Double-click ebook2audiobook.cmd
<i>Note for Windows users: scoop is installed to install missing programs without administrator privileges.</i>
Open the web app: click the URL provided in the terminal to access the web app and convert e-books. http://localhost:7860/
For a public link:
./ebook2audiobook.command --share (Linux/MacOS)
ebook2audiobook.cmd --share (Windows)
python app.py --share (all OS)
[!IMPORTANT] **If the script is stopped and run again, you need to refresh your Gradio GUI interface
so that the web page can reconnect to the new connection socket.**
Linux/MacOS:
./ebook2audiobook.command --headless --ebook <path_to_ebook_file> --voice <path_to_voice_file> --language <language_code>
Windows
ebook2audiobook.cmd --headless --ebook <path_to_ebook_file> --voice <path_to_voice_file> --language <language_code>
[--ebook]: Path to your e-book file
[--voice]: Path to the voice cloning file (optional)
[--language]: Language code in ISO-639-3 (e.g. ita for Italian, eng for English, deu for German...).
The default language is eng, and --language is optional for the default language set in ./lib/lang.py.
2-letter ISO-639-1 codes are also supported.
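To make the two accepted code styles concrete, here is a minimal sketch of how a 2-letter ISO-639-1 code could be normalized to the 3-letter ISO-639-3 form before being passed to --language. The table below is a tiny hand-written subset and the function name `normalize_language` is hypothetical; the project's real language table lives in ./lib/lang.py and may behave differently.

```python
# Hypothetical subset of ISO 639-1 -> ISO 639-3 pairs, for illustration only.
ISO1_TO_ISO3 = {"en": "eng", "it": "ita", "de": "deu", "fr": "fra", "bn": "ben"}


def normalize_language(code: str, default: str = "eng") -> str:
    """Return a 3-letter code for a 2- or 3-letter input; empty input gives the default."""
    code = code.strip().lower()
    if not code:
        return default
    if len(code) == 2:
        if code not in ISO1_TO_ISO3:
            raise ValueError(f"unknown ISO-639-1 code: {code!r}")
        return ISO1_TO_ISO3[code]
    if len(code) == 3:
        return code  # assumed to already be ISO-639-3; not validated here
    raise ValueError(f"expected a 2- or 3-letter code, got {code!r}")


print(normalize_language("it"))
print(normalize_language("DEU"))
print(normalize_language(""))
```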
(It must be a .zip file containing the mandatory model files. Example for XTTSv2: config.json, model.pth, vocab.json and ref.wav)
Linux/MacOS
./ebook2audiobook.command --headless --ebook <ebook_file_path> --language <language> --custom_model <custom_model_path>
Windows
ebook2audiobook.cmd --headless --ebook <ebook_file_path> --language <language> --custom_model <custom_model_path>
<i>Note: the ref.wav of your custom model is always the voice selected for the conversion</i>
<custom_model_path>: path to the model_name.zip file,
which must contain (depending on the TTS engine) all the mandatory files
(see ./lib/models.py).
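Before launching a long conversion it can help to check the archive yourself. The following is a minimal sketch that lists which of the XTTSv2 files named above are missing from a zip. The function name `missing_model_files` is hypothetical, and matching on the file's base name (so files inside a subfolder also count) is an assumption; the authoritative list per engine is in ./lib/models.py.

```python
import io
import zipfile

# File names taken from the XTTSv2 example above; other engines differ.
XTTS_REQUIRED = {"config.json", "model.pth", "vocab.json", "ref.wav"}


def missing_model_files(zip_source, required=XTTS_REQUIRED):
    """Return the sorted names of required files not found in the archive."""
    with zipfile.ZipFile(zip_source) as archive:
        # Compare base names only (assumption: a subfolder is acceptable).
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return sorted(set(required) - present)


# Demo with an in-memory archive so no real model is needed.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as demo:
    demo.writestr("config.json", "{}")
    demo.writestr("model/model.pth", "placeholder")
buffer.seek(0)
print(missing_model_files(buffer))
```

With a real model, pass the path of your model_name.zip instead of the in-memory buffer; an empty list means all four files were found.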
./ebook2audiobook.command --help
ebook2audiobook.cmd --help
python app.py --help <a id="help-command-output"></a>
usage: app.py [-h] [--session SESSION] [--share] [--headless] [--ebook EBOOK] [--ebooks_dir EBOOKS_DIR]
[--language LANGUAGE] [--voice VOICE] [--voice_map VOICE_MAP] [--device {CPU,CUDA,MPS,ROCM,XPU,JETSON}]
[--tts_engine {XTTS,BARK,VITS,FAIRSEQ,TACOTRON,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}]
[--custom_model CUSTOM_MODEL] [--fine_tuned FINE_TUNED] [--output_format OUTPUT_FORMAT]
[--output_channel OUTPUT_CHANNEL] [--temperature TEMPERATURE] [--length_penalty LENGTH_PENALTY]
[--num_beams NUM_BEAMS] [--repetition_penalty REPETITION_PENALTY] [--top_k TOP_K] [--top_p TOP_P]
[--speed SPEED] [--enable_text_splitting] [--text_temp TEXT_TEMP] [--waveform_temp WAVEFORM_TEMP]
[--output_dir OUTPUT_DIR] [--version]
Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the Gradio interface or run the script in headless mode for direct conversion.
options:
-h, --help show this help message and exit
--session SESSION Session to resume the conversion in case of interruption, crash,
or reuse of custom models and custom cloning voices.
**** The following option is for gradio/gui mode only:
--share (Optional) Enable a public shareable Gradio link.
**** The following options are for --headless mode only:
--headless Run the script in headless mode
--ebook EBOOK Path to the ebook file for conversion. Cannot be used when --ebooks_dir is present.
--ebooks_dir EBOOKS_DIR
Relative or absolute path of the directory containing the files to convert.
Cannot be used when --ebook is present.
--text TEXT Raw text for conversion. Cannot be used when --ebook or --ebooks_dir is present.
--language LANGUAGE Language of the e-book. Default language is set
in ./lib/lang.py sed as default if not present. All compatible language codes are in ./lib/lang.py
optional parameters:
--translate ISO3 (Optional) Translate ebook to a target language (ISO 639-3 code, e.g. eng, fra, deu) before TTS synthesis.
Uses argostranslate. The target language becomes the effective TTS language for the run.
A copy of the source ebook is made with the _<iso3> suffix so translated and non-translated
outputs stay isolated (independent process folder, audio chunks, and final file).
--voice VOICE (Optional) Path to the voice cloning file for TTS engine.
Uses the default voice if not present.
--voice_map VOICE_MAP
(Optional, --ebooks_dir only) Path to a JSON file mapping ebook path -> voice path.
Each entry overrides --voice for that specific ebook. Missing/null entries fall back to --voice.
Keys may be absolute paths or basenames. Example:
{"book1.epub": "/voices/eng/adult/female/alice.wav", "/abs/path/book2.epub": null}
--device {CPU,CUDA,MPS,ROCM,XPU,JETSON}
(Optional) Processor unit type for the conversion.
Default is set in ./lib/conf.py if not present. Fall back to CPU if CUDA or MPS is not available.
--tts_engine {XTTS,BARK,VITS,FAIRSEQ,TACOTRON,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}
(Optional) Preferred TTS engine (available are: ['XTTS', 'BARK', 'VITS', 'FAIRSEQ', 'TACOTRON', 'YOURTTS', 'xtts', 'bark', 'vits', 'fairseq', 'tacotron', 'yourtts'].
Default depends on the selected language. The tts engine should be compatible with the chosen language
--custom_model CUSTOM_MODEL
(Optional) Path to the custom model zip file cntaining mandatory model files.
Please refer to ./lib/models.py
--fine_tuned FINE_TUNED
(Optional) Fine tuned model path. Default is builtin model.
--output_format OUTPUT_FORMAT
(Optional) Output audio format. Default is m4b set in ./lib/conf.py
--output_channel OUTPUT_CHANNEL
(Optional) Output audio channel. Default is mono set in ./lib/conf.py
--temperature TEMPERATURE
(xtts only, optional) Temperature for the model.
Default to config.json model. Higher temperatures lead to more creative outputs.
--length_penalty LENGTH_PENALTY
(xtts only, optional) A length penalty applied to the autoregressive decoder.
Default to config.json model. Not applied to custom models.
--num_beams NUM_BEAMS
(xtts only, optional) Controls how many alternative sequences the model explores. Must be equal or greater than length penalty.
Default to config.json model.
--repetition_penalty REPETITION_PENALTY
(xtts only, optional) A penalty that prevents the autoregressive decoder from repeating itself.
Default to config.json model.
--top_k TOP_K (xtts only, optional) Top-k sampling.
Lower values mean more likely outputs and increased audio generation speed.
Default to config.json model.
--top_p TOP_P (xtts only, optional) Top-p sampling.
Lower values mean more likely outputs and increased audio generation speed. Default to config.json model.
--speed SPEED (xtts only, optional) Speed factor for the speech generation.
Default to config.json model.
--enable_text_splitting
(xtts only, optional) Enable TTS text splitting. This option is known to not be very efficient.
Default to config.json model.
--text_temp TEXT_TEMP
(bark only, optional) Text Temperature for the model.
Default to config.json model.
--waveform_temp WAVEFORM_TEMP
(bark only, optional) Waveform Temperature for the model.
Default to config.json model.
--output_dir OUTPUT_DIR
(Optional) Path to the output directory. Default is set in ./lib/conf.py
--version Show the version of the script and exit
Example usage:
Windows:
Gradio/GUI:
ebook2audiobook.cmd
Headless mode:
ebook2audiobook.cmd --headless --ebook '/path/to/file' --language eng
Linux/Mac:
Gradio/GUI:
./ebook2audiobook.command
Headless mode:
./ebook2audiobook.command --headless --ebook '/path/to/file' --language eng
SML tags available:
[break] - silence (random range **0.3-0.6 sec.**)
[pause] - silence (random range **1.0-1.6 sec.**)
[pause:N] - fixed pause (**N sec.**)
[voice:/path/to/voice/file]...[/voice] - switch voice from default or selected voice from GUI/CLI
Note: in gradio/gui mode, to cancel a running conversion, just click the [X] of the e-book upload component. Tip: if you need a slightly longer pause, add '[pause:3]' for 3 seconds, and so on.
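The --voice_map option in the help output above expects a JSON file mapping each e-book to a voice path, with null meaning "fall back to --voice". Below is a minimal sketch of generating such a mapping for a folder. The helper name `build_voice_map` and the convention of keying voices by file stem are assumptions for illustration, and the sketch does not filter by file type.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def build_voice_map(ebooks_dir, voices: dict[str, str]) -> dict[str, Optional[str]]:
    """Map each file's basename in ebooks_dir to a voice path.

    `voices` is keyed by file stem (hypothetical convention). Files without
    an entry get None, which json.dumps writes as null.
    """
    mapping = {}
    for ebook in sorted(Path(ebooks_dir).iterdir()):
        if ebook.is_file():
            mapping[ebook.name] = voices.get(ebook.stem)
    return mapping


# Demo on a temporary folder with two empty placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("book1.epub", "book2.epub"):
        (Path(tmp) / name).touch()
    voice_map = build_voice_map(tmp, {"book1": "/voices/eng/adult/female/alice.wav"})
    print(json.dumps(voice_map, indent=2))
```

The printed JSON can be saved to a file and passed with --voice_map together with --ebooks_dir.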
git clone https://github.com/DrewThomasson/ebook2audiobook.git
cd ebook2audiobook
Windows:
Docker:
ebook2audiobook.cmd --script_mode build_docker
Docker Compose:
ebook2audiobook.cmd --script_mode build_docker --docker_mode compose
Podman Compose:
ebook2audiobook.cmd --script_mode build_docker --docker_mode podman
Linux/Mac
Docker:
./ebook2audiobook.command --script_mode build_docker
Docker Compose:
./ebook2audiobook.command --script_mode build_docker --docker_mode compose
Podman Compose:
./ebook2audiobook.command --script_mode build_docker --docker_mode podman
Docker run image:
Gradio/GUI:
CPU:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" --rm -it -p 7860:7860 athomasson2/ebook2audiobook:cpu
CUDA:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" --gpus all --rm -it -p 7860:7860 athomasson2/ebook2audiobook:cu[118/122/124/126 etc..]
ROCM:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" --device=/dev/kfd --device=/dev/dri --rm -it -p 7860:7860 athomasson2/ebook2audiobook:rocm[6.0/6.1/6.4 etc..]
XPU:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" --device=/dev/dri --rm -it -p 7860:7860 athomasson2/ebook2audiobook:xpu
JETSON:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" --runtime nvidia --rm -it -p 7860:7860 athomasson2/ebook2audiobook:jetson[51/60/61 etc...]
Headless mode:
CPU:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" -v "/my/real/ebooks/folder/absolute/path:/app/another_ebook_folder" --rm -it -p 7860:7860 ebook2audiobook:cpu --headless --ebook "/app/another_ebook_folder/myfile.pdf" [--voice /app/my/voicepath/voice.mp3 etc..]
CUDA:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" -v "/my/real/ebooks/folder/absolute/path:/app/another_ebook_folder" --gpus all --rm -it -p 7860:7860 ebook2audiobook:cu[118/122/124/126 etc..] --headless --ebook "/app/another_ebook_folder/myfile.pdf" [--voice /app/my/voicepath/voice.mp3 etc..]
ROCM:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" -v "/my/real/ebooks/folder/absolute/path:/app/another_ebook_folder" --device=/dev/kfd --device=/dev/dri --rm -it -p 7860:7860 ebook2audiobook:rocm[6.0/6.1/6.4 etc.] --headless --ebook "/app/another_ebook_folder/myfile.pdf" [--voice /app/my/voicepath/voice.mp3 etc..]
XPU:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" -v "/my/real/ebooks/folder/absolute/path:/app/another_ebook_folder" --device=/dev/dri --rm -it -p 7860:7860 ebook2audiobook:xpu --headless --ebook "/app/another_ebook_folder/myfile.pdf" [--voice /app/my/voicepath/voice.mp3 etc..]
JETSON:
docker run -v "./ebooks:/app/ebooks" -v "./audiobooks:/app/audiobooks" -v "./models:/app/models" -v "./voices:/app/voices" -v "./tmp:/app/tmp" -v "/my/real/ebooks/folder/absolute/path:/app/another_ebook_folder" --runtime nvidia --rm -it -p 7860:7860 ebook2audiobook:jetson[51/60/61 etc.] --headless --ebook "/app/another_ebook_folder/myfile.pdf" [--voice /app/my/voicepath/voice.mp3 etc..]
Docker Compose (e.g. CUDA 12.8):
Run Gradio GUI:
DEVICE_TAG=cu128 docker compose --profile gpu up --no-log-prefix
Run Headless mode:
DEVICE_TAG=cu128 docker compose --profile gpu run --rm ebook2audiobook --headless --ebook "/app/ebooks/myfile.pdf" --voice /app/voices/eng/adult/female/some_voice.wav etc..
Podman Compose (e.g. CUDA 12.8):
Run Gradio GUI:
DEVICE_TAG=cu128 podman-compose -f podman-compose.yml --profile gpu up
Run Headless mode:
DEVICE_TAG=cu128 podman-compose -f podman-compose.yml --profile gpu run --rm ebook2audiobook-gpu --headless --ebook "/app/ebooks/myfile.pdf" --voice /app/voices/eng/adult/female/some_voice.wav etc..
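The image tags and the DEVICE_TAG value used above follow a visible pattern (cu128 for CUDA 12.8, rocm6.4 for ROCm 6.4, plain cpu and xpu). As a rough sketch, that pattern could be derived like this. The helper name `device_tag` is hypothetical, the rule is inferred only from the examples in this README, and Jetson tags are left out because their version scheme is not spelled out here.

```python
def device_tag(kind: str, version: str = "") -> str:
    """Build a tag such as 'cu128' or 'rocm6.4' from a device kind and version."""
    kind = kind.lower()
    if kind in ("cpu", "xpu"):
        return kind                               # no version suffix
    if not version:
        raise ValueError(f"{kind} needs a version, e.g. '12.8'")
    if kind == "cuda":
        return "cu" + version.replace(".", "")    # 12.8 -> cu128
    if kind == "rocm":
        return "rocm" + version                   # 6.4 -> rocm6.4
    raise ValueError(f"unsupported device kind: {kind!r}")


print(device_tag("cuda", "12.8"))
print(device_tag("rocm", "6.4"))
print(device_tag("cpu"))
```

Check that the resulting tag actually exists for your setup before using it as DEVICE_TAG.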
For custom XTTSv2 models, a reference audio clip of the voice is mandatory:
You can freely modify lib/conf.py to add or remove the settings you want. If you plan to do so, make a copy of the original conf.py first, so that at each ebook2audiobook update you can back up your modified conf.py and put the original one back. Plan the same process for models.py. If you want your own custom model to become an official fine-tuned ebook2audiobook model, please contact us and we will add it to the preset list.
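The backup step described above can be scripted. This is a minimal sketch using only the standard library; the function name `backup_settings` and the backup folder layout are assumptions, and the demo runs on throwaway directories rather than a real checkout.

```python
import shutil
import tempfile
from pathlib import Path


def backup_settings(lib_dir, backup_dir, names=("conf.py", "models.py")):
    """Copy the named files from lib_dir into backup_dir; return the names copied."""
    target = Path(backup_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for name in names:
        source = Path(lib_dir) / name
        if source.is_file():
            shutil.copy2(source, target / name)  # copy2 also keeps timestamps
            copied.append(name)
    return copied


# Demo on temporary directories so nothing in a real checkout is touched.
with tempfile.TemporaryDirectory() as tmp:
    lib = Path(tmp) / "lib"
    lib.mkdir()
    (lib / "conf.py").write_text("# modified settings\n")
    copied = backup_settings(lib, Path(tmp) / "backup")
    print(copied)
```

Running the same function with the two directories swapped restores the saved files after an update.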
Releases can be found -> here
git checkout tags/VERSION_NUM # Locally/Compose -> Example: git checkout tags/v25.7.7
Add the --help parameter. Example:
import json
from typing import Optional


def get_user(user_id: int, users: list[dict]) -> Optional[dict]:
    """Return the user whose 'id' equals user_id, or None if there is none."""
    for user in users:
        if user['id'] == user_id:
            return user
    return None


def summarize(user: dict) -> str:
    """Describe the user's activity status in one sentence."""
    return f"User {user['name']} is {'active' if user['is_active'] else 'inactive'}."


def to_json(user: dict) -> str:
    """Serialize the id, name and email of a user to a JSON string."""
    return json.dumps({"id": user['id'], "name": user['name'], "email": user['email']})


users: list[dict] = [
    dict(id=1, name="alice", email="[email protected]", role="admin", is_active=True),
    dict(id=2, name="bob", email="[email protected]", role="editor", is_active=False),
    dict(id=3, name="carol", email="[email protected]", role="viewer", is_active=True),
]
config = {
    "max_users": 100,
    "default_role": "viewer",
    "allow_signup": True,
}
roles = ['admin', 'editor', 'viewer']

# Look up user 1 and print a summary, the email and the JSON form.
found = get_user(1, users)
if found:
    print(summarize(found))
    print(found['email'])
    print(to_json(found))

# Print the default role only if it is one of the known roles.
if config['default_role'] in roles:
    print(config['default_role'])
We accept any kind of hardware to test our development, such as:
@DrewThomasson if you would like to help in any way!
<!-- ## Do you need to rent a GPU to boost our service? - A poll is open here https://github.com/DrewThomasson/ebook2audiobook/discussions/889 -->