Sponsors: Recall.ai - Meeting Transcription API
If you’re looking for a transcription API for meetings, consider checking out Recall.ai, an API that works with Zoom, Google Meet, Microsoft Teams, and more.
A Powerful Open Source Video Translation / Audio Transcription / AI Dubbing / Subtitle Translation Tool
中文 | Documentation | Online Q&A
pyVideoTrans is dedicated to seamlessly converting videos from one language to another, offering a complete workflow that includes speech recognition, subtitle translation, multi-role dubbing, and audio-video synchronization. It supports both local offline deployment and a wide variety of mainstream online APIs.
We provide a pre-packaged .exe version for Windows 10/11 users, requiring no Python environment configuration.
- Extract the downloaded archive to a folder (e.g., D:\pyVideoTrans).
- Run sp.exe inside the folder to launch.
Note:
- Do not run directly from within the compressed archive.
- To use GPU acceleration, ensure CUDA 12.8 and cuDNN 9.11 are installed.
We recommend using uv for package management for faster speed and better environment isolation.
- macOS: brew install ffmpeg libsndfile git
- Linux: sudo apt-get install ffmpeg libsndfile1-dev
- Windows: place ffmpeg.exe and ffprobe.exe directly in the project directory.
# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
git clone https://github.com/jianchang512/pyvideotrans.git
cd pyvideotrans
uv sync
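
Before launching, it can help to confirm that the prerequisites listed above are actually reachable. The following is a minimal sketch, not part of pyVideoTrans: the helper names `find_tool` and `missing_tools` are hypothetical, and it assumes the script is run from the project directory. It looks for each executable in the project directory first (where the Windows instructions place ffmpeg.exe and ffprobe.exe) and then on PATH. It does not check for the libsndfile library.

```python
from __future__ import annotations

import shutil
from pathlib import Path


def find_tool(name: str, project_dir: str = ".") -> str | None:
    """Return a path to `name`, checking project_dir first, then PATH."""
    project = Path(project_dir)
    # The Windows instructions place ffmpeg.exe/ffprobe.exe in the project directory.
    for candidate in (project / name, project / f"{name}.exe"):
        if candidate.is_file():
            return str(candidate)
    # Otherwise rely on whatever brew/apt-get (or another installer) put on PATH.
    return shutil.which(name)


def missing_tools(names=("ffmpeg", "ffprobe", "git", "uv"), project_dir: str = ".") -> list[str]:
    """Return the tool names that could not be located."""
    return [name for name in names if find_tool(name, project_dir) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

If the script reports a missing tool, revisit the matching install step for your platform before running uv sync.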
By default, qwen-tts, qwen-asr, moss-tts, and chatterbox are not installed locally.
- To install all optional channels:
uv sync --all-extras
- To install individually:
uv sync --extra qwentts
uv sync --extra qwenasr
uv sync --extra mosstts
uv sync --extra chatterbox
GUI:
uv run sp.py
CLI:
# Video Translation
uv run cli.py --task vtv --name "./video.mp4" --source_language_code zh-cn --target_language_code en --voice_role "en-US-GuyNeural"
# Audio to Subtitle
uv run cli.py --task stt --name "./audio.wav" --model_name large-v3
# Subtitle Translation
uv run cli.py --task sts --name "./subs.srt" --target_language_code en
# Text to Speech
uv run cli.py --task tts --name "./subs.srt" --voice_role "zh-CN-YunyangNeural"
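
The CLI commands above lend themselves to scripting. As a rough sketch (not a feature of pyVideoTrans), the snippet below builds the video translation command for every .mp4 file in a folder. It uses only the flags shown above; the function names `build_vtv_command` and `translate_folder` are hypothetical, and it assumes the script is run from the repository root, where cli.py lives. By default it only prints the commands.

```python
import subprocess
import sys
from pathlib import Path


def build_vtv_command(video, source_lang="zh-cn", target_lang="en",
                      voice_role="en-US-GuyNeural"):
    """Build the argument list for one video translation run."""
    return [
        "uv", "run", "cli.py",
        "--task", "vtv",
        "--name", str(video),
        "--source_language_code", source_lang,
        "--target_language_code", target_lang,
        "--voice_role", voice_role,
    ]


def translate_folder(folder, dry_run=True):
    """Build (and optionally run) one command per .mp4 file in `folder`."""
    commands = [build_vtv_command(p) for p in sorted(Path(folder).glob("*.mp4"))]
    for cmd in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            # check=True stops the batch at the first failing video.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    translate_folder(sys.argv[1] if len(sys.argv) > 1 else ".", dry_run=True)
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. Set `dry_run=False` once the printed commands look right.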
WebUI (for remote/internal network access):
uv sync --extra webui
uv run webui.py
Docker (containerized deployment):
# Build
docker build -t pyvideotrans-webui .
# Run
docker run -d -p 7860:7860 --name pyvideotrans pyvideotrans-webui
# With persistent config and output
docker run -d -p 7860:7860 \
-v ./data/output:/app/output \
-v ./data/config:/app/videotrans \
--name pyvideotrans pyvideotrans-webui
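
After starting the container, the WebUI may take a moment to come up. The sketch below polls the published port until it accepts TCP connections. It assumes the service is reachable on 127.0.0.1:7860, as the port mapping above suggests; the helper names `is_port_open` and `wait_for_port` are hypothetical. An open port shows only that something is listening, not that the WebUI has finished loading.

```python
import socket
import time


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def wait_for_port(host="127.0.0.1", port=7860, attempts=30, delay=1.0):
    """Poll the port up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if is_port_open(host, port):
            return True
        time.sleep(delay)
    return False


if __name__ == "__main__":
    if wait_for_port():
        print("Port 7860 is accepting connections.")
    else:
        print("Port 7860 did not open in time; check the container logs.")
```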
If you have an NVIDIA graphics card, execute the following commands to install the CUDA-supported PyTorch version:
# Uninstall CPU version
uv remove torch torchaudio
# Install CUDA version (example for CUDA 12.8, matching the cu128 index below)
uv add torch==2.7 torchaudio==2.7 --index-url https://download.pytorch.org/whl/cu128
uv add nvidia-cublas-cu12 nvidia-cudnn-cu12
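
To check whether the CUDA build was picked up, a short script along these lines may help. It is a sketch, not part of the project: the function name `cuda_report` is hypothetical, and it assumes it is executed inside the project environment (for example with uv run python). It uses only standard PyTorch calls and returns a plain dictionary, so it also works when PyTorch is missing or CPU-only.

```python
def cuda_report():
    """Summarize what the installed PyTorch build can see."""
    report = {
        "torch_installed": False,
        "cuda_available": False,
        "cuda_version": None,
        "cudnn_version": None,
    }
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return report

    report["torch_installed"] = True
    # True only if a CUDA build is installed and a usable GPU/driver is present.
    report["cuda_available"] = torch.cuda.is_available()
    # CUDA version the wheel was built against; None for CPU-only builds.
    report["cuda_version"] = torch.version.cuda
    if torch.backends.cudnn.is_available():
        report["cudnn_version"] = torch.backends.cudnn.version()
    return report


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

If `cuda_available` is false while `cuda_version` is set, the CUDA wheel is installed but PyTorch cannot use a GPU, which usually points to a driver or hardware issue rather than the package install.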
| Category | Channel/Model | Description |
|---|---|---|
| ASR (Speech Recognition) | Faster-Whisper (Local) | Recommended, fast speed, high accuracy |
| | WhisperX / Parakeet | Supports timestamp alignment & speaker diarization |
| | Alibaba Qwen3-ASR / ByteDance Volcano | Online API, excellent for Chinese |
| Translation (LLM/MT) | DeepSeek / ChatGPT | Supports context understanding, more natural translation |
| | MiniMax AI | MiniMax M3 LLM, latest flagship model, OpenAI-compatible |
| | Google / Microsoft | Traditional machine translation, fast speed |
| | Ollama / M2M100 | Fully local offline translation |
| TTS (Speech Synthesis) | Edge-TTS | Microsoft free interface, natural effect |
| | F5-TTS / CosyVoice | Supports Voice Cloning, requires local deployment |
| | GPT-SoVITS / ChatTTS | High-quality open-source TTS |
| | 302.AI / OpenAI / Azure | High-quality commercial API |
This software is an open-source, free, non-commercial project. Users are solely responsible for any legal consequences arising from the use of this software (including but not limited to calling third-party APIs or processing copyrighted video content). Please comply with local laws and regulations and the terms of use of relevant service providers.
This project mainly relies on the following open-source projects (partial):
Created by jianchang512