Sponsors: Recall.ai - Meeting Transcription API
If you’re looking for a transcription API for meetings, consider checking out Recall.ai, an API that works with Zoom, Google Meet, Microsoft Teams, and more.
A Powerful Open Source Video Translation / Audio Transcription / AI Dubbing / Subtitle Translation Tool
Chinese | Documentation | Online Q&A
pyVideoTrans is dedicated to seamlessly converting videos from one language to another, offering a complete workflow that includes speech recognition, subtitle translation, multi-role dubbing, and audio-video synchronization. It supports both local offline deployment and a wide variety of mainstream online APIs.
We provide a pre-packaged .exe version for Windows 10/11 users, requiring no Python environment configuration.
D:\pyVideoTrans).
sp.exe inside the folder to launch.

Note:
- Do not run directly from within the compressed archive.
- To use GPU acceleration, ensure CUDA 12.8 and cuDNN 9.11 are installed.
We recommend using uv for package management for faster speed and better environment isolation.
- macOS: `brew install ffmpeg libsndfile git`
- Linux: `sudo apt-get install ffmpeg libsndfile1-dev`
- Windows: place `ffmpeg.exe` and `ffprobe.exe` directly in the project directory.

# macOS/Linux
curl -LsSf https://astral.sh/uv/install.sh | sh
# Windows (PowerShell)
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
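
Before cloning, it can help to confirm that the prerequisite tools mentioned above are actually reachable. The following is a minimal sketch, not part of pyVideoTrans itself: the helper name `find_missing_tools` and the list of tool names are assumptions for illustration, and it relies only on Python's standard `shutil.which`.

```python
import shutil


def find_missing_tools(tools, search_path=None):
    """Return the names in `tools` that cannot be found on the search path.

    `search_path` is passed to shutil.which; None means "use the PATH
    environment variable".
    """
    return [name for name in tools if shutil.which(name, path=search_path) is None]


if __name__ == "__main__":
    # Assumed list: the tools this README asks you to install.
    required = ["ffmpeg", "ffprobe", "git", "uv"]
    missing = find_missing_tools(required)
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

On Windows, where `ffmpeg.exe` and `ffprobe.exe` may sit in the project directory rather than on `PATH`, you could pass that directory as `search_path` instead.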
# 1. Clone the repository (Ensure path has no spaces/Chinese characters)
git clone https://github.com/jianchang512/pyvideotrans.git
cd pyvideotrans
# 2. Install dependencies (uv automatically syncs environment)
uv sync
# If you need local channels for qwen-tts and qwen-asr, please execute `uv sync --extra qwen-tts --extra qwen-asr`
Launch GUI:
uv run sp.py
Use CLI:
# Video Translation Example
uv run cli.py --task vtv --name "./video.mp4" --source_language_code zh --target_language_code en
# Audio to Subtitle Example
uv run cli.py --task stt --name "./audio.wav" --model_name large-v3
If you have an NVIDIA graphics card, execute the following commands to install the CUDA-supported PyTorch version:
# Uninstall CPU version
uv remove torch torchaudio
# Install CUDA version (Example for CUDA 12.x)
uv add torch==2.7 torchaudio==2.7 --index-url https://download.pytorch.org/whl/cu128
uv add nvidia-cublas-cu12 nvidia-cudnn-cu12
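
After swapping in the CUDA build, a quick check can show whether PyTorch actually sees the GPU. The sketch below is an illustration only: `describe_torch_cuda` is a hypothetical helper, and the script would need to run inside the project environment (for example with `uv run python` on a file you save it to).

```python
import importlib.util


def describe_torch_cuda():
    """Summarize whether the installed PyTorch build can use CUDA."""
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "cudnn_version": torch.backends.cudnn.version(),  # None if cuDNN is unavailable
    }


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `cuda_build` is `None`, the CPU-only wheel is probably still installed; if it is set but `cuda_available` is `False`, the driver or CUDA runtime is the more likely culprit.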
| Category | Channel/Model | Description |
|---|---|---|
| ASR (Speech Recognition) | Faster-Whisper (Local) | Recommended, fast speed, high accuracy |
| | WhisperX / Parakeet | Supports timestamp alignment & speaker diarization |
| | Alibaba Qwen3-ASR / ByteDance Volcano | Online API, excellent for Chinese |
| Translation (LLM/MT) | DeepSeek / ChatGPT | Supports context understanding, more natural translation |
| | MiniMax AI | MiniMax M2.7 LLM, latest flagship model, OpenAI-compatible |
| | Google / Microsoft | Traditional machine translation, fast speed |
| | Ollama / M2M100 | Fully local offline translation |
| TTS (Speech Synthesis) | Edge-TTS | Microsoft free interface, natural effect |
| | F5-TTS / CosyVoice | Supports Voice Cloning, requires local deployment |
| | GPT-SoVITS / ChatTTS | High-quality open-source TTS |
| | 302.AI / OpenAI / Azure | High-quality commercial API |
This software is an open-source, free, non-commercial project. Users are solely responsible for any legal consequences arising from the use of this software (including but not limited to calling third-party APIs or processing copyrighted video content). Please comply with local laws and regulations and the terms of use of relevant service providers.
This project mainly relies on the following open-source projects (partial):
Created by jianchang512