docs/whisper_net_setup.md
Whisper.NET is a speech recognition engine that uses Vulkan acceleration for AMD GPUs (no NVIDIA CUDA required).
Designed for Windows + AMD GPU users.
Background: While using Whisper.cpp, I found that the latest version no longer provides GPU support for AMD cards on Windows, so Windows + AMD GPU users are limited to CPU-only speech recognition, which is very slow. After discussing the options with an AI assistant, I chose the Whisper.NET route and implemented it with AI help.
Tested environment: So far this has only been verified on Windows 11 23H2 with an RX 6650 XT. Other setups are untested, so you may need to verify them yourself. Feedback is welcome.
Managed DLLs (place in deps/ folder):
| File | Version | Download Link |
|---|---|---|
| Whisper.net.dll | 1.9.0 | Download |
| Microsoft.Extensions.AI.Abstractions.dll | 10.0.0 | Download |
| Microsoft.Bcl.AsyncInterfaces.dll | 10.0.0 | Download |
| System.Memory.dll | 4.6.3 | Download |
| System.Buffers.dll | 4.6.1 | Download |
| System.Runtime.CompilerServices.Unsafe.dll | 6.1.2 | Download |
| System.Numerics.Vectors.dll | 4.6.1 | Download |
Native DLLs (place in the `deps/native/` folder):
Download the Whisper.net.Runtime.Vulkan 1.9.0 NuGet package, extract it, and copy every DLL file from its `build/win-x64/` folder into `deps/native/`.
The package contains the following files, all of which are required:
| File | Size | Purpose |
|---|---|---|
| whisper.dll | 473KB | Speech recognition core |
| libwhisper.dll | 473KB | Alias for whisper.dll (required) |
| ggml-whisper.dll | 66KB | Compute library |
| libggml-whisper.dll | 66KB | Alias for ggml-whisper.dll |
| ggml-base-whisper.dll | 528KB | Base library (required dependency) |
| libggml-base-whisper.dll | 528KB | Alias for ggml-base-whisper.dll |
| ggml-cpu-whisper.dll | 590KB | CPU backend |
| libggml-cpu-whisper.dll | 590KB | Alias for ggml-cpu-whisper.dll |
| ggml-vulkan-whisper.dll | 45MB | GPU acceleration (Vulkan) |
| libggml-vulkan-whisper.dll | 45MB | Alias for ggml-vulkan-whisper.dll |
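
Because a `.nupkg` file is an ordinary zip archive, the copy into `deps/native/` can also be scripted. The following is a minimal sketch rather than part of the project: the helper name `extract_dlls` and the package file name are assumptions made for illustration, and only Python's standard `zipfile` module is used.

```python
import zipfile
from pathlib import Path, PurePosixPath


def extract_dlls(nupkg: Path, prefix: str, dest: Path) -> list[str]:
    """Copy every .dll stored directly under `prefix` in the package into `dest`."""
    dest.mkdir(parents=True, exist_ok=True)
    wanted_dir = prefix.rstrip("/")
    copied = []
    with zipfile.ZipFile(nupkg) as pkg:
        for name in pkg.namelist():
            entry = PurePosixPath(name)  # zip entries always use forward slashes
            if str(entry.parent) == wanted_dir and entry.suffix.lower() == ".dll":
                (dest / entry.name).write_bytes(pkg.read(name))
                copied.append(entry.name)
    return sorted(copied)


if __name__ == "__main__":
    # Hypothetical file name: use whatever name the downloaded package has.
    package = Path("whisper.net.runtime.vulkan.1.9.0.nupkg")
    if package.exists():
        print(extract_dlls(package, "build/win-x64", Path("deps/native")))
```

The same helper could be pointed at `lib/netstandard2.0` with `deps` as the destination for a managed package, since both are plain folders inside the archive.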
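How to get the DLLs out of a NuGet package:
1. Download the `.nupkg` file.
2. Change the `.nupkg` file extension to `.zip` and open it with any archive tool.
3. Managed DLLs are in the `lib/netstandard2.0/` folder.
4. Native DLLs are in the `build/win-x64/` folder.

Download `.bin` format model files from ggerganov/whisper.cpp models and place them in the `models/` folder.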
Recommended: ggml-large-v3-turbo.bin (good quality, fast)
Make sure your directory structure looks like this:
pyvideotrans/
├─ models/
│ └─ ggml-large-v3-turbo.bin ← Speech model
└─ deps/
├─ Whisper.net.dll ← Managed DLLs (7 files)
├─ Microsoft.Extensions.AI.Abstractions.dll
├─ Microsoft.Bcl.AsyncInterfaces.dll
├─ System.Memory.dll
├─ System.Buffers.dll
├─ System.Runtime.CompilerServices.Unsafe.dll
├─ System.Numerics.Vectors.dll
└─ native/ ← Copy all DLLs from build/win-x64/ here
├─ whisper.dll
├─ libwhisper.dll
├─ ggml-whisper.dll
├─ libggml-whisper.dll
├─ ggml-base-whisper.dll
├─ libggml-base-whisper.dll
├─ ggml-cpu-whisper.dll
├─ libggml-cpu-whisper.dll
├─ ggml-vulkan-whisper.dll
└─ libggml-vulkan-whisper.dll
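
The layout above can be checked before launching the application. This is a small sketch, not code from the project: the function names `expected_files` and `missing_files` are hypothetical, and the file list is simply transcribed from the tree above (1 model, 7 managed DLLs, 10 native DLLs).

```python
from pathlib import Path

MANAGED_DLLS = [
    "Whisper.net.dll",
    "Microsoft.Extensions.AI.Abstractions.dll",
    "Microsoft.Bcl.AsyncInterfaces.dll",
    "System.Memory.dll",
    "System.Buffers.dll",
    "System.Runtime.CompilerServices.Unsafe.dll",
    "System.Numerics.Vectors.dll",
]

# Each native library ships twice: once as <name>.dll and once as lib<name>.dll.
NATIVE_STEMS = [
    "whisper",
    "ggml-whisper",
    "ggml-base-whisper",
    "ggml-cpu-whisper",
    "ggml-vulkan-whisper",
]


def expected_files(model: str = "ggml-large-v3-turbo.bin") -> list[Path]:
    """Return every path, relative to the project root, that the tree lists."""
    files = [Path("models") / model]
    files += [Path("deps") / name for name in MANAGED_DLLS]
    for stem in NATIVE_STEMS:
        files.append(Path("deps") / "native" / f"{stem}.dll")
        files.append(Path("deps") / "native" / f"lib{stem}.dll")
    return files


def missing_files(root: Path, model: str = "ggml-large-v3-turbo.bin") -> list[Path]:
    """Return the expected paths that do not exist as files under `root`."""
    return [p for p in expected_files(model) if not (root / p).is_file()]


if __name__ == "__main__":
    for path in missing_files(Path(".")):
        print(f"missing: {path}")
```

Run from the `pyvideotrans/` folder, it prints one line per missing file and nothing when the layout is complete.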
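To install and run:
1. Run `uv sync --all-extras`. If the environment is already installed, run `uv sync --extra dotnet` on its own to install the pythonnet module.
2. Run `uv run sp.py` to open the application.

Troubleshooting:
If you see error `0x8007007E` (the specified module could not be found), check that the `deps/native/` folder contains all 10 DLL files.
To check whether your GPU supports Vulkan, open a command line and run: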
vulkaninfo
If it displays your GPU information, Vulkan is supported.
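
The same check can be done from Python. The sketch below is an illustration only: `vulkan_available` is a hypothetical helper, and it assumes that `vulkaninfo` being on `PATH` and exiting with code 0 means Vulkan is usable, which is a reasonable but unverified proxy for reading the output yourself.

```python
import shutil
import subprocess


def vulkan_available(tool: str = "vulkaninfo", timeout: float = 30.0) -> bool:
    """Return True if `tool` is on PATH and exits successfully."""
    exe = shutil.which(tool)
    if exe is None:
        # Tool not installed or not on PATH.
        return False
    try:
        result = subprocess.run([exe], capture_output=True, timeout=timeout)
    except (OSError, subprocess.TimeoutExpired):
        return False
    # Assumption: a zero exit code indicates a working Vulkan setup.
    return result.returncode == 0


if __name__ == "__main__":
    print("Vulkan detected" if vulkan_available() else "Vulkan not detected")
```

A `False` result only means the tool was not found or failed to run; the GPU driver may still need to be checked by hand.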