docs/whisperfile/gpu.md
GPU acceleration is most beneficial for the medium and large models. The tiny model is already fast on CPU, so the speedup there is minimal.
Pass `--gpu auto` to let whisperfile detect and use the best available GPU on your system. If no supported GPU is found, it silently falls back to CPU:

```sh
whisperfile -m models/ggml-medium.en.bin -f audio.wav --gpu auto
```
You can also target a specific backend:
- `--gpu apple` — Apple Metal (macOS; works on Apple Silicon and AMD GPUs)
- `--gpu nvidia` — NVIDIA CUDA (requires the CUDA Toolkit to be installed)
- `--gpu amd` — AMD ROCm (requires ROCm to be installed on Linux)

To disable GPU acceleration entirely:
```sh
whisperfile -m models/ggml-medium.en.bin -f audio.wav --no-gpu
```
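The flags above can be wired into a small wrapper that picks a backend per platform. A minimal sketch: only the flag names come from the list above; the platform detection and the choice of defaults are assumptions.

```sh
#!/bin/sh
# Pick a whisperfile GPU flag based on the host OS (heuristic sketch).
case "$(uname -s)" in
  Darwin) GPU_FLAG="--gpu apple" ;;   # Metal on macOS
  Linux)  GPU_FLAG="--gpu auto"  ;;   # let whisperfile probe for CUDA/ROCm
  *)      GPU_FLAG="--no-gpu"    ;;   # unknown platform: stay on CPU
esac
echo "selected: $GPU_FLAG"
# whisperfile -m models/ggml-medium.en.bin -f audio.wav $GPU_FLAG
```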
## `ggml_backend_load_best: search path does not exist` warnings
These are benign. They appear when whisperfile searches for GPU backend libraries and doesn't find them — usually because no GPU is present or configured. Transcription will continue on CPU. To suppress them, redirect stderr:
```sh
whisperfile -m models/ggml-medium.en.bin -f audio.wav 2>/dev/null
```
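Note that redirecting all of stderr also hides real errors. To drop only these warnings, you can filter stderr instead; a sketch, assuming every such warning contains the phrase `search path does not exist`:

```sh
# With bash process substitution, filter stderr but keep other messages:
#   whisperfile -m models/ggml-medium.en.bin -f audio.wav \
#     2> >(grep -v 'search path does not exist' >&2)
# Demonstration of the filter on sample stderr lines:
printf '%s\n' \
  'ggml_backend_load_best: /usr/lib search path does not exist' \
  'error: failed to open audio.wav' \
  | grep -v 'search path does not exist'
# prints only: error: failed to open audio.wav
```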