# ollama-voice
Example of using the Moonshine Voice library to transcribe speech from your microphone and send it to an Ollama LLM chat interface. As you speak, your words are transcribed in real time; when you finish a segment, it is sent to the selected Ollama model and the streamed reply is printed to the console.
You will need Ollama installed and running (`ollama serve`), with at least one model pulled (e.g. `ollama pull qwen3.5:9b`).

Install the Moonshine Voice Python package (if you haven't already):
```bash
pip install moonshine-voice
```
Install the Ollama Python client:
```bash
pip install ollama
```
Start Ollama (if not already running):
```bash
ollama serve
```
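With the server up, you can optionally sanity-check the Python client before running the example. This is a minimal streamed chat against the default local endpoint (`http://localhost:11434`); the model name is whatever you have pulled:

```python
import ollama

# Stream a one-off reply to confirm the client can reach the local
# Ollama server. Swap in any model you have pulled.
stream = ollama.chat(
    model="qwen3.5:9b",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    stream=True,
)
for chunk in stream:
    print(chunk["message"]["content"], end="", flush=True)
print()
```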
Run the example in another terminal:
```bash
python ollama_voice.py
```
Speak into the microphone. When a segment is finalized, it is sent to the default Ollama model and the response is streamed to the terminal. Press Ctrl+C to stop.
| Option | Default | Description |
|---|---|---|
| `--ollama-model` | `qwen3.5:9b` | Ollama model name for chat responses |
| `--language` | `en` | Language for Moonshine transcription |
| `--moonshine-model-arch` | (auto) | Moonshine model architecture to use |
Examples:
```bash
python ollama_voice.py --ollama-model llama3.2
python ollama_voice.py --language en --ollama-model qwen3.5:9b
```
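For reference, flags like these are typically wired up with `argparse`. The sketch below is hypothetical, not the script's actual parser; in particular, modeling the `(auto)` default as `None` is an assumption:

```python
import argparse

# Hypothetical sketch of the CLI surface described above; the real
# ollama_voice.py may define its flags differently.
parser = argparse.ArgumentParser(
    description="Voice chat with a local Ollama model"
)
parser.add_argument("--ollama-model", default="qwen3.5:9b",
                    help="Ollama model name for chat responses")
parser.add_argument("--language", default="en",
                    help="Language for Moonshine transcription")
parser.add_argument("--moonshine-model-arch", default=None,
                    help="Moonshine model architecture (default: auto-select)")
args = parser.parse_args()
```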
The transcriber (`MicTranscriber`) captures audio from the default microphone and runs on-device speech-to-text. It emits:

- partial updates (`on_line_text_changed`) for the current phrase, redrawn on a single console line with `\r`;
- finalized lines (`on_line_completed`) when a phrase is done.

The example registers a `TranscriptEventListener` for these events. On each finalized segment it sends the text to your local Ollama model and streams the reply to the console (see the sketch below).
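The glue between the two libraries looks roughly like this. The class and callback names (`MicTranscriber`, `TranscriptEventListener`, `on_line_text_changed`, `on_line_completed`) come from the description above, but the import path, constructor arguments, listener registration, and the `line.text` attribute are assumptions; see `ollama_voice.py` for the exact moonshine-voice API:

```python
import ollama
# Assumed import path and signatures; check the script for the real API.
from moonshine_voice import MicTranscriber, TranscriptEventListener

MODEL = "qwen3.5:9b"

class ChatListener(TranscriptEventListener):
    def on_line_text_changed(self, line):
        # Redraw the in-progress phrase on a single console line.
        print(f"\r{line.text}", end="", flush=True)

    def on_line_completed(self, line):
        # Phrase is final: send it to the local Ollama model and
        # stream the reply to the console as it arrives.
        print(f"\rYou: {line.text}")
        for chunk in ollama.chat(
            model=MODEL,
            messages=[{"role": "user", "content": line.text}],
            stream=True,
        ):
            print(chunk["message"]["content"], end="", flush=True)
        print()

transcriber = MicTranscriber()             # assumed default constructor
transcriber.add_listener(ChatListener())   # assumed registration method
transcriber.start()                        # assumed; runs until Ctrl+C
```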
No audio or transcript is sent to Moonshine servers; only the finalized text is sent to your local Ollama instance.