internal/wyoming/README.md
> [!NOTE]
> The format is under development and does not yet work stably.
This module provides Wyoming Protocol support for creating local voice assistants with Home Assistant.

You can use a large number of different projects for WAKE, STT, INTENT and TTS thanks to Home Assistant. And you can use a large number of different technologies for MIC and SND thanks to go2rtc.
You can optionally specify a WAKE service, so go2rtc will start transmitting audio to Home Assistant only after the WAKE word is detected. If the WAKE service is not specified or cannot be connected to, go2rtc will pass all audio to Home Assistant; in this case the WAKE service must be configured in your Voice Assistant pipeline.

You can optionally specify a VAD threshold, so go2rtc will start transmitting audio to the WAKE service only after some audio noise is detected.
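go2rtc's actual VAD math is not documented here, but a threshold gate of this kind can be sketched as a simple RMS level check over 16-bit PCM chunks (the percentage scaling below is a made-up assumption, not go2rtc's formula):

```python
import math
import struct

def rms(chunk: bytes) -> float:
    """Root-mean-square level of 16-bit little-endian PCM samples."""
    samples = struct.unpack(f"<{len(chunk) // 2}h", chunk)
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def is_noise(chunk: bytes, vad_threshold: float) -> bool:
    """Hypothetical gate: open once the level (% of full scale) exceeds the threshold."""
    return rms(chunk) / 32768 * 100 > vad_threshold

silence = bytes(640)                            # 320 zero samples
speech = struct.pack("<320h", *[12000] * 320)   # loud constant tone
```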
Your stream must support audio in a PCM codec (including PCMA/PCMU).
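PCMA/PCMU are the G.711 A-law/mu-law codecs, which expand each 8-bit byte to a 16-bit PCM sample. For reference, mu-law (PCMU) decoding works like this (a textbook sketch, not go2rtc's code):

```python
def mulaw_decode(byte: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit PCM sample."""
    byte = ~byte & 0xFF            # mu-law bytes are stored inverted
    sign = byte & 0x80
    exponent = (byte >> 4) & 0x07
    mantissa = byte & 0x0F
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample

def pcmu_to_pcm(data: bytes) -> list[int]:
    """Decode a PCMU byte stream into 16-bit samples."""
    return [mulaw_decode(b) for b in data]
```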
```yaml
wyoming:
  stream_name_from_streams_section:
    listen: :10700
    name: "My Satellite"                # optional name
    wake_uri: tcp://192.168.1.23:10400  # optional WAKE service
    vad_threshold: 1                    # optional VAD threshold (from 0.1 to 3.5)
```
Home Assistant -> Settings -> Integrations -> Add -> Wyoming Protocol -> Host + Port from go2rtc.yaml
Select one or multiple wake words:

```yaml
wake_uri: tcp://192.168.1.23:10400?name=alexa_v0.1&name=hey_jarvis_v0.1&name=hey_mycroft_v0.1&name=hey_rhasspy_v0.1&name=ok_nabu_v0.1
```
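The repeated `name` parameters are a standard query string; each occurrence selects one wake-word model. A quick Python check of how such a URI breaks down (host and model names are just the values from the example above):

```python
from urllib.parse import urlsplit, parse_qs

wake_uri = ("tcp://192.168.1.23:10400?name=alexa_v0.1"
            "&name=hey_jarvis_v0.1&name=hey_mycroft_v0.1")

parts = urlsplit(wake_uri)
names = parse_qs(parts.query)["name"]  # every `name=` occurrence, in order
```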
You can add Wyoming event handling using the expr language, for example to play TTS on some media player from HA. Turn on the logs to see which events happen.
This is what the default scripts look like:

```yaml
wyoming:
  script_example:
    event:
      run-satellite: Detect()
      pause-satellite: Stop()
      voice-stopped: Pause()
      audio-stop: PlayAudio() && WriteEvent("played") && Detect()
      error: Detect()
      internal-run: WriteEvent("run-pipeline", '{"start_stage":"wake","end_stage":"tts"}') && Stream()
      internal-detection: WriteEvent("run-pipeline", '{"start_stage":"asr","end_stage":"tts"}') && Stream()
```
Supported functions and variables:

- `Detect()` - start the VAD and WAKE word detection process
- `Stream()` - start transmission of audio data to the client (Home Assistant)
- `Stop()` - stop and disconnect the stream without disconnecting the client (Home Assistant)
- `Pause()` - temporarily pause the audio transfer without disconnecting the stream
- `PlayAudio()` - play the last audio that was sent from the client (Home Assistant)
- `WriteEvent(type, data)` - send an event to the client (Home Assistant)
- `Sleep(duration)` - temporary script pause (ex. `Sleep('1.5s')`)
- `PlayFile(path)` - play audio from a WAV file
- `fetch(url, options)` - make an HTTP request (see Example 2 below)
- `Type` - type (name) of the event
- `Data` - event data in JSON format (ex. `{"text":"how are you"}`)

If you write a script for an event, the default action is no longer executed. You need to repeat the necessary steps yourself.
In addition to the standard events, there are two additional events:

- `internal-run` - called after `Detect()` when VAD is detected, but the WAKE service is unavailable
- `internal-detection` - called after `Detect()` when the WAKE word is detected

Example 1. You want to play a sound file when a wake word is detected (only WAV is supported):
The `PlayFile` and `PlayAudio` functions are executed synchronously; the following steps are executed only after they have completed.

```yaml
wyoming:
  script_example:
    event:
      internal-detection: PlayFile('/media/beep.wav') && WriteEvent("run-pipeline", '{"start_stage":"asr","end_stage":"tts"}') && Stream()
```
Example 2. You want to play TTS on a Home Assistant media player:

Each event has a `Type` and `Data` in JSON format. You can use their values in scripts:

- on the `synthesize` step, we get the value of the text and call the HA REST API;
- on the `audio-stop` step, we get the duration of the TTS in seconds, wait for this time, and start the pipeline again.

```yaml
wyoming:
  script_example:
    event:
      synthesize: |
        let text = fromJSON(Data).text;
        let token = 'eyJhbGci...';
        fetch('http://localhost:8123/api/services/tts/speak', {
          method: 'POST',
          headers: {'Authorization': 'Bearer '+token, 'Content-Type': 'application/json'},
          body: toJSON({
            entity_id: 'tts.google_translate_com',
            media_player_entity_id: 'media_player.google_nest',
            message: text,
            language: 'en',
          }),
        }).ok
      audio-stop: |
        let timestamp = fromJSON(Data).timestamp;
        let delay = string(timestamp)+'s';
        Sleep(delay) && WriteEvent("played") && Detect()
```
Satellite on a Windows server using FFmpeg and FFplay:

```yaml
streams:
  satellite_win:
    - exec:ffmpeg -hide_banner -f dshow -i "audio=Microphone (High Definition Audio Device)" -c pcm_s16le -ar 16000 -ac 1 -f wav -
    - exec:ffplay -hide_banner -nodisp -probesize 32 -f s16le -ar 22050 -#backchannel=1#audio=s16le/22050
wyoming:
  satellite_win:
    listen: :10700
    name: "Windows Satellite"
    wake_uri: tcp://192.168.1.23:10400
    vad_threshold: 1
```
Satellite on a Dahua camera with two-way audio support:

```yaml
streams:
  dahua_camera:
    - rtsp://admin:[email protected]/cam/realmonitor?channel=1&subtype=1&unicast=true&proto=Onvif
wyoming:
  dahua_camera:
    listen: :10700
    name: "Dahua Satellite"
    wake_uri: tcp://192.168.1.23:10400
    vad_threshold: 1
```
Satellite with an external Wyoming Microphone and Sound:

```yaml
streams:
  wyoming_external:
    - wyoming://192.168.1.23:10600               # wyoming-mic-external
    - wyoming://192.168.1.23:10601?backchannel=1 # wyoming-snd-external
wyoming:
  wyoming_external:
    listen: :10700
    name: "Wyoming Satellite"
    wake_uri: tcp://192.168.1.23:10400
    vad_threshold: 1
```
Advanced users who want to enjoy the Wyoming Satellite project can use go2rtc as a Wyoming External Microphone or a Wyoming External Sound.
`go2rtc.yaml`:

```yaml
streams:
  wyoming_mic_external:
    - exec:ffmpeg -hide_banner -f dshow -i "audio=Microphone (High Definition Audio Device)" -c pcm_s16le -ar 16000 -ac 1 -f wav -
  wyoming_snd_external:
    - exec:ffplay -hide_banner -nodisp -probesize 32 -f s16le -ar 22050 -#backchannel=1#audio=s16le/22050
wyoming:
  wyoming_mic_external:
    listen: :10600
    mode: mic
  wyoming_snd_external:
    listen: :10601
    mode: snd
```
`docker-compose.yml`:

```yaml
version: "3.8"
services:
  satellite:
    build: wyoming-satellite  # https://github.com/rhasspy/wyoming-satellite
    ports:
      - "10700:10700"
    command:
      - "--name"
      - "my satellite"
      - "--mic-uri"
      - "tcp://192.168.1.23:10600"
      - "--snd-uri"
      - "tcp://192.168.1.23:10601"
      - "--debug"
```
`go2rtc.yaml`:

```yaml
streams:
  wyoming_external:
    - wyoming://192.168.1.23:10600
    - wyoming://192.168.1.23:10601?backchannel=1
```
`docker-compose.yml`:

```yaml
version: "3.8"
services:
  microphone:
    build: wyoming-mic-external  # https://github.com/rhasspy/wyoming-mic-external
    ports:
      - "10600:10600"
    devices:
      - /dev/snd:/dev/snd
    group_add:
      - audio
    command:
      - "--device"
      - "sysdefault"
      - "--debug"
  playback:
    build: wyoming-snd-external  # https://github.com/rhasspy/wyoming-snd-external
    ports:
      - "10601:10601"
    devices:
      - /dev/snd:/dev/snd
    group_add:
      - audio
    command:
      - "--device"
      - "sysdefault"
      - "--debug"
```
Enable trace logging for the wyoming module to see all events:

```yaml
log:
  wyoming: trace
```