docs/changelog/2023-11-19-tts-stt.mdx
LobeHub now supports Text-to-Speech (TTS) and Speech-to-Text (STT), turning typed conversations into natural voice interactions. You can speak with your Agents and hear their responses, making the experience closer to talking with a real person.
With TTS, your Agents can read responses aloud in clear, natural-sounding voices. With STT, you can dictate messages instead of typing. Together, they enable hands-free interaction—useful when you're multitasking, on the move, or simply prefer speaking to typing.
This is especially helpful for:
Different Agents can have different voices. Choose a voice that matches each Agent's personality or purpose. A professional assistant might use a calm, measured tone. A creative collaborator might sound more expressive.
We've curated high-quality voices from OpenAI Audio and Microsoft Edge Speech to serve users across regions and preferences. Choose the voice that best fits how and where you use each Agent.
Voice support narrows the gap between human and AI interaction styles. Speak naturally, hear responses aloud, and maintain context just as you would in a spoken conversation. The rest of LobeHub's features (plugins, multimodal support, context management) work seamlessly alongside voice mode.