Speaker Diarization

When preparing audio transcripts or training a machine learning model to distinguish between speakers, use this template to perform speaker segmentation: annotators mark regions of an audio clip and label each region with the speaker who is talking.

!!! error Enterprise
    If you're managing more complex or high-volume audio labeling projects, Label Studio Enterprise includes an advanced audio transcription interface built to support faster, more precise annotation at scale.

    See our new [Multi-Channel Audio Transcription](react_audio) template and learn more in [A New Audio Transcription UI for Speed and Quality at Scale](https://humansignal.com/blog/building-a-better-ui-for-audio-transcription-at-scale/) (blog post).

Interactive Template Preview

<div id="main-preview"></div>

Labeling Configuration

```html
<View>
  <Labels name="label" toName="audio" zoom="true" hotkey="ctrl+enter">
    <Label value="Speaker one" background="#00FF00"/>
    <Label value="Speaker two" background="#12ad59"/>
  </Labels>
  <Audio name="audio" value="$audio" />
</View>
```

About the labeling configuration

All labeling configurations must be wrapped in View tags.
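
As a rough illustration of that rule, the sketch below parses the configuration with Python's standard `xml.etree.ElementTree` and checks two things: that the root element is `View`, and that every `toName` attribute points at a `name` declared somewhere in the configuration. The `check_config` helper is hypothetical and is not part of Label Studio, which performs its own validation when you save a configuration.

```python
import xml.etree.ElementTree as ET

# The labeling configuration from this template.
CONFIG = """
<View>
  <Labels name="label" toName="audio" zoom="true" hotkey="ctrl+enter">
    <Label value="Speaker one" background="#00FF00"/>
    <Label value="Speaker two" background="#12ad59"/>
  </Labels>
  <Audio name="audio" value="$audio" />
</View>
"""


def check_config(config_xml: str) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if none found)."""
    root = ET.fromstring(config_xml)
    problems = []

    # The whole configuration must be wrapped in a single View element.
    if root.tag != "View":
        problems.append(f"root tag is <{root.tag}>, expected <View>")

    # Collect every name declared by a tag in the configuration.
    names = {el.get("name") for el in root.iter() if el.get("name")}

    # Each toName must refer to one of the declared names.
    for el in root.iter():
        target = el.get("toName")
        if target is not None and target not in names:
            problems.append(f"<{el.tag}> points at unknown tag name {target!r}")

    return problems


print(check_config(CONFIG))  # prints []
```

An empty list here only means these two simple checks passed; it says nothing about whether the attributes themselves are valid for each tag.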

Use the Labels control tag to allow annotators to highlight specific regions of the audio clip and apply a label:

```xml
<Labels name="label" toName="audio" zoom="true" hotkey="ctrl+enter">
    <Label value="Speaker one" background="#00FF00"/>
    <Label value="Speaker two" background="#12ad59"/>
</Labels>
```
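
Each labeled region ends up as one entry in an annotation's `result` list. The sketch below assumes the JSON export shape in which every entry carries `from_name`, `to_name`, and a `value` holding `start` and `end` (in seconds) and a `labels` list. It adds up how long each speaker talks in one task. The sample task, its URL, and the `talk_time` helper are made up for illustration, so check the field names against an export from your own project.

```python
from collections import defaultdict

# Hypothetical exported task, trimmed to the fields used below.
exported_task = {
    "data": {"audio": "https://example.com/call-001.wav"},
    "annotations": [
        {
            "result": [
                {"from_name": "label", "to_name": "audio", "type": "labels",
                 "value": {"start": 0.0, "end": 4.5, "labels": ["Speaker one"]}},
                {"from_name": "label", "to_name": "audio", "type": "labels",
                 "value": {"start": 4.5, "end": 7.0, "labels": ["Speaker two"]}},
                {"from_name": "label", "to_name": "audio", "type": "labels",
                 "value": {"start": 7.0, "end": 9.5, "labels": ["Speaker one"]}},
            ]
        }
    ],
}


def talk_time(task: dict) -> dict[str, float]:
    """Sum region durations per speaker label, in seconds."""
    totals: dict[str, float] = defaultdict(float)
    for annotation in task.get("annotations", []):
        for region in annotation.get("result", []):
            # Only count regions created by the <Labels name="label"> tag.
            if region.get("from_name") != "label":
                continue
            value = region["value"]
            for speaker in value.get("labels", []):
                totals[speaker] += value["end"] - value["start"]
    return dict(totals)


print(talk_time(exported_task))
# prints {'Speaker one': 7.0, 'Speaker two': 2.5}
```

The helper sums across every annotation in the task, so if several annotators labeled the same clip, filter to a single annotation first to avoid double counting.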

Use the Audio object tag to display the audio waveform and let annotators adjust the playback speed:

```xml
<Audio name="audio" value="$audio" />
```
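
The `$audio` variable in `value` refers to the `audio` key of each task's data, so imported tasks need to provide that key. A minimal sketch of building such a task list in Python follows, assuming the JSON import format in which each task wraps its fields in a `data` object. The URLs are placeholders and `build_tasks` is a hypothetical helper.

```python
import json


def build_tasks(audio_urls: list[str]) -> list[dict]:
    """Wrap each audio URL in a task whose data key matches value="$audio"."""
    return [{"data": {"audio": url}} for url in audio_urls]


# Placeholder URLs; replace them with locations your Label Studio instance can reach.
tasks = build_tasks([
    "https://example.com/audio/call-001.wav",
    "https://example.com/audio/call-002.wav",
])

# Write this output to a .json file and import it into the project.
print(json.dumps(tasks, indent=2))
```

If you rename the variable in the configuration, for example to `value="$recording"`, the key in each task's data has to change to match.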