
# WebAudio

The WebAudio API is a high-level JavaScript API for processing and synthesizing audio in web applications.

WebAudio uses a modular routing model for processing audio: multiple AudioNodes are connected through their inputs and outputs, and the audio flows through the resulting graph of connected nodes. In this model, nodes can be sources (no inputs and a single output), destinations (one input and no outputs) or filters (one or more inputs and outputs). The simplest example is a single source routed directly to the output.

In WebAudio everything happens within an AudioContext instance, which manages and plays all sounds through its single AudioDestinationNode. The audio can be rendered to hardware or, through an OfflineAudioContext, to a buffer.

The servo-media Rust API for WebAudio deliberately stays close to the actual WebAudio JavaScript API.

```rust
/*
  This is an example of a very basic WebAudio pipeline with an OscillatorNode connected to a GainNode.
  ------------------------------------------------------------
  |  AudioContext                                            |
  |      OscillatorNode -> GainNode -> AudioDestinationNode  |
  ------------------------------------------------------------
  NOTE: Some boilerplate has been removed for simplicity.
  Please visit the examples folder for a more complete version.
*/

// Context creation.
let context =
  servo_media.create_audio_context(&ClientContextId::build(1, 1), Default::default());

// Create and configure nodes.
let osc = context.create_node(
  AudioNodeInit::OscillatorNode(Default::default()),
  Default::default(),
).expect("Failed to create oscillator node");
let mut options = GainNodeOptions::default();
options.gain = 0.5;
let gain = context.create_node(AudioNodeInit::GainNode(options), Default::default())
  .expect("Failed to create gain node");

// Connect nodes.
let dest = context.dest_node();
context.connect_ports(osc.output(0), gain.input(0));
context.connect_ports(gain.output(0), dest.input(0));

// Start playing.
context.message_node(
  osc,
  AudioNodeMessage::AudioScheduledSourceNode(AudioScheduledSourceNodeMessage::Start(0.)),
);
```
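
Scheduled sources can be stopped in the same way. A minimal follow-up sketch, assuming the Stop variant of AudioScheduledSourceNodeMessage mirrors Start (see the examples folder for real usage):

```rust
// Schedule the oscillator to stop playing 3 seconds into the
// context timeline (Stop takes a time in seconds, like Start).
context.message_node(
  osc,
  AudioNodeMessage::AudioScheduledSourceNode(AudioScheduledSourceNodeMessage::Stop(3.)),
);
```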

## Implementation

### Threading model

Following the WebAudio API specification, servo-media implements the concepts of control thread and rendering thread.

The control thread is the thread from which the AudioContext is instantiated and from which authors manipulate the audio graph. In Servo's case, this is the script thread.

The rendering thread is where the magic happens: the actual audio processing. This thread runs an event loop that listens for and handles control messages coming from the control thread and the audio backend, and processes the audio flowing through the audio graph in blocks of 128 sample-frames called render quanta. On each spin of the loop, the AudioRenderThread::process method is called. This method internally runs a DFS traversal of the internal graph, calling the process method of each node. The resulting chunk of audio data is pushed to the audio sink.
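
Conceptually, the per-quantum work looks something like the sketch below. All of the types here are made up for illustration (the real implementation lives in servo-media's audio crate and runs its DFS over a proper graph structure); the sketch instead visits nodes in a precomputed dependency order, which is what a DFS post-order traversal produces:

```rust
/// Number of sample-frames processed per render quantum, per the WebAudio spec.
const QUANTUM_SIZE: usize = 128;

/// A hypothetical, simplified audio node: fills `output` from `inputs`.
trait Node {
    fn process(&mut self, inputs: &[[f32; QUANTUM_SIZE]], output: &mut [f32; QUANTUM_SIZE]);
}

/// A hypothetical graph stored in dependency (topological) order, so each
/// node's inputs are always processed before the node itself.
struct Graph {
    // Each entry is (node, indices of the earlier nodes feeding it).
    nodes: Vec<(Box<dyn Node>, Vec<usize>)>,
}

impl Graph {
    /// Produce one render quantum: visit nodes in dependency order and
    /// return the block computed by the last (destination) node.
    fn process_quantum(&mut self) -> [f32; QUANTUM_SIZE] {
        let mut blocks: Vec<[f32; QUANTUM_SIZE]> = Vec::with_capacity(self.nodes.len());
        for i in 0..self.nodes.len() {
            // Gather the blocks already produced by this node's inputs.
            let inputs: Vec<[f32; QUANTUM_SIZE]> =
                self.nodes[i].1.iter().map(|&src| blocks[src]).collect();
            let mut out = [0.0; QUANTUM_SIZE];
            self.nodes[i].0.process(&inputs, &mut out);
            blocks.push(out);
        }
        *blocks.last().unwrap_or(&[0.0; QUANTUM_SIZE])
    }
}
```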

### Audio Playback

WebAudio allows clients to render the processed audio to hardware or to an audio buffer. To abstract this behavior, servo-media exposes the AudioSink trait.

For offline rendering to an audio buffer, there is an OfflineAudioSink implementation of the AudioSink trait.
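
As an illustration of why a single trait can cover both cases, here is a simplified sketch of such a sink abstraction together with an offline implementation. This is not servo-media's actual trait definition; names and signatures are invented:

```rust
/// A simplified sketch of an audio sink abstraction. This is NOT the
/// actual servo-media AudioSink trait.
trait Sink {
    type Error;

    /// Prepare the sink for a stream with the given sample rate.
    fn init(&mut self, sample_rate: f32) -> Result<(), Self::Error>;

    /// Hand one processed chunk of samples (e.g. a render quantum).
    fn push_data(&mut self, chunk: Vec<f32>) -> Result<(), Self::Error>;

    /// Whether the sink is saturated, i.e. the render thread should hold
    /// off before pushing more data.
    fn has_enough_data(&self) -> bool;
}

/// Offline rendering: simply accumulate every chunk into a buffer.
struct OfflineSink {
    buffer: Vec<f32>,
}

impl Sink for OfflineSink {
    type Error = ();

    fn init(&mut self, _sample_rate: f32) -> Result<(), ()> {
        Ok(())
    }

    fn push_data(&mut self, mut chunk: Vec<f32>) -> Result<(), ()> {
        self.buffer.append(&mut chunk);
        Ok(())
    }

    fn has_enough_data(&self) -> bool {
        // An offline sink never saturates: render as fast as possible.
        false
    }
}
```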

For hardware rendering, the AudioSink trait is expected to be implemented by the media backends. For GStreamer, the implementation creates a simple audio pipeline like the following:
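
In gst-launch notation, the pipeline looks roughly like this (the initial appsrc is discussed below; the downstream elements are an assumption based on typical GStreamer audio playback pipelines):

```
appsrc ! audioconvert ! audioresample ! autoaudiosink
```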

The core piece of the audio sink is the initial appsrc element, which we use to inject the audio chunks into the GStreamer pipeline. We use appsrc in push mode, repeatedly calling its push-buffer method with new buffers. To keep audio playback smooth while continuing to process control events coming from the control thread, we cannot use appsrc's block property, which would block the render thread whenever the appsrc internal queue fills up. Instead, we set the maximum number of bytes that can be queued in the appsrc to 1 and use a combination of get_current_level_bytes and the need-data signal to decide whether to push the next audio buffer or not.
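
A minimal sketch of that push logic, assuming a recent gstreamer-rs (where the level getter is named current_level_bytes; older releases call it get_current_level_bytes) and a hypothetical next_chunk callback that produces the next rendered quantum as raw bytes:

```rust
use gstreamer as gst;
use gstreamer_app as gst_app;

/// Configure an appsrc so that at most one pending buffer is queued, and
/// push a rendered chunk only when the queue is empty. `next_chunk` is a
/// hypothetical callback, not part of servo-media.
fn setup_push_mode(
    appsrc: &gst_app::AppSrc,
    next_chunk: impl Fn() -> Vec<u8> + Send + Sync + 'static,
) {
    // Cap the internal queue at a single byte so it counts as "full"
    // whenever a buffer is pending. The `block` property stays false,
    // otherwise push_buffer could stall the render thread.
    appsrc.set_max_bytes(1);

    // Let the pipeline pull us: need-data fires when the queue drains.
    appsrc.set_callbacks(
        gst_app::AppSrcCallbacks::builder()
            .need_data(move |appsrc, _length| {
                // Only push if nothing is queued; otherwise wait for the
                // next need-data signal.
                if appsrc.current_level_bytes() == 0 {
                    let buffer = gst::Buffer::from_slice(next_chunk());
                    let _ = appsrc.push_buffer(buffer);
                }
            })
            .build(),
    );
}
```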

### Audio Decoding

WebAudio also supports decoding audio data.

servo-media exposes a very simple AudioDecoder trait with a single decode method that takes, among other arguments, the audio data to be decoded and an instance of AudioDecoderCallbacks containing the callbacks for the end-of-stream, error and progress events triggered during the decoding process.

servo-media backends are required to implement this trait.
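
Usage looks roughly like the following sketch, modeled on the decoding example in the examples folder (with decode_audio_data invoked on an AudioContext); exact callback signatures may differ:

```rust
// A sketch only: the builder methods and the decode_audio_data entry
// point are modeled on servo-media's examples.
let callbacks = AudioDecoderCallbacks::new()
    .progress(|decoded_chunk, channel| {
        // Called repeatedly with decoded samples for one channel.
    })
    .eos(|| {
        // Called once all the data has been decoded.
    })
    .error(|error| {
        // Called if decoding fails.
    })
    .build();

// `encoded_bytes` would be the contents of e.g. an Ogg or MP3 file
// read into memory.
context.decode_audio_data(encoded_bytes, callbacks);
```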

The GStreamer backend implementation of the AudioDecoder trait creates a pipeline of this form:
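
Approximately, in gst-launch notation (decodebin is the one element described below; the surrounding elements are an assumption, with appsrc feeding in the encoded bytes and appsink handing the decoded samples back to the callbacks):

```
appsrc ! decodebin ! audioconvert ! audioresample ! appsink
```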

The decodebin element is the core of this pipeline: it auto-magically constructs a decoding chain from the available demuxers and decoders via auto-plugging.
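
One consequence of that auto-plugging is that decodebin only creates its source pads once it has identified the stream, so the rest of the pipeline has to be linked from a pad-added handler. Schematically, assuming a recent gstreamer-rs and a hypothetical downstream convert element:

```rust
use gstreamer as gst;
use gstreamer::prelude::*;

/// Link a (hypothetical) downstream `convert` element to decodebin's
/// dynamically created source pad once typefinding has succeeded.
fn link_when_ready(decodebin: &gst::Element, convert: gst::Element) {
    decodebin.connect_pad_added(move |_dbin, src_pad| {
        let sink_pad = convert
            .static_pad("sink")
            .expect("downstream element has a static sink pad");
        // decodebin may expose several pads (e.g. audio and video);
        // only link an unlinked one.
        if !sink_pad.is_linked() {
            let _ = src_pad.link(&sink_pad);
        }
    });
}
```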