The key feature of M4S is no doubt the magic limiter, which can automagically adjust your track to a specified target loudness. It does so by calculating the overall loudness of the entire track and deriving the gain reduction needed to reach the target. This implies that the loudness has to be recalculated as soon as the user changes a parameter in one of the plugins. Since calculating a track's loudness is quite time-consuming (multiple seconds for a three-minute track on a typical computer), it cannot be done during playback - it has to be done in the background.

A simple architecture has been designed to accomplish this goal.

Basic Architecture

The AudioDeviceManager, which requests audio from us for immediate playback, triggers a routine that fetches new audio samples from the audio source. These samples are then processed by multiple plugins in sequential order (as shown). This is a pretty normal setup for your average audio processing app. DAWs combine this with multi-track mixing and more sophisticated audio signal routing. But for our use case - a simple mastering tool - this is more than enough.

Alongside the online plugin chain sits a synchronised offline chain, which is used for offline processing. One must be careful not to use the same instances as for the online version, since both might and will run at the very same time with wildly different samples, sample rates and buffer sizes.

The main work of the offline processing, however, happens in the optimisers of the plugins. These receive the complete processed audio, analyse it, and set parameters of their online counterpart. To make sure that the offline chain is working on the same audio as the online thread is producing, a simple algorithm to synchronise the plugins' state has been developed.

But to explain exactly how this works, we first have to take a look at the framework used for M4S.

JUCE

JUCE by ROLI is a great framework which provides almost everything needed to build advanced audio applications out of the box. It offers a simple interface to build and integrate audio plugins into an app, including the handling of a plugin's parameters. As this plugin interface is also an abstraction layer for generating VST and AudioUnit plugins, it provides methods to save and restore plugin state. Exactly this feature is used to synchronise the state between the online and offline instances of our plugins.

Synchronisation

Let’s look at some simplified code on how this is done:

void synchronise(AudioProcessor& online, AudioProcessor& offline)
{
  MemoryBlock state;
  online.getStateInformation(state);
  // hand the saved bytes to the offline instance
  offline.setStateInformation(state.getData(), static_cast<int>(state.getSize()));
}

This is done in the offline thread itself, so it does not cost us valuable processing time in the online thread. But since the online instance is being used by the online thread, and possibly by the GUI/message thread as well, we have to make sure that the implementation of getStateInformation is thread-safe.
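One way to meet that requirement is to keep every parameter in a std::atomic and let the state getter do nothing but read those atomics. The following is a minimal sketch under that assumption, written without JUCE so it stands on its own: GainParameters, saveState and restoreState are hypothetical names, the default values are placeholders, and std::vector<char> stands in for juce::MemoryBlock.

```cpp
#include <atomic>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a plugin's parameter set; not part of JUCE or M4S.
struct GainParameters
{
    std::atomic<float> gain { 1.0f };
    std::atomic<float> targetLoudness { -14.0f };  // placeholder default

    // Plays the role of getStateInformation: read each atomic once and copy
    // the values into a plain byte block. No lock is taken, so the audio
    // thread never has to wait for the caller.
    std::vector<char> saveState() const
    {
        const float values[2] = { gain.load(), targetLoudness.load() };
        std::vector<char> block(sizeof(values));
        std::memcpy(block.data(), values, sizeof(values));
        return block;
    }

    // Plays the role of setStateInformation: check the size, then store.
    bool restoreState(const void* data, std::size_t sizeInBytes)
    {
        float values[2];
        if (data == nullptr || sizeInBytes != sizeof(values))
            return false;
        std::memcpy(values, data, sizeof(values));
        gain.store(values[0]);
        targetLoudness.store(values[1]);
        return true;
    }
};
```

Each value is read atomically, but the two reads together are not one consistent snapshot. If parameters depend on each other, a different scheme would be needed.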

The offline instances inherit all methods from their online counterparts and only add a couple of methods specific to the offline optimisations - in fact a simple loudness matcher plugin is as short as this:

class LoudnessMatcherOnline : virtual public juce::AudioProcessor
{
public:
  // a bunch of methods stripped out
  
  // simplified: the real JUCE override also takes a MidiBuffer&
  void processBlock(AudioSampleBuffer& buffer)
  {
    buffer.applyGain(static_cast<float>(gain.load()));
  }
  
  // called by the offline instance to publish its result
  void setGain(double newGain) { gain.store(newGain); }
private:
  std::atomic<double> gain { 1.0 };
};

class LoudnessMatcherOffline : public LoudnessMatcherOnline, virtual public OfflineAudioProcessor
{
public:
  bool optimise(const AudioSampleBuffer& buffer) override
  {
    calculatedLoudness = calculateLoudnessSomehow(buffer);
    return true;
  }
  
  void update(AudioProcessor* live_) override
  {
    // only touch the online instance if it really is a loudness matcher
    if (auto live = dynamic_cast<LoudnessMatcherOnline*>(live_))
      live->setGain(loudnessToGain(calculatedLoudness));
  }
private:
  float calculatedLoudness = 0.0f;
};
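The helper loudnessToGain is not shown above. One plausible form, assuming the loudness is expressed in decibels (for example LUFS), is the standard decibel-to-amplitude conversion. In this sketch the target is passed in explicitly, which differs from the one-argument call in the class; the parameter names are made up for illustration, and calculateLoudnessSomehow stays a placeholder.

```cpp
#include <cmath>

// Hypothetical helper: linear gain that moves a track measured at
// `measuredDb` to `targetDb`. Both values are in decibels (e.g. LUFS).
inline double loudnessToGain(double measuredDb, double targetDb)
{
    // A level change of d dB corresponds to an amplitude factor of 10^(d / 20).
    return std::pow(10.0, (targetDb - measuredDb) / 20.0);
}
```

A track measured at -8 with a target of -14 has to drop by 6 dB, which gives a gain of roughly 0.5. A track that already sits on the target gets a gain of exactly 1.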

The actual offline calculation is done in steps, to provide each plugin the opportunity to optimise parameters.

// get the underlying audio data
const AudioSampleBuffer audioData = getAudioData();

// clone the valid data into the active buffer, used in the future iterations
AudioSampleBuffer activeBuffer = clone(audioData);

// iterate over all plugins, plugins is a list of pairs which are storing the online and offline instances.
for (auto&& plugin : plugins)
{
  // take references: audio processors cannot be copied
  auto& online = plugin.onlineInstance;
  auto& offline = plugin.offlineInstance;
  
  synchronise(online, offline);
  
  // we loop here to provide each plugin unlimited number of attempts to optimise their state
  // in the actual application this for loop has an upper limit.
  for (;;)
  {
    // get a writable buffer to work on in this iteration
    auto workingBuffer = clone(activeBuffer);
    
    // prepare the plugin
    offline.prepareToPlay(sampleRate, samplesPerBlock);
    // process the samples
    offline.processBlock(workingBuffer);
    // perform the optimisation
    bool hasFinishedOptimising = offline.optimise(workingBuffer);
    
    if (hasFinishedOptimising)
    {
      // if we have finished the current working buffer will be the input of the next plugin
      activeBuffer = workingBuffer;
      break;
    }
  }
  
  // since we have optimised all parameters we can update the online version now.
  // up until now the online version has only been "touched" to get the state of the user settable parameters.
  offline.update(&online);
}

The most complicated part of this routine is the handling of the different buffers. Since processBlock of the audio processors mutates the given buffer, we have to clone the last valid version of the buffer and pass the copy to processBlock. This allows each plugin to optimise its state iteratively and re-run itself as often as needed.

Future Improvement Potential

The offline thread keeps processing the entire chain for as long as the track is being played back. It gets retriggered every second (if it is not currently running). Right now it throws every calculated buffer away after each run.
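The "retrigger only if it is not currently running" rule can be expressed with a single atomic flag. The sketch below is one possible shape and not the code used in M4S; OfflineRunGuard and its methods are hypothetical names.

```cpp
#include <atomic>

// Hypothetical guard; the names are illustrative, not taken from M4S.
class OfflineRunGuard
{
public:
    // Returns true if the caller may start a run, false if one is in flight.
    bool tryBegin()
    {
        bool expected = false;
        return running.compare_exchange_strong(expected, true);
    }

    // Must be called by the offline thread when its run has finished.
    void end() { running.store(false); }

private:
    std::atomic<bool> running { false };
};
```

A one-second timer would call tryBegin() and only start the offline run when it returns true. The offline thread calls end() as its last step.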

In the future, the offline run could be improved by listening for audioProcessorChanged from the online version of the audio processors and caching the resulting buffers of processors which have not changed since the last iteration of the chain. This would greatly improve performance when, for example, the plugin chain has five plugins and the user is only tweaking the last one.
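A rough sketch of how such a cache could decide where to resume is shown below. It is a design idea only: CachedStage, firstStageToProcess and inputForStage are hypothetical names, and Buffer is a plain vector standing in for an audio buffer. The changed flag is assumed to be set from an audioProcessorChanged callback and cleared after each offline run.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for an audio buffer type.
using Buffer = std::vector<float>;

// Hypothetical cache entry, one per plugin in the chain.
struct CachedStage
{
    bool changed = true;  // set when the online processor reports a change
    Buffer output;        // output of this stage from the previous run
};

// Index of the first stage that has to be processed again. Every stage
// before it can reuse its cached output; a changed stage invalidates all
// stages after it because their input is different.
inline std::size_t firstStageToProcess(const std::vector<CachedStage>& stages)
{
    for (std::size_t i = 0; i < stages.size(); ++i)
        if (stages[i].changed)
            return i;
    return stages.size();
}

// Input for the stage at `index`: the original audio for the first stage,
// otherwise the cached output of the stage before it.
inline const Buffer& inputForStage(const std::vector<CachedStage>& stages,
                                   const Buffer& source, std::size_t index)
{
    return index == 0 ? source : stages[index - 1].output;
}
```

With five plugins and a change in the last one only, firstStageToProcess returns 4, so the run starts from the cached output of the fourth plugin and processes a single stage instead of five.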