Open-Amp: Synthetic Data Framework for Audio Effect Foundation Models

Alec Wright, Alistair Carson, Lauri Juvela

TL;DR

Training a guitar effects encoder with Open-Amp achieves new state-of-the-art results on multiple guitar effects classification tasks. A one-to-many guitar effects model trained with Open-Amp can also emulate unseen analog effects via manipulation of its learned latent space, indicating transferability to analog guitar effects data.

Abstract

This paper introduces Open-Amp, a synthetic data framework for generating large-scale and diverse audio effects data. Audio effects are relevant to many musical audio processing and Music Information Retrieval (MIR) tasks, such as modelling of analog audio effects, automatic mixing, tone matching and transcription. Existing audio effects datasets are limited in scope, usually including relatively few audio effects processors and a limited number of input audio signals. Our proposed framework overcomes these issues by crowdsourcing neural network emulations of guitar amplifiers and effects, created by users of open-source audio effects emulation software. This gives users of Open-Amp complete control over the input signals to be processed by the effects models, and provides high-quality emulations of hundreds of devices. Open-Amp can render audio online during training, allowing great flexibility in data augmentation. Our experiments show that using Open-Amp to train a guitar effects encoder achieves new state-of-the-art results on multiple guitar effects classification tasks. Furthermore, we train a one-to-many guitar effects model using Open-Amp, and use it to emulate unseen analog effects via manipulation of its learned latent space, indicating transferability to analog guitar effects data.
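
The idea of rendering audio online during training can be illustrated with a minimal sketch. It is not Open-Amp's API: the names `make_toy_effect`, `EFFECTS` and `render_batch` are hypothetical, the "emulations" are plain tanh waveshapers standing in for the crowdsourced neural network models, and the 44.1 kHz sample rate and gain range are illustrative assumptions. The point is only that the input clip, the augmentation and the effect are all chosen at batch time, so no pre-rendered dataset is needed.

```python
import numpy as np

# Hypothetical stand-ins for crowdsourced neural emulations: simple tanh
# waveshapers with different drive. Real emulations would be neural networks.
GAINS = (1.0, 4.0, 16.0)


def make_toy_effect(gain):
    return lambda x: np.tanh(gain * x)


EFFECTS = [make_toy_effect(g) for g in GAINS]


def render_batch(clean_clips, rng, batch_size=4):
    """Render one training batch on the fly.

    clean_clips: array of shape (num_clips, num_samples), values in [-1, 1].
    Returns (dry, wet, labels), where labels index into EFFECTS.
    """
    clip_ids = rng.integers(0, len(clean_clips), size=batch_size)
    labels = rng.integers(0, len(EFFECTS), size=batch_size)
    # Augment the input before the effect is applied (random input gain).
    input_gain = rng.uniform(0.25, 1.0, size=(batch_size, 1))
    dry = clean_clips[clip_ids] * input_gain
    wet = np.stack([EFFECTS[k](x) for k, x in zip(labels, dry)])
    return dry, wet, labels


rng = np.random.default_rng(0)
t = np.arange(2048) / 44100.0  # illustrative sample rate
clean = np.stack([np.sin(2 * np.pi * f * t) for f in (110.0, 220.0, 440.0)])
dry, wet, labels = render_batch(clean, rng)
```

In a classification setup, `wet` and `labels` would form the training pair; for effect emulation, `dry` and `wet` would serve as input and target.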

Paper Structure

This paper contains 10 sections, 2 figures, 3 tables.

Figures (2)

  • Figure 1: TCN layer for the one-to-many guitar effect emulation task. ${\bf h}_l \in \mathbb{R}^{N \times C \times T}$ is the output of the $l$-th layer, where $N$ is the batch size, $C$ is the channel width and $T$ is the time dimension. $M$ is the number of devices used during training, and the embedding dimension is $E=5$ here for illustrative purposes. This example shows a forward pass conditioned on the 1st device embedding. The learnable embeddings are shared amongst all layers.
  • Figure 2: Enrolment experiment results for unseen devices from the EGFX dataset: a) Proco Rat, b) Boss Blues Driver, and c) Tube Screamer Mini. The grey line shows the test loss of a device-specific one-to-one TCN model.
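
The tensor shapes in the Figure 1 caption can be made concrete with a small NumPy sketch of one conditioned TCN layer. The caption does not state how the device embedding enters the layer, so the FiLM-style per-channel scale and shift used here is an assumption, as are the kernel size, dilation, tanh activation and the names `causal_dilated_conv` and `tcn_layer`. Only the shapes ($N$, $C$, $T$, $M$, $E$) and the single embedding table shared across layers come from the caption.

```python
import numpy as np

rng = np.random.default_rng(0)

N, C, T = 2, 4, 16    # batch size, channel width, time steps (caption symbols)
M, E = 3, 5           # number of training devices, embedding dimension
K, DILATION = 3, 2    # kernel size and dilation: illustrative values only

# One learnable embedding per device, shared by every layer.
embeddings = rng.standard_normal((M, E))

# Per-layer parameters (randomly initialised here; learned in practice).
params = {
    "conv": 0.1 * rng.standard_normal((C, C, K)),
    "w_gamma": 0.1 * rng.standard_normal((E, C)),
    "w_beta": 0.1 * rng.standard_normal((E, C)),
}


def causal_dilated_conv(h, weight, dilation):
    """h: (N, C_in, T), weight: (C_out, C_in, K) -> (N, C_out, T)."""
    n, _, t = h.shape
    c_out, _, k = weight.shape
    pad = (k - 1) * dilation
    padded = np.pad(h, ((0, 0), (0, 0), (pad, 0)))  # left-pad only: causal
    out = np.zeros((n, c_out, t))
    for i in range(k):
        # Tap i looks back (k - 1 - i) * dilation samples.
        seg = padded[:, :, i * dilation : i * dilation + t]
        out += np.einsum("oc,nct->not", weight[:, :, i], seg)
    return out


def tcn_layer(h, device_ids, p):
    """One residual TCN layer with assumed FiLM-style device conditioning."""
    z = causal_dilated_conv(h, p["conv"], DILATION)
    e = embeddings[device_ids]          # (N, E): one embedding per batch item
    gamma = 1.0 + e @ p["w_gamma"]      # (N, C) per-channel scale
    beta = e @ p["w_beta"]              # (N, C) per-channel shift
    z = gamma[:, :, None] * z + beta[:, :, None]
    return h + np.tanh(z)               # residual connection


h = rng.standard_normal((N, C, T))
out = tcn_layer(h, np.array([0, 0]), params)  # condition on the 1st device
```

Because the embedding table lives outside the layer, stacking several such layers with different `params` would still condition all of them on the same device vector, which is the sharing the caption describes.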