Automatic Equalization for Individual Instrument Tracks Using Convolutional Neural Networks

Florian Mockenhaupt, Joscha Simon Rieber, Shahan Nercessian

TL;DR

This work tackles automatic equalization of individual instrument tracks by predicting parametric-EQ settings that align a track’s spectrum with an instrument-specific target. An instrument-classification module selects the target spectrum, the spectral difference between the recording and that target is computed, and a neural parametric-EQ matcher predicts the equalizer settings. The matcher, a CNN-based extension of an earlier differentiable matching network, is trained in two stages: base training followed by fine-tuning on real-world audio. Fine-tuning reduces mean absolute error by 24% relative to training solely with random parameter sampling, and listening tests indicate subjectively perceptible tonal improvements. The system operates without reference audio at inference and outputs interpretable parametric-EQ settings, which makes it suitable for practical music production workflows and sets the stage for extending the approach to additional processing tools.

Abstract

We propose a novel approach for the automatic equalization of individual musical instrument tracks. Our method begins by identifying the instrument present within a source recording in order to choose its corresponding ideal spectrum as a target. Next, the spectral difference between the recording and the target is calculated, and accordingly, an equalizer matching model is used to predict settings for a parametric equalizer. To this end, we build upon a differentiable parametric equalizer matching neural network, demonstrating improvements relative to the previously established state of the art. Unlike past approaches, we show how our system naturally allows real-world audio data to be leveraged during the training of our matching model, effectively generating suitably produced training targets in an automated manner that mirrors conditions at inference time. Consequently, we illustrate how fine-tuning our matching model on such examples considerably improves parametric equalizer matching performance in real-world scenarios, decreasing mean absolute error by 24% relative to methods relying solely on random parameter sampling techniques as a self-supervised learning strategy. We perform listening tests and demonstrate that our proposed automatic equalization solution subjectively enhances the tonal characteristics of recordings of common instrument types.
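The spectral-difference step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the FFT size, the hop size, the use of a long-term average magnitude spectrum, and the sign convention (target minus input, so positive values mean "boost") are all assumptions made for the example. The target here is derived from the test signal itself, whereas the paper derives instrument-specific targets from its training set.

```python
import numpy as np

def average_spectrum_db(signal, n_fft=2048, hop=512, eps=1e-10):
    """Long-term average magnitude spectrum of a mono signal, in dB.

    Assumption: a Hann-windowed STFT averaged over time; the paper's
    exact spectral representation may differ.
    """
    window = np.hanning(n_fft)
    frames = np.asarray([
        signal[start:start + n_fft] * window
        for start in range(0, len(signal) - n_fft + 1, hop)
    ])
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return 20.0 * np.log10(magnitudes.mean(axis=0) + eps)

def spectral_difference_db(signal, target_db, **stft_kwargs):
    """Per-bin gain in dB that would move the input toward the target.

    Assumed sign convention: target minus input.
    """
    return target_db - average_spectrum_db(signal, **stft_kwargs)

# Hypothetical demo: the "target" is the spectrum of a noise signal, and
# the "recording" is the same signal at half the amplitude.
rng = np.random.default_rng(0)
signal = rng.standard_normal(16000)
target_db = average_spectrum_db(signal)
diff = spectral_difference_db(0.5 * signal, target_db)

# Halving the amplitude lowers every bin by 20*log10(2) dB, so the
# difference curve is flat at roughly 6.02 dB.
print(diff.mean())
```

In a full system, a curve like `diff` would be the input to the equalizer matching model, which would predict parametric EQ settings that approximate it.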

Paper Structure

This paper contains 20 sections, 13 equations, 9 figures, and 1 table.

Figures (9)

  • Figure 1: Proposed automatic EQ system.
  • Figure 2: Examples of instrument-specific target spectra derived from our training set.
  • Figure 3: Spectral difference computation for a snare drum input signal with its corresponding ideal instrument target spectrum.
  • Figure 4: Model architectures for parametric EQ matching: (a) baseline MLP model as in Nercessian (2020) and (b) our enhanced CNN model with added convolutional front end.
  • Figure 5: Base training and fine-tuning stages of neural parametric EQ matching models.
  • ...and 4 more figures
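
As a rough illustration of what the parametric EQ matching in Figure 4 aims at, the sketch below evaluates the magnitude response of a single peaking band and scores it against a desired curve with mean absolute error. Several things here are assumptions rather than details from the paper: the filter uses the widely known Audio EQ Cookbook peaking formulas, the EQ is reduced to one band, and the error is computed over dB magnitude responses (this summary does not state which quantity the reported MAE is measured on). The function names are hypothetical.

```python
import numpy as np

def peaking_response_db(freqs_hz, f0_hz, gain_db, q, fs=44100.0):
    """Magnitude response in dB of one peaking biquad (Audio EQ Cookbook form)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    w0 = 2.0 * np.pi * f0_hz / fs
    alpha = np.sin(w0) / (2.0 * q)
    # Biquad numerator (b) and denominator (a) coefficients.
    b = (1.0 + alpha * a_lin, -2.0 * np.cos(w0), 1.0 - alpha * a_lin)
    a = (1.0 + alpha / a_lin, -2.0 * np.cos(w0), 1.0 - alpha / a_lin)
    # Evaluate H(z) on the unit circle at the requested frequencies.
    z_inv = np.exp(-1j * 2.0 * np.pi * np.asarray(freqs_hz, dtype=float) / fs)
    num = b[0] + b[1] * z_inv + b[2] * z_inv ** 2
    den = a[0] + a[1] * z_inv + a[2] * z_inv ** 2
    return 20.0 * np.log10(np.abs(num / den))

def mean_absolute_error_db(desired_db, achieved_db):
    """Mean absolute error between two dB curves on the same frequency grid."""
    diff = np.asarray(desired_db) - np.asarray(achieved_db)
    return float(np.mean(np.abs(diff)))

# Hypothetical demo: a +6 dB bell at 1 kHz, evaluated at DC, 1 kHz and 10 kHz.
freqs = np.array([0.0, 1000.0, 10000.0])
response = peaking_response_db(freqs, f0_hz=1000.0, gain_db=6.0, q=1.0)

# Error of a flat (0 dB) EQ against that bell, and of the bell against itself.
mae_flat = mean_absolute_error_db(response, np.zeros_like(response))
mae_self = mean_absolute_error_db(response, response)
print(response, mae_flat, mae_self)
```

Because every step from the filter parameters to the dB response is a smooth arithmetic expression, the same computation can be written in an automatic-differentiation framework, which is the general idea behind a differentiable equalizer in a training loop.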