Searching For Music Mixing Graphs: A Pruning Approach

Sungho Lee, Marco A. Martínez-Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Giorgio Fabbro, Kyogu Lee, Yuki Mitsufuji

TL;DR

The paper addresses reverse engineering a music mixing graph from dry tracks and the final mix. It builds a differentiable mixing console $G_\mathrm{c}$ that applies seven processors (equalizer, compressor, noisegate, stereo imager, gain/panning, reverb, and multitap delay) to every track and optimizes it end-to-end to match the target mix. Iterative pruning then yields a sparse, interpretable graph $G_\mathrm{p}$ whose match quality stays within a tolerance $\tau$ of the full console. On three datasets (MedleyDB, MixingSecrets, and Internal), pruning reduces the average processor count by about 69% while keeping the matching loss $L_\mathrm{a}$ within a small margin of the full console's. The procedure also produces a large corpus of graph-audio pairs suitable for training neural networks for music mixing applications. Beyond the reverse-engineering pipeline itself, the paper compares different pruning strategies and discusses how the resulting graphs can inform perceptually faithful mixing and data-driven mixing models. Overall, it treats mixing as a graph-structured, differentiable process and offers a scalable way to collect and use mixing graphs from real-world audio.

Abstract

Music mixing is compositional -- experts combine multiple audio processors to achieve a cohesive mix from dry source tracks. We propose a method to reverse engineer this process from the input and output audio. First, we create a mixing console that applies all available processors to every chain. Then, after the initial console parameter optimization, we alternate between removing redundant processors and fine-tuning. We achieve this through a differentiable implementation of both processors and pruning. Consequently, we find a sparse mixing graph that achieves matching quality nearly identical to that of the full mixing console. We apply this procedure to dry-mix pairs from various datasets and collect graphs that can also be used to train neural networks for music mixing applications.
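
As a rough illustration of the prune-then-check idea in the abstract, the Python sketch below builds a full console and greedily drops processors while a loss stays within a tolerance $\tau$ of the full console's loss. It is not the authors' implementation: the paper's processors and pruning are differentiable and operate on audio, whereas this sketch uses a made-up loss over sets of nodes. The names `build_console`, `prune`, and `toy_loss` are hypothetical, and the fine-tuning step is only a placeholder hook.

```python
from typing import Callable, FrozenSet, Optional, Tuple

# Processor types as listed in the Figure 2 caption.
PROCESSOR_TYPES = (
    "equalizer", "compressor", "noisegate", "stereo_imager",
    "gain_panning", "reverb", "multitap_delay",
)

Node = Tuple[int, str]  # (track index, processor type)
Graph = FrozenSet[Node]


def build_console(num_tracks: int) -> Graph:
    """Full console: every processor type applied to every track."""
    return frozenset((t, p) for t in range(num_tracks) for p in PROCESSOR_TYPES)


def prune(
    console: Graph,
    loss_fn: Callable[[Graph], float],
    tau: float,
    fine_tune: Optional[Callable[[Graph], None]] = None,
) -> Graph:
    """Greedily remove nodes while the loss stays within tau of the full console."""
    budget = loss_fn(console) + tau
    kept = set(console)
    for node in sorted(console):
        trial = frozenset(kept - {node})
        if fine_tune is not None:
            fine_tune(trial)  # placeholder for parameter fine-tuning (assumption)
        if loss_fn(trial) <= budget:
            kept.discard(node)  # removal is accepted: the node was redundant
    return frozenset(kept)


def toy_loss(active: Graph) -> float:
    """Made-up loss: only equalizer and gain/panning on two tracks matter."""
    needed = {(t, p) for t in range(2) for p in ("equalizer", "gain_panning")}
    return 0.5 + 1.0 * len(needed - active)


full = build_console(num_tracks=2)
sparse = prune(full, toy_loss, tau=0.01)
print(len(full), len(sparse))
```

In this toy setting the loop keeps exactly the nodes the loss depends on. With real audio, each trial removal would need a re-evaluation of the match loss, which is why the order in which candidates are tried and the choice of $\tau$ matter.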

Paper Structure

This paper contains 33 sections, 9 equations, 16 figures, 6 tables, 2 algorithms.

Figures (16)

  • Figure 1: Music mixing graph search via iterative pruning.
  • Figure 2: Finding the sparse graph $G_\mathrm{p}$ from the differentiable mixing console $G_\mathrm{c}$. Initial letters in the nodes denote their respective types. i: input, o: output, m: mix, e: equalizer, c: compressor, n: noisegate, s: stereo imager, g: gain/panning, r: reverb, d: multitap delay.
  • Figure 3: Process of iterative pruning (hybrid, $\tau=0.01$). $24$ songs ($8$ songs per dataset) are shown; each color represents an individual song. The upper and lower rows show the pruning ratios and mean dry/wet weights. The yellow-shaded regions show the pruning phase.
  • Figure 4: Each node's weight and loss increase when pruned.
  • Figure 5: Pruning results (hybrid method) with various tolerances. Song: TablaBreakbeatScience_RockSteady.
  • ...and 11 more figures