Solid State Bus-Comp: A Large-Scale and Diverse Dataset for Dynamic Range Compressor Virtual Analog Modeling

Yicheng Gu, Runsong Zhang, Lauri Juvela, Zhizheng Wu

TL;DR

Solid State Bus-Comp addresses the lack of large-scale, diverse data for neural VA modeling of dynamic range compressors by introducing a 2528-hour dataset built from 175 unmastered songs processed with 220 hardware-parameter settings. The authors document dataset construction, annotation, and extensive benchmarking across black-box, grey-box, and white-box approaches, plus ablation studies on data quantity and diversity. Key findings show that increasing data quantity and diversity improves generalization, and time-varying FiLM conditioning substantially enhances model performance, although commercial plugins still lag behind state-of-the-art neural methods. The resource provides a public project page with demos, enabling reproducible research and accelerating progress in VA compressor modeling.

Abstract

Virtual Analog (VA) modeling aims to simulate the behavior of hardware circuits via algorithms to replicate their tone digitally. A Dynamic Range Compressor (DRC) is an audio processing module that controls the dynamics of a track by reducing the volume of loud sounds and amplifying that of quiet sounds, which is essential in music production. In recent years, neural-network-based VA modeling has shown great potential in producing high-fidelity models. However, because existing data lack quantity and diversity, the ability of these models to generalize across parameter settings and input sounds is still limited. To tackle this problem, we present Solid State Bus-Comp, the first large-scale and diverse dataset for modeling the classical VCA compressor, the SSL 500 G-Bus. Specifically, we manually collected 175 unmastered songs from the Cambridge Multitrack Library. We recorded the compressed audio under 220 parameter combinations, resulting in an extensive 2528-hour dataset with diverse genres, instruments, tempos, and keys. Moreover, to facilitate the use of our proposed dataset, we conducted benchmark experiments on various open-source black-box and grey-box models, as well as white-box plugins. We also conducted ablation studies on different data subsets to illustrate the benefit of improved data diversity and quantity. The dataset and demos are on our project page: https://www.yichenggu.com/SolidStateBusComp/.
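As a rough sanity check on the figures quoted in the abstract, the sketch below derives the implied amount of input audio from the three stated numbers (2528 hours, 220 parameter combinations, 175 songs). It assumes, which the abstract does not state explicitly, that every song was recorded under every parameter combination; the function name `dataset_breakdown` is hypothetical.

```python
# Figures taken from the abstract.
TOTAL_HOURS = 2528      # total duration of compressed recordings
NUM_SETTINGS = 220      # hardware parameter combinations
NUM_SONGS = 175         # unmastered input songs


def dataset_breakdown(total_hours: float, num_settings: int, num_songs: int):
    """Return (hours of unique input audio, average minutes per song).

    Assumption (not confirmed by the source): each song is processed
    once under each parameter combination.
    """
    input_hours = total_hours / num_settings
    minutes_per_song = input_hours * 60 / num_songs
    return input_hours, minutes_per_song


input_hours, minutes_per_song = dataset_breakdown(
    TOTAL_HOURS, NUM_SETTINGS, NUM_SONGS
)
print(f"Unique input audio: {input_hours:.2f} h")
print(f"Average song length: {minutes_per_song:.2f} min")
```

Under that assumption the numbers are mutually consistent: roughly 11.5 hours of unique input audio, or just under four minutes per song on average.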

Paper Structure

This paper contains 13 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Duration statistics (hours) of the unmastered songs used as input signals in Solid State Bus-Comp by genres and instruments.
  • Figure 2: The annotation pipeline of Solid State Bus-Comp. We utilized various pre-trained models to obtain information on each song's key, tempo, genre, and instrument.
  • Figure 3: Tempo and key statistics (occurrences) of the unmastered songs used as input signals in our proposed Solid State Bus-Comp. Tempo is in beats per minute (BPM). "M" denotes "Major" and "m" denotes "Minor".
  • Figure 4: Comparison of acoustic and semantic diversity in input signals between Solid State Bus-Comp and the existing datasets. The plots are obtained by applying the PCA algorithm to self-supervised learning (SSL) representations. We used MERT to extract acoustic embeddings and w2v-BERT 2.0 to extract semantic embeddings. For existing datasets, the compact cluster represents random noise, the diffuse cluster represents test signals (sine, square, triangle waves, etc.), and the remaining scattered points represent real-world recordings.
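
The projection step described in the Figure 4 caption can be illustrated with a minimal PCA sketch. This is not the authors' code: the embeddings here are random stand-ins rather than MERT or w2v-BERT 2.0 outputs, the 16-dimensional size is arbitrary, and `pca_project` is a hypothetical helper that computes PCA through a singular value decomposition in NumPy.

```python
import numpy as np


def pca_project(embeddings: np.ndarray, n_components: int = 2) -> np.ndarray:
    """Project row-vector embeddings onto their top principal components."""
    # PCA requires mean-centered data.
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value (i.e. decreasing explained variance).
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


rng = np.random.default_rng(0)
# Hypothetical stand-ins for per-clip embeddings: one tight cluster
# and one spread-out cluster, both 16-dimensional.
compact = rng.normal(0.0, 0.1, size=(50, 16))
diffuse = rng.normal(0.0, 2.0, size=(50, 16))

points = pca_project(np.vstack([compact, diffuse]))
print(points.shape)  # (100, 2)
```

Each row of `points` is a 2-D coordinate that could be passed to a scatter plot; a dataset with more diverse inputs would be expected to cover a wider area of that plane.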