Differentiable Black-box and Gray-box Modeling of Nonlinear Audio Effects
Marco Comunità, Christian J. Steinmetz, Joshua D. Reiss
TL;DR
This work benchmarks differentiable black-box and gray-box approaches for modeling nonlinear audio effects across a wide range of devices. It introduces time-varying gray-box models and a large dataset, ToneTwist AFx, to enable broad, fair comparisons, complemented by extensive objective and subjective evaluations. The results identify state-space (S4-based) backbones, especially with time-varying conditioning, as the most reliable across effect types, while highlighting challenges in gray-box setups and the need for better evaluation metrics. The findings have practical implications for universal, efficient neural emulation of analog audio gear and can guide future work in differentiable audio DSP.
Abstract
Audio effects are extensively used at every stage of audio and music content creation. Most differentiable audio effects modeling approaches fall into the black-box or gray-box paradigms, and most models have been proposed for and applied to nonlinear effects such as guitar amplifiers, overdrive, distortion, fuzz and compressors. Although a plethora of architectures have been introduced for this task, there is still a lack of understanding of the state of the art, since most publications experiment with one type of nonlinear audio effect and a very small number of devices. In this work we aim to shed light on the audio effects modeling landscape by comparing black-box and gray-box architectures on a large number of nonlinear audio effects, identifying the most suitable for a wide range of devices. In the process, we also: introduce time-varying gray-box models and propose models for compressor, distortion and fuzz; publish a large dataset for audio effects research, ToneTwist AFx (https://github.com/mcomunita/tonetwist-afx-dataset), which is also the first open to community contributions; evaluate models on a variety of metrics; and conduct an extensive subjective evaluation. Code (https://github.com/mcomunita/nablafx) and supplementary material (https://github.com/mcomunita/nnlinafx-supp-material) are also available.
