Binaspect -- A Python Library for Binaural Audio Analysis, Visualization & Feature Generation
Dan Barry, Davoud Shariat Panah, Alessandro Ragano, Jan Skoglund, Andrew Hines
TL;DR
Binaspect addresses the need for interpretable binaural analysis tools by proposing four interconnected representations: the bounded ILR spectrogram, ITD spectrogram, bounded ILR histogram, and ITD histogram. The approach enables blind, head-model-free observation of binaural cues, with degradations from rendering, compression, and down-mixing manifesting as broadened or shifted clusters in the histograms. The library provides both human-friendly histogram visualizations for inspection and exportable features suitable for machine learning workflows, supporting tasks like quality assessment and spatial localization. By making these representations open-source and reproducible, Binaspect offers a practical framework to diagnose binaural cue degradations and to guide the design of binaural rendering and processing pipelines. The work emphasizes interpretability, complements existing auditory modeling resources, and highlights future directions for expanded feature sets and multi-source handling.
Abstract
We present Binaspect, an open-source Python library for binaural audio analysis, visualization, and feature generation. Binaspect generates interpretable "azimuth maps" by calculating modified interaural time and level difference spectrograms, and clustering those time-frequency (TF) bins into stable time-azimuth histogram representations. This allows multiple active sources to appear as distinct azimuthal clusters, while degradations manifest as broadened, diffused, or shifted distributions. Crucially, Binaspect operates blindly on audio, requiring no prior knowledge of head models. These visualizations enable researchers and engineers to observe how binaural cues are degraded by codec and renderer design choices, among other downstream processes. We demonstrate the tool on bitrate ladders, ambisonic rendering, and VBAP source positioning, where degradations are clearly revealed. In addition to their diagnostic value, the proposed representations can be exported as structured features suitable for training machine learning models in quality prediction, spatial audio classification, and other binaural tasks. Binaspect is released under an open-source license with full reproducibility scripts at https://github.com/QxLabIreland/Binaspect.
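To make the pipeline described above more concrete, the following is a minimal NumPy sketch of how per-bin interaural cues could be computed from a stereo signal and collapsed into a time-versus-cue histogram. It is not Binaspect's API or its exact formulation: the function names (`stft`, `binaural_cues`, `cue_histogram`), the bounded level cue (R - L)/(R + L), the phase-based ITD estimate, and the energy weighting are all assumptions chosen for illustration. The library's modified spectrograms and clustering step may differ.

```python
import numpy as np


def stft(x, n_fft=1024, hop=256):
    """Hann-windowed STFT, returned with shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def binaural_cues(left, right, sr, n_fft=1024, hop=256, eps=1e-12):
    """Per-bin level and time cues (hypothetical definitions, not the library's)."""
    spec_l, spec_r = stft(left, n_fft, hop), stft(right, n_fft, hop)
    mag_l, mag_r = np.abs(spec_l), np.abs(spec_r)

    # Assumed bounded level cue in [-1, 1]: -1 is left only, +1 is right only.
    level = (mag_r - mag_l) / (mag_r + mag_l + eps)

    # ITD from the interaural phase difference; positive when the right channel lags.
    # Only unambiguous while the phase difference stays within (-pi, pi).
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    ipd = np.angle(spec_l * np.conj(spec_r))
    itd = np.zeros_like(ipd)
    itd[:, 1:] = ipd[:, 1:] / (2.0 * np.pi * freqs[1:])  # skip the DC bin

    return level, itd, mag_l + mag_r


def cue_histogram(cue, weights, bins=41, value_range=(-1.0, 1.0)):
    """Energy-weighted histogram of a cue per frame, shape (frames, bins)."""
    return np.stack([
        np.histogram(c, bins=bins, range=value_range, weights=w)[0]
        for c, w in zip(cue, weights)
    ])


# Synthetic example: white noise panned toward the right by level only.
rng = np.random.default_rng(0)
sr = 16000
noise = rng.standard_normal(sr)
left, right = 0.5 * noise, noise

level, itd, energy = binaural_cues(left, right, sr)
hist = cue_histogram(level, energy)

edges = np.linspace(-1.0, 1.0, hist.shape[1] + 1)
centers = 0.5 * (edges[:-1] + edges[1:])
peak = centers[np.argmax(hist.sum(axis=0))]
print(f"dominant level cue: {peak:.2f}")  # expected near 1/3 for gains 0.5 and 1.0
```

In this sketch a single level-panned source collapses into one narrow histogram column. Under the same assumptions, a process that perturbs the interaural cues would spread or move that column, which is the kind of broadening and shifting the abstract describes.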
