Feasibility of iMagLS-BSM -- ILD Informed Binaural Signal Matching with Arbitrary Microphone Arrays
Or Berebi, Zamir Ben-Hur, David Lou Alon, Boaz Rafaely
TL;DR
This work addresses the gap in spatial quality when rendering binaural audio from wearable, arbitrary microphone arrays. It extends the MagLS-based BSM framework with an ILD error term, yielding the iMagLS-BSM objective $\mathbb{E}[\tfrac{1}{2}(\epsilon^{l}_{mls} + \epsilon^{r}_{mls}) + \lambda \epsilon_{ILD}]$, which is minimized with a BFGS optimizer. Objective results show a notable reduction in ILD error (≈3.8 dB on average) while magnitude accuracy is preserved in the 1.5–20 kHz range, especially for frontal directions, suggesting improved spatial cues under wearable constraints. The method holds promise for enhancing AR/VR binaural reproduction but requires broader perceptual validation across varied HRTF sets and array configurations.
Abstract
Binaural reproduction for headphone listening has become a focus of ongoing research, particularly for advancing technologies such as augmented and virtual reality (AR and VR). High-quality spatial audio is essential in these applications to maintain a seamless sense of immersion. However, challenges arise from wearable recording devices, which carry only a limited number of microphones placed irregularly due to design constraints. These factors limit reproduction quality relative to reference signals captured by high-order microphone arrays. This paper introduces a novel optimization loss tailored for a beamforming-based, signal-independent binaural reproduction scheme. The method, named iMagLS-BSM, incorporates an interaural level difference (ILD) error term into the previously proposed binaural signal matching (BSM) magnitude least squares (MagLS) rendering loss for lateral plane angles. Nonlinear programming is used to minimize the introduced loss. Preliminary results show a substantial reduction in ILD error while maintaining a binaural magnitude error comparable to that of a MagLS-BSM solution. These findings hold promise for enhancing the overall spatial quality of the resulting binaural signals.
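To make the structure of the loss concrete, the following is a minimal sketch of a MagLS-plus-ILD objective minimized with BFGS at a single frequency bin. It is not the authors' implementation. The array size, the number of directions, the weight `LAM`, the random steering matrix `V`, the random target HRTFs `h_l` and `h_r`, and the single-bin ILD definition are all illustrative assumptions; the actual method evaluates the ILD term over lateral plane angles and averages over frequency.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
M, Q = 6, 24      # hypothetical: 6 microphones, 24 source directions
LAM = 0.1         # hypothetical weight (lambda) on the ILD error term
EPS = 1e-12       # guards the logarithm against zero magnitudes

# Random stand-ins for measured data at one frequency bin (assumption):
# V holds array steering vectors, h_l / h_r the target left/right HRTFs.
V = rng.standard_normal((M, Q)) + 1j * rng.standard_normal((M, Q))
h_l = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)
h_r = rng.standard_normal(Q) + 1j * rng.standard_normal(Q)


def unpack(x):
    """Split a real vector of length 4M into complex left/right filters."""
    c = x[:2 * M] + 1j * x[2 * M:]
    return c[:M], c[M:]


def reproduce(c):
    """Binaural response per direction for filter c (toy model, no conjugate)."""
    return V.T @ c


def mag_error(c, h):
    """Normalized magnitude least-squares error for one ear."""
    p = reproduce(c)
    return np.sum((np.abs(p) - np.abs(h)) ** 2) / np.sum(np.abs(h) ** 2)


def ild_db(p_l, p_r):
    """Single-bin interaural level difference in dB (simplified)."""
    return 20.0 * np.log10((np.abs(p_l) + EPS) / (np.abs(p_r) + EPS))


def loss(x):
    """0.5 * (left + right magnitude error) + LAM * mean absolute ILD error."""
    c_l, c_r = unpack(x)
    e_mag = 0.5 * (mag_error(c_l, h_l) + mag_error(c_r, h_r))
    ild_hat = ild_db(reproduce(c_l), reproduce(c_r))
    e_ild = np.mean(np.abs(ild_hat - ild_db(h_l, h_r)))
    return e_mag + LAM * e_ild


# Start from the complex least-squares fit for each ear, then refine with BFGS.
c0 = np.concatenate([
    np.linalg.lstsq(V.T, h_l, rcond=None)[0],
    np.linalg.lstsq(V.T, h_r, rcond=None)[0],
])
x0 = np.concatenate([c0.real, c0.imag])
result = minimize(loss, x0, method="BFGS", options={"maxiter": 200})
c_l_opt, c_r_opt = unpack(result.x)
```

Because the complex filters are stacked into one real vector, a general-purpose quasi-Newton routine can be applied directly, with gradients estimated by finite differences. Setting `LAM = 0` reduces the sketch to a magnitude-only objective, which gives a simple baseline for comparing ILD error with and without the added term.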
