Fully Reversing the Shoebox Image Source Method: From Impulse Responses to Room Parameters
Tom Sprunck, Antoine Deleforge, Yannick Privat, Cédric Foy
TL;DR
This work investigates the reversibility of the shoebox image source method (ISM) for room impulse responses (RIRs). It introduces an open-source, three-stage algorithm that recovers the 18 input parameters of the forward model from a discrete, low-passed multichannel RIR: the source position, the room dimensions, the 6-DOF room pose, and the six wall absorption coefficients. The method combines gridless image-source localization with room-axes recovery and first-order image-source labeling to infer the complete geometry. Extensive simulations show near-exact recovery for a 32-channel spherical microphone array at 16 kHz, with errors decaying as array size and sampling rate increase. Compared with the Dokmanić EDM baseline, the proposed approach yields substantially more accurate geometry and enables reliable RIR extrapolation, demonstrating, for the first time to our knowledge, that this classical forward model is invertible over a wide range of configurations. Applicability to real data is left for future work and would require extensions to account for angular and frequency dependencies and potential occlusions.
Abstract
We present an algorithm that fully reverses the shoebox image source method (ISM), a widely used room impulse response (RIR) simulator for cuboid rooms introduced by Allen and Berkley in 1979. More precisely, given a discrete multichannel RIR generated by the shoebox ISM for a microphone array of known geometry, the algorithm reliably recovers the 18 input parameters. These are the 3D source position, the 3 dimensions of the room, the 6-degrees-of-freedom room translation and orientation, and an absorption coefficient for each of the 6 room boundaries. The approach builds on a recently proposed gridless image source localization technique combined with new procedures for room axes recovery and first-order-reflection identification. Extensive simulated experiments reveal that near-exact recovery of all parameters is achieved for a 32-element, 8.4-cm-wide spherical microphone array and a sampling rate of 16 kHz using fully randomized input parameters within rooms of size 2×2×2 to 10×10×5 meters. Estimation errors decay towards zero when increasing the array size and sampling rate. The method is also shown to strongly outperform a known baseline, and its ability to extrapolate RIRs at new positions is demonstrated. Crucially, the approach is strictly limited to low-passed discrete RIRs simulated using the vanilla shoebox ISM. Nonetheless, it represents, to our knowledge, the first algorithmic demonstration that this difficult inverse problem is in principle fully solvable over a wide range of configurations.
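To illustrate why first-order image sources carry the room geometry, the following is a minimal sketch under strong simplifying assumptions: the room is axis-aligned with one corner at the origin, the source lies strictly inside it, and the image-source positions are known exactly. It is not the authors' algorithm, which must additionally estimate the image sources from the RIR, recover the room pose, and label the reflections. The function names `first_order_images` and `room_dims_from_images` are hypothetical.

```python
import numpy as np

def first_order_images(source, room_dims):
    """First-order image sources of `source` in an axis-aligned shoebox.

    Assumption: one room corner is at the origin and the walls lie on the
    planes x=0, x=Lx, y=0, y=Ly, z=0, z=Lz.
    """
    source = np.asarray(source, dtype=float)
    room_dims = np.asarray(room_dims, dtype=float)
    images = []
    for axis in range(3):
        for wall in (0.0, room_dims[axis]):
            img = source.copy()
            # Mirror the source across the wall plane along this axis.
            img[axis] = 2.0 * wall - source[axis]
            images.append(img)
    return np.array(images)

def room_dims_from_images(source, images, tol=1e-9):
    """Recover the room dimensions from the source and its 6 first-order images.

    Each wall lies on the plane halfway between the source and the
    corresponding image, so the two midpoints along an axis give the
    positions of the two opposite walls.
    """
    source = np.asarray(source, dtype=float)
    images = np.asarray(images, dtype=float)
    midpoints = 0.5 * (images + source)
    dims = np.empty(3)
    for axis in range(3):
        # Select the two images that were mirrored along this axis.
        moved = np.abs(images[:, axis] - source[axis]) > tol
        walls = midpoints[moved, axis]
        dims[axis] = walls.max() - walls.min()
    return dims

# Usage: forward step, then inversion of the geometry.
src = np.array([1.0, 2.0, 1.5])
dims = np.array([4.0, 5.0, 3.0])
imgs = first_order_images(src, dims)
recovered = room_dims_from_images(src, imgs)
```

In this idealized setting the inversion is exact. The difficulty addressed by the paper lies in everything the sketch assumes away: image sources must be localized from a low-passed discrete RIR, and neither the room orientation nor the reflection order of each image source is known in advance.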
