Modal smoothing for analysis of room reflections measured with spherical microphone and loudspeaker arrays
Hai Morgenstern, Boaz Rafaely
TL;DR
This work tackles the rank deficiency of cross-spectrum matrices that arises when estimating the DOAs of direct sound and early reflections from RIRs measured with spherical MIMO arrays. It introduces modal smoothing, which averages over the spherical-harmonic (SH) channels of the loudspeaker array to yield the smoothed cross-spectrum matrix $\tilde{\mathbf{S}}_{\mathbf{A}}(\omega)=\frac{1}{(N_L+1)^2}\mathbf{A}(\omega)\mathbf{A}^H(\omega)$, where $(N_L+1)^2$ is the number of loudspeaker-array SH channels. This enables accurate MUSIC-based DOA estimation even when reflections share equal time delays. The approach is analyzed theoretically and validated in simulations, which show that modal smoothing decorrelates equal-delay reflections where traditional frequency smoothing does not, and that combining modal smoothing with frequency smoothing extends its usefulness to systems with low SH orders. The results indicate that the method can improve estimation of early room reflections in practical acoustic scenarios, particularly when the number of array channels is limited. Overall, modal smoothing provides a rank-restoration mechanism for spherical MIMO arrays that strengthens DOA analysis in room acoustics.
Abstract
Spatial analysis of room acoustics is an ongoing research topic. Microphone arrays have been employed for spatial analyses with an important objective being the estimation of the direction-of-arrival (DOA) of direct sound and early room reflections using room impulse responses (RIRs). An optimal method for DOA estimation is the multiple signal classification algorithm. When RIRs are considered, this method typically fails due to the correlation of room reflections, which leads to rank deficiency of the cross-spectrum matrix. Preprocessing methods for rank restoration, which may involve averaging over frequency, for example, have been proposed exclusively for spherical arrays. However, these methods fail in the case of reflections with equal time delays, which may arise in practice and could be of interest. In this paper, a method is proposed for systems that combine a spherical microphone array and a spherical loudspeaker array, referred to as multiple-input multiple-output systems. This method, referred to as modal smoothing, exploits the additional spatial diversity for rank restoration and succeeds where previous methods fail, as demonstrated in a simulation study. Finally, combining modal smoothing with a preprocessing method is proposed in order to increase the number of DOAs that can be estimated using low-order spherical loudspeaker arrays.
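The rank-restoration idea can be illustrated with a minimal numerical sketch. The model below is an assumption made for illustration and is not taken from the paper: the system matrix is written as $\mathbf{A}(\omega)=\sum_k g_k e^{-j\omega\tau_k}\mathbf{y}_k\mathbf{z}_k^H$, random complex vectors stand in for the SH steering vectors of the arrival and departure directions, and the names `system_matrix`, `modal_smoothing`, `frequency_smoothing` and `numerical_rank` are hypothetical helpers. The sketch compares the rank obtained by averaging over frequency for a single loudspeaker channel with the rank obtained by averaging over loudspeaker SH channels at a single frequency.

```python
import numpy as np

rng = np.random.default_rng(0)

N_M, N_L = 3, 2                          # assumed SH orders (microphone, loudspeaker)
Q_M, Q_L = (N_M + 1) ** 2, (N_L + 1) ** 2
K = 3                                    # number of reflections


def crandn(*shape):
    """Complex Gaussian array; a stand-in for real SH steering vectors."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


Y = crandn(Q_M, K)                       # arrival-side vectors (assumption)
Z = crandn(Q_L, K)                       # departure-side vectors (assumption)
g = crandn(K)                            # reflection amplitudes (assumption)


def system_matrix(omega, tau):
    """A(omega) = sum_k g_k exp(-j omega tau_k) y_k z_k^H."""
    weights = g * np.exp(-1j * omega * tau)
    return (Y * weights) @ Z.conj().T    # shape (Q_M, Q_L)


def modal_smoothing(A):
    """Average over loudspeaker SH channels: A A^H / (N_L + 1)^2."""
    return A @ A.conj().T / A.shape[1]


def frequency_smoothing(omegas, tau, col=0):
    """Average x x^H over frequency, using one loudspeaker channel only."""
    S = np.zeros((Q_M, Q_M), dtype=complex)
    for omega in omegas:
        x = system_matrix(omega, tau)[:, col]
        S += np.outer(x, x.conj())
    return S / len(omegas)


def numerical_rank(S, tol=1e-9):
    """Count singular values above tol times the largest one."""
    s = np.linalg.svd(S, compute_uv=False)
    return int(np.sum(s > tol * s[0]))


omegas = 2 * np.pi * np.linspace(1000.0, 2000.0, 20)   # rad/s
tau_equal = np.full(K, 5e-3)                            # all delays equal
tau_distinct = np.array([5e-3, 6e-3, 7.5e-3])           # distinct delays

rank_freq_distinct = numerical_rank(frequency_smoothing(omegas, tau_distinct))
rank_freq_equal = numerical_rank(frequency_smoothing(omegas, tau_equal))
rank_modal_equal = numerical_rank(modal_smoothing(system_matrix(omegas[0], tau_equal)))

print("frequency smoothing, distinct delays:", rank_freq_distinct)
print("frequency smoothing, equal delays:   ", rank_freq_equal)
print("modal smoothing, equal delays:       ", rank_modal_equal)
```

Under this toy model, equal delays reduce every frequency sample to the same vector up to a phase factor, so frequency averaging should stay at rank 1, whereas the modal average should reach rank $K$ as long as $K \le (N_L+1)^2$ and the departure-side vectors are linearly independent. Real SH steering vectors, noise and array errors are left out of the sketch.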
