Modal smoothing for analysis of room reflections measured with spherical microphone and loudspeaker arrays

Hai Morgenstern, Boaz Rafaely

TL;DR

This work addresses rank-deficient cross-spectrum matrices in DOA estimation of direct sound and early reflections from RIRs measured with spherical MIMO arrays. It introduces modal smoothing, which averages over the spherical-harmonic (SH) channels of the loudspeaker array to yield the smoothed cross-spectrum matrix $\tilde{\bm S}_{\bm A}(\omega)=\frac{1}{(N_L+1)^2}\bm A(\omega)\bm A^H(\omega)$, enabling accurate MUSIC-based DOA estimation even when reflections share equal time delays. The approach is analyzed theoretically and validated through simulations, which show that modal smoothing decorrelates equal-delay reflections where traditional frequency smoothing does not, and that combining modal smoothing with frequency smoothing increases the number of DOAs that can be estimated by systems with low SH orders. The results indicate that the method can improve estimation of early room reflections in practical acoustic scenarios, particularly when the number of array channels is limited. Overall, modal smoothing provides a rank-restoration mechanism for spherical MIMO arrays that supports DOA analysis in room acoustics.
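To make the rank-restoration idea concrete, the following is a minimal numerical sketch of the smoothed cross-spectrum formula above. It is not the authors' simulation: the steering matrices `Y_mic` and `Y_ls` are random complex stand-ins for SH steering vectors, the orders `N_M = 3` and `N_L = 2`, the reflection count `K`, the gains `g`, and the delay `tau` are all hypothetical, and $\bm A(\omega)$ is assumed to factor as microphone steering times reflection amplitudes times loudspeaker steering.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 3                      # number of reflections (hypothetical)
N_M, N_L = 3, 2            # SH orders of microphone / loudspeaker arrays (hypothetical)
Q_mic = (N_M + 1) ** 2     # 16 SH channels at the microphone array
Q_ls = (N_L + 1) ** 2      # 9 SH channels at the loudspeaker array


def random_steering(rows, cols):
    """Random complex matrix used as a stand-in for SH steering vectors."""
    return rng.standard_normal((rows, cols)) + 1j * rng.standard_normal((rows, cols))


def numerical_rank(S, rel_tol=1e-9):
    """Count eigenvalues of a Hermitian matrix above rel_tol times the largest."""
    eig = np.linalg.eigvalsh(S)
    return int(np.sum(eig > rel_tol * eig.max()))


Y_mic = random_steering(Q_mic, K)   # one column per reflection (arrival side)
Y_ls = random_steering(Q_ls, K)     # one column per reflection (radiation side)

# All reflections share the same delay tau, so they are fully correlated.
omega = 2 * np.pi * 1000.0
tau = 0.010
g = np.array([1.0, 0.8, 0.6])
s = g * np.exp(-1j * omega * tau)

# Assumed toy model of the MIMO matrix A(omega): Q_mic x Q_ls.
A = Y_mic @ np.diag(s) @ Y_ls.conj().T

# Without smoothing: cross-spectrum from a single loudspeaker SH channel.
a = A[:, 0]
S_single = np.outer(a, a.conj())

# Modal smoothing: average over all (N_L + 1)^2 loudspeaker SH channels.
S_modal = A @ A.conj().T / Q_ls

rank_single = numerical_rank(S_single)
rank_modal = numerical_rank(S_modal)
print(rank_single, rank_modal)
```

In this toy setting the single-channel matrix has rank one, while the modally smoothed matrix recovers rank `K` because the loudspeaker-side steering vectors differ between reflections. With real SH steering vectors the attainable rank is bounded by the number of loudspeaker SH channels, $(N_L+1)^2$.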

Abstract

Spatial analysis of room acoustics is an ongoing research topic. Microphone arrays have been employed for spatial analyses with an important objective being the estimation of the direction-of-arrival (DOA) of direct sound and early room reflections using room impulse responses (RIRs). An optimal method for DOA estimation is the multiple signal classification algorithm. When RIRs are considered, this method typically fails due to the correlation of room reflections, which leads to rank deficiency of the cross-spectrum matrix. Preprocessing methods for rank restoration, which may involve averaging over frequency, for example, have been proposed exclusively for spherical arrays. However, these methods fail in the case of reflections with equal time delays, which may arise in practice and could be of interest. In this paper, a method is proposed for systems that combine a spherical microphone array and a spherical loudspeaker array, referred to as multiple-input multiple-output systems. This method, referred to as modal smoothing, exploits the additional spatial diversity for rank restoration and succeeds where previous methods fail, as demonstrated in a simulation study. Finally, combining modal smoothing with a preprocessing method is proposed in order to increase the number of DOAs that can be estimated using low-order spherical loudspeaker arrays.
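The failure of frequency averaging for equal-delay reflections can be illustrated with a small sketch. This is a simplified model rather than the paper's processing chain: it assumes frequency-independent steering vectors (random complex stand-ins in `Y`), hypothetical gains `g`, a hypothetical 500 to 1500 Hz band, and reflection amplitudes of the form $g_k e^{-j\omega\tau_k}$.

```python
import numpy as np

rng = np.random.default_rng(1)

K, Q = 3, 16   # reflections and microphone SH channels (hypothetical)
Y = rng.standard_normal((Q, K)) + 1j * rng.standard_normal((Q, K))
g = np.array([1.0, 0.8, 0.6])                          # reflection gains (hypothetical)
omegas = 2 * np.pi * np.linspace(500.0, 1500.0, 64)    # averaging band (hypothetical)


def frequency_smoothed_rank(delays, rel_tol=1e-9):
    """Average a(w) a(w)^H over frequency and return the numerical rank."""
    S = np.zeros((Q, Q), dtype=complex)
    for w in omegas:
        s = g * np.exp(-1j * w * delays)   # per-reflection phase from its delay
        a = Y @ s                          # single-channel observation at frequency w
        S += np.outer(a, a.conj())
    S /= len(omegas)
    eig = np.linalg.eigvalsh(S)
    return int(np.sum(eig > rel_tol * eig.max()))


rank_distinct = frequency_smoothed_rank(np.array([0.010, 0.013, 0.017]))
rank_equal = frequency_smoothed_rank(np.array([0.010, 0.010, 0.010]))
print(rank_distinct, rank_equal)
```

With distinct delays the relative phases between reflections rotate across the band, so the average decorrelates them and the rank rises to `K`. With equal delays the relative phases are constant over frequency, the outer product is the same at every frequency, and the averaged matrix stays at rank one.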

Paper Structure (14 sections, 32 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: System diagram in the x-y plane; O and X represent the spherical loudspeaker array and the spherical microphone array, respectively. The solid line represents direct sound and the dashed lines represent reflections (only two reflections are illustrated). $\psi_{0}$ and $\phi_{0}$ are the azimuth angles of the DOR and DOA for the direct sound, respectively. $\psi_{1}$ and $\phi_{1}$ are the corresponding angles for the sound reflected by the wall at the bottom of the figure. In this case, $\beta_{0} = \theta_{0} = \beta_{1} = \theta_{1} = 90^\circ$.
  • Figure 2: (Color online) Simulated RIR of the SH coefficient of order $n=0$ for the spherical microphone array and the spherical loudspeaker array. A Welch window with a 22$\,$ms duration is plotted, starting at $t = 7\,$ms.
  • Figure 3: (Color online) Eigenvalue distribution of $\tilde{\bm S}_{\bm A}(\omega)$ and $\tilde{\bm S}_{\bm a}$ for modal smoothing (MS) and frequency smoothing (FS), respectively.
  • Figure 4: (Color online) MUSIC spectrum for modal smoothing, calculated using $\tilde{\bm S}_{\bm A}(\omega)$. 'X' and 'O' marks indicate the true and estimated DOAs of the reflections, respectively.
  • Figure 5: (Color online) Same as Fig. 4, but for frequency smoothing, calculated using $\tilde{\bm S}_{\bm a}$.
  • ...and 3 more figures