Steered Response Power-Based Direction-of-Arrival Estimation Exploiting an Auxiliary Microphone
Klaus Brümann, Simon Doclo
TL;DR
This work addresses DOA estimation with a compact microphone array (CMA) when noise and reverberation are highly coherent across the closely spaced microphones. It assumes an auxiliary microphone at an unknown position, spatially separated from the CMA, and computes the SRP-PHAT spectra between the CMA microphones from the auxiliary-to-CMA SRP-PHAT spectra via a product relation, aiming for greater robustness than conventional SRP-PHAT. The authors analyze the resulting distortion in terms of $D_{ij}(\omega)$, $D_{ij}^{A}(\omega)$ and SUR, and validate the approach on simulated microphone signals for several auxiliary microphone positions and two noise and reverberation conditions, showing improved DOA accuracy for many auxiliary placements, especially as the separation increases. The method offers a practical way to improve DOA estimation in challenging acoustic environments when an extra microphone is available.
Abstract
Accurately estimating the direction-of-arrival (DOA) of a speech source using a compact microphone array (CMA) is often complicated by background noise and reverberation. A commonly used DOA estimation method is the steered response power with phase transform (SRP-PHAT) function, which has been shown to work reliably in moderate levels of noise and reverberation. Since for closely spaced microphones the spatial coherence of noise and reverberation may be high over an extended frequency range, this may negatively affect the SRP-PHAT spectra, resulting in DOA estimation errors. Assuming the availability of an auxiliary microphone at an unknown position which is spatially separated from the CMA, in this paper we propose to compute the SRP-PHAT spectra between the microphones of the CMA based on the SRP-PHAT spectra between the auxiliary microphone and the microphones of the CMA. For different levels of noise and reverberation, we show how far the auxiliary microphone needs to be spatially separated from the CMA for the auxiliary microphone-based SRP-PHAT spectra to be more reliable than the SRP-PHAT spectra without the auxiliary microphone. These findings are validated based on simulated microphone signals for several auxiliary microphone positions and two different noise and reverberation conditions.
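To make the idea concrete, the following is a minimal Python sketch of conventional SRP-PHAT next to an auxiliary-microphone-based variant. It is not the authors' implementation. It assumes a far-field source, free-field propagation without reverberation, spatially uncorrelated noise, and one plausible reading of the product relation: the PHAT-weighted spectrum for CMA pair (i, j) is taken as the product of the PHAT-weighted spectrum between microphone i and the auxiliary microphone and the conjugate of the one between microphone j and the auxiliary microphone. The array geometry, frequency range, and all function names (`phat`, `srp_phat`, `simulate`, `estimate_doas`) are hypothetical choices made for illustration.

```python
import numpy as np

C = 343.0  # speed of sound in m/s

def phat(cross_psd, eps=1e-12):
    """Phase transform: normalize a cross-PSD to unit magnitude."""
    return cross_psd / (np.abs(cross_psd) + eps)

def srp_phat(phat_spectra, pairs, mic_pos, freqs, angles):
    """Sum steered PHAT-weighted cross-spectra over pairs and frequencies."""
    omega = 2 * np.pi * freqs
    dirs = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # (A, 2)
    power = np.zeros(len(angles))
    for i, j in pairs:
        # Far-field TDOA of mic i relative to mic j for each candidate angle
        tau = -(dirs @ (mic_pos[i] - mic_pos[j])) / C  # (A,)
        steer = np.exp(1j * np.outer(tau, omega))  # (A, F)
        power += np.real(steer * phat_spectra[(i, j)][None, :]).sum(axis=1)
    return power

def simulate(doa_deg=60.0, n_frames=50, noise_std=0.1, seed=0):
    """Toy STFT-domain signals: anechoic source plus uncorrelated noise."""
    rng = np.random.default_rng(seed)
    freqs = np.linspace(200.0, 4000.0, 64)
    omega = 2 * np.pi * freqs
    # Hypothetical CMA: three microphones on a circle of radius 5 cm
    phi = np.deg2rad([0.0, 120.0, 240.0])
    mic_pos = 0.05 * np.stack([np.cos(phi), np.sin(phi)], axis=1)
    doa = np.deg2rad(doa_deg)
    direction = np.array([np.cos(doa), np.sin(doa)])
    tau = -(mic_pos @ direction) / C  # per-microphone delays
    tau_aux = 0.004  # arbitrary auxiliary delay; the estimator never uses it
    shape = (n_frames, len(freqs))
    S = rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

    def noise():
        return noise_std * (rng.standard_normal(shape)
                            + 1j * rng.standard_normal(shape))

    Y = [S * np.exp(-1j * omega * t) + noise() for t in tau]
    Y_aux = S * np.exp(-1j * omega * tau_aux) + noise()
    return freqs, mic_pos, Y, Y_aux

def estimate_doas(freqs, mic_pos, Y, Y_aux):
    """Return (conventional, auxiliary-based) DOA estimates in degrees."""
    angles = np.deg2rad(np.arange(360))
    pairs = [(0, 1), (0, 2), (1, 2)]

    def cpsd(a, b):
        # Cross-PSD estimate: average over time frames
        return np.mean(a * np.conj(b), axis=0)

    conventional = {(i, j): phat(cpsd(Y[i], Y[j])) for i, j in pairs}
    # Assumed product relation: only auxiliary-to-CMA spectra are used
    aux_based = {
        (i, j): phat(cpsd(Y[i], Y_aux)) * np.conj(phat(cpsd(Y[j], Y_aux)))
        for i, j in pairs
    }
    p_conv = srp_phat(conventional, pairs, mic_pos, freqs, angles)
    p_aux = srp_phat(aux_based, pairs, mic_pos, freqs, angles)
    return (float(np.rad2deg(angles[np.argmax(p_conv)])),
            float(np.rad2deg(angles[np.argmax(p_aux)])))

freqs, mic_pos, Y, Y_aux = simulate()
doa_conv, doa_aux = estimate_doas(freqs, mic_pos, Y, Y_aux)
print(doa_conv, doa_aux)
```

Note that the auxiliary microphone's position never enters the estimator: its unknown delay cancels in the product, which is consistent with the abstract's assumption of an auxiliary microphone at an unknown position. In this toy setting with uncorrelated noise, both variants should locate the source near the simulated 60 degrees; the conditions under which the auxiliary-based spectra become more reliable (coherent noise and reverberation across the CMA) are not modeled here.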
