Analytical model for the relation between signal bandwidth and spatial resolution in Steered-Response Power Phase Transform (SRP-PHAT) maps

Guillermo Garcia-Barrios, Juana M. Gutierrez-Arriola, Nicolas Saenz-Lechon, Victor Jose Osma-Ruiz, Ruben Fraile

TL;DR

This work investigates how acoustic signal bandwidth constrains the spatial resolution of SRP-PHAT maps for sound-source localization, without relying on the far-field assumption or on any particular array geometry. It models SRP computation as a process of sampling GCC-PHAT functions and derives a sufficient aliasing-free condition, $\|\nabla\tau_{kl}(\mathbf{r})\|\Delta r<\pi/\omega_{\max}$, that ties grid spacing to microphone geometry and signal bandwidth. A practical per-grid-point bandwidth-selection algorithm is proposed, with optional normalization to preserve the main GCC peak, and simulations show significant localization improvements for sources farther from the array, in both anechoic and reverberant environments. The approach complements hierarchical localization and offers a principled way to reduce computational load while maintaining accuracy by adaptively limiting GCC bandwidth across the SRP map.

Abstract

An analysis of the relationship between the bandwidth of acoustic signals and the required resolution of steered-response power phase transform (SRP-PHAT) maps used for sound source localization is presented. This relationship does not rely on the far-field assumption, nor does it depend on any specific array topology. The proposed analysis considers the computation of an SRP map as a process of sampling a set of generalized cross-correlation (GCC) functions, each one corresponding to a different microphone pair. From this approach, we derive a rule that relates GCC bandwidth to inter-microphone distance, resolution of the SRP map, and the potential position of the sound source relative to the array position. This rule is a sufficient condition for an aliasing-free calculation of the specified SRP-PHAT map. Simulation results show that limiting the bandwidth of the GCC according to this rule leads to significant reductions in sound source localization errors when sources are not in the immediate vicinity of the microphone array. These error reductions are more relevant for coarser resolutions of the SRP map, and they occur in both anechoic and reverberant environments.
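
To make the sufficient condition $\|\nabla\tau_{kl}(\mathbf{r})\|\Delta r<\pi/\omega_{\max}$ more concrete, the following is a minimal sketch of how a per-grid-point frequency bound could be evaluated for one microphone pair. It is not the authors' algorithm. It assumes the usual free-field definition $\tau_{kl}(\mathbf{r}) = (\|\mathbf{r}-\mathbf{m}_k\| - \|\mathbf{r}-\mathbf{m}_l\|)/c$, whose gradient is the difference of two unit vectors divided by $c$, and a speed of sound of 343 m/s. The function names, the microphone positions, and the grid spacing are hypothetical choices made for illustration.

```python
import numpy as np

C = 343.0  # assumed speed of sound in m/s (not taken from the paper)


def tdoa_gradient_norm(r, m_k, m_l, c=C):
    """Norm of the TDOA gradient at point r for microphones m_k and m_l.

    Assumes tau_kl(r) = (|r - m_k| - |r - m_l|) / c, so the gradient is
    (u_k - u_l) / c, with u_k and u_l the unit vectors from each microphone
    towards r. The point r must not coincide with a microphone.
    """
    r, m_k, m_l = (np.asarray(v, dtype=float) for v in (r, m_k, m_l))
    u_k = (r - m_k) / np.linalg.norm(r - m_k)
    u_l = (r - m_l) / np.linalg.norm(r - m_l)
    return np.linalg.norm(u_k - u_l) / c


def max_alias_free_frequency(r, m_k, m_l, delta_r, c=C):
    """Upper bound (in Hz) on the GCC band edge allowed by the condition.

    Rearranges |grad tau| * delta_r < pi / omega_max with omega = 2*pi*f,
    giving f_max < 1 / (2 * |grad tau| * delta_r).
    """
    g = tdoa_gradient_norm(r, m_k, m_l, c)
    if g == 0.0:
        return np.inf  # the TDOA is locally flat, so no bound arises here
    return 1.0 / (2.0 * g * delta_r)


# Hypothetical setup: two microphones 20 cm apart and a 10 cm grid spacing.
mic_k = [-0.1, 0.0, 0.0]
mic_l = [0.1, 0.0, 0.0]
grid_spacing = 0.1  # metres

for point in ([0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]):
    f_max = max_alias_free_frequency(point, mic_k, mic_l, grid_spacing)
    print(point, f"{f_max:.1f} Hz")
```

Because the two unit vectors can differ by at most 2 in norm, $c\|\nabla\tau_{kl}\|$ lies between 0 and 2 under this definition, so the most restrictive bound in this sketch is $c/(4\Delta r)$.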

Paper Structure (10 sections, 14 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Simplified scenario comprising two microphones (triangles) and one position (circle).
  • Figure 2: Contour plot of $c\left\|\nabla \tau_{kl}\left(\vec{r}\right)\right\|$ in the horizontal plane when both microphones are in that plane.
  • Figure 3: Plot of $c\left\|\nabla \tau_{kl}\left(\vec{r}\right)\right\|$ as a function of distance for several angles.
  • Figure 4: Cross-correlation (GCC-PHAT) between two exactly equal speech signals taken from the dataset described in section \ref{subsec:AudioData}, having a 3.76 ms delay between them. The thin line depicts the GCC-PHAT calculated by integration along the 200–4000 Hz band, the continuous thick line shows the result of limiting this interval to 200–1000 Hz, and the dotted line shows the effect of applying the proposed normalization to the band-limited GCC-PHAT.
  • Figure 5: SRP-PHAT maps generated according to the standard procedure (left), applying (\ref{eq:BandlimitedGCCPHAT_2}) for limiting the bandwidth of the GCC (middle), and adding the normalization in (\ref{eq:BandlimitedNormalisedGCCPHAT}) (right). Red points indicate the simulated microphone positions, the filled triangles mark the simulated source position, and the empty triangles show the maximum peaks of the SRP maps, i.e., the estimated source positions. Anechoic conditions have been assumed, and the audio signal used for simulation is the same as in Fig. \ref{fig:GCC}.
  • ...and 3 more figures
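
The band-limiting effect described in the caption of Figure 4 can be sketched with an idealized computation. For two identical signals offset by a delay $\tau_0$, the PHAT-weighted cross-spectrum reduces to the pure phase term $e^{-j\omega\tau_0}$ wherever the signal spectrum is nonzero, so the GCC-PHAT becomes the integral of $\cos(\omega(\tau-\tau_0))$ over the chosen band. The code below evaluates that integral numerically for the two bands named in the caption. It does not use the paper's speech data and does not reproduce the proposed normalization, whose definition is not given in this excerpt. The function names and the sampling grids are hypothetical.

```python
import numpy as np


def ideal_gcc_phat(taus, delay, f_lo, f_hi, n_freqs=400):
    """Idealized GCC-PHAT of two identical signals offset by `delay`.

    Approximates the integral of cos(2*pi*f*(tau - delay)) over [f_lo, f_hi]
    with a Riemann sum, assuming a nonzero spectrum across the whole band.
    """
    freqs = np.linspace(f_lo, f_hi, n_freqs)
    df = freqs[1] - freqs[0]
    phase = 2.0 * np.pi * np.outer(taus - delay, freqs)
    return np.cos(phase).sum(axis=1) * df


def main_lobe_width(gcc, taus, level=0.5):
    """Width of the main lobe measured at `level` times its own peak value."""
    peak = int(np.argmax(gcc))
    threshold = level * gcc[peak]
    left = peak
    while left > 0 and gcc[left] > threshold:
        left -= 1
    right = peak
    while right < len(gcc) - 1 and gcc[right] > threshold:
        right += 1
    return taus[right] - taus[left]


delay = 3.76e-3                       # delay quoted in the Figure 4 caption
taus = np.linspace(0.0, 8e-3, 3201)   # candidate lags, 2.5 microsecond step

wide = ideal_gcc_phat(taus, delay, 200.0, 4000.0)    # 200-4000 Hz band
narrow = ideal_gcc_phat(taus, delay, 200.0, 1000.0)  # 200-1000 Hz band

print("peak lag (wide):  ", taus[np.argmax(wide)])
print("peak lag (narrow):", taus[np.argmax(narrow)])
print("lobe width (wide):  ", main_lobe_width(wide, taus))
print("lobe width (narrow):", main_lobe_width(narrow, taus))
```

In this idealization both curves peak at the true delay, while the narrower band gives a wider main lobe and a lower peak value. The lower peak is the effect that a normalization of the band-limited GCC would need to compensate.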