The Neural-SRP method for positional sound source localization

Eric Grinstein; Toon van Waterschoot; Mike Brookes; Patrick A. Naylor

TL;DR

Neural-SRP is a deep neural network (DNN) that combines the flexibility of Steered Response Power (SRP) with the performance gains of DNNs. In the reported results, its localization performance is significantly better than that of the baselines.

Abstract

Steered Response Power (SRP) is a widely used method for sound source localization with microphone arrays, showing satisfactory localization performance in many practical scenarios. However, its performance degrades in highly reverberant environments. Although Deep Neural Networks (DNNs) have previously been proposed to overcome this limitation, most are trained for a specific number of microphones with fixed spatial coordinates. This restricts their practical application to scenarios frequently observed in wireless acoustic sensor networks, where each application has an ad-hoc microphone topology. We propose Neural-SRP, a DNN which combines the flexibility of SRP with the performance gains of DNNs. We train our network using simulated data and transfer learning, and evaluate our approach on recorded and simulated data. Results show that Neural-SRP significantly outperforms the baselines in localization performance.
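For readers unfamiliar with the classical method that the abstract builds on, the following is a minimal sketch of a conventional SRP map, not of Neural-SRP itself. It assumes PHAT weighting (a common choice that the abstract does not specify), free-field propagation at 343 m/s, and known microphone positions. The function name `srp_phat` and all of its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def srp_phat(signals, mic_positions, grid, fs, c=343.0):
    """Return a classical SRP-PHAT power value for every candidate position.

    signals:       (n_mics, n_samples) time-domain microphone signals
    mic_positions: (n_mics, dim) microphone coordinates in metres
    grid:          (n_points, dim) candidate source positions in metres
    fs:            sampling rate in Hz
    c:             assumed speed of sound in m/s
    """
    n_mics, n_samples = signals.shape
    spectra = np.fft.rfft(signals, axis=1)
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)

    # Propagation time from each candidate position to each microphone.
    dists = np.linalg.norm(grid[:, None, :] - mic_positions[None, :, :], axis=2)
    delays = dists / c  # shape (n_points, n_mics)

    power = np.zeros(len(grid))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            # Cross-power spectrum of the pair, whitened by the PHAT weighting.
            cross = spectra[i] * np.conj(spectra[j])
            cross /= np.abs(cross) + 1e-12
            # Expected time difference of arrival for each candidate position.
            tdoa = delays[:, i] - delays[:, j]
            # Steer the pair towards each candidate and accumulate its response.
            steering = np.exp(2j * np.pi * np.outer(tdoa, freqs))
            power += np.real(steering @ cross)
    return power
```

The position estimate is then the grid point with the largest value, for example `grid[np.argmax(power)]`. Because the map is built from pairwise terms and the microphone coordinates are plain inputs, the same code runs for any number of microphones in any layout, which is the flexibility the abstract attributes to SRP.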

Paper Structure (14 sections, 10 equations, 4 figures, 1 table)

Introduction
Problem statement
Related work
Steered Response Power
Neural Networks for SSL
Neural-SRP
Training targets
Anechoic to reverberant transfer learning
Experimentation
Datasets
Methods and baselines
Experiment details
Results and discussion
Conclusion and future work

Figures (4)

Figure 1: Neural-SRP and classical SRP output for real recorded signals.
Figure 2: Example of the Neural-SRP method for three microphones. The magnitude of the STFT instead of its phase is used as input for illustrative purposes.
Figure 3: Architecture of the Neural-SRP network.
Figure 4: Example of the hyperbolic grid (below), used for training Neural-SRP, and the alternative Gaussian grid (above).
