Steered Response Power for Sound Source Localization: A Tutorial Review
Eric Grinstein, Elisa Tengan, Bilgesu Çakmak, Thomas Dietzen, Leonardo Nunes, Toon van Waterschoot, Mike Brookes, Patrick A. Naylor
TL;DR
This work surveys the Steered Response Power (SRP) framework for sound source localization (SSL), with a focus on the SRP-PHAT variant and the modular X-SRP implementation. It formalizes SRP in the time and frequency domains, clarifies the underlying time difference of arrival (TDOA) geometry, and demonstrates a grid-search approach to locate one or more sources, while addressing computational complexity and robustness. The paper reviews over 200 works, covering extensions for complexity reduction, robustness improvement, multi-source handling, tracking, and practical deployment, and provides a unified, extensible software platform (X-SRP) to facilitate replication and experimentation. Its analysis highlights that SRP stays competitive in reverberant and noisy environments and serves as a versatile foundation that can be augmented with neural components, prior information, and sparse or multi-target techniques for scalable SSL in real-world settings.
Abstract
Over the last three decades, the Steered Response Power (SRP) method has been widely used for the task of Sound Source Localization (SSL), due to its satisfactory localization performance in moderately reverberant and noisy scenarios. Many works have analyzed and extended the original SRP method to reduce its computational cost, to allow it to locate multiple sources, or to improve its performance in adverse environments. In this work, we review over 200 papers on the SRP method and its variants, with emphasis on the SRP-PHAT method. We also present eXtensible-SRP, or X-SRP, a generalized and modularized version of the SRP algorithm which allows the reviewed extensions to be implemented. We provide a Python implementation of the algorithm which includes selected extensions from the literature.
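
To make the grid-search idea behind SRP-PHAT concrete, the following is a minimal sketch in Python with NumPy. It is not the X-SRP implementation or its API: the function names `gcc_phat` and `srp_phat`, the 2-D microphone layout, the sampling rate, the speed of sound (343 m/s), and the anechoic, noise-free white-noise simulation are all assumptions chosen for illustration. For each microphone pair, the sketch computes a PHAT-weighted cross-correlation, then, for each candidate position on the grid, reads the correlation value at the TDOA that position would produce and sums over all pairs. The grid point with the largest sum is taken as the location estimate.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s (assumed value)


def gcc_phat(x_i, x_j, n_fft):
    """PHAT-weighted circular cross-correlation; lag k is stored at index k mod n_fft."""
    cross = np.fft.rfft(x_i, n=n_fft) * np.conj(np.fft.rfft(x_j, n=n_fft))
    cross /= np.abs(cross) + 1e-12  # PHAT: keep only the phase of the cross-spectrum
    return np.fft.irfft(cross, n=n_fft)


def srp_phat(signals, mic_positions, grid, fs, c=SPEED_OF_SOUND):
    """Return the SRP-PHAT value at every candidate position in `grid`.

    signals: (n_mics, n_samples), mic_positions: (n_mics, dim), grid: (n_points, dim).
    The circular correlation is assumed adequate because the frame is much
    longer than the largest TDOA.
    """
    n_mics, n_fft = signals.shape
    # Distance from every grid point to every microphone: (n_points, n_mics)
    dists = np.linalg.norm(grid[:, None, :] - mic_positions[None, :, :], axis=2)
    power = np.zeros(len(grid))
    for i in range(n_mics):
        for j in range(i + 1, n_mics):
            cc = gcc_phat(signals[i], signals[j], n_fft)
            # TDOA (in samples) that each grid point would produce for pair (i, j)
            tdoa = (dists[:, i] - dists[:, j]) * fs / c
            lags = np.rint(tdoa).astype(int) % n_fft  # negative lags wrap around
            power += cc[lags]
    return power


# Hypothetical test scene: four microphones, one white-noise source, no reverberation.
rng = np.random.default_rng(0)
fs, n_samples = 16000, 4096
mics = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
source = np.array([2.0, 3.0])

# Apply the (fractional) propagation delays as linear phase shifts in frequency.
spectrum = np.fft.rfft(rng.standard_normal(n_samples))
freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
delays = np.linalg.norm(source - mics, axis=1) / SPEED_OF_SOUND  # seconds
phase = np.exp(-2j * np.pi * freqs[None, :] * delays[:, None])
signals = np.fft.irfft(spectrum[None, :] * phase, n=n_samples, axis=1)

# Candidate positions on a 0.5 m grid covering a 4 m x 4 m area.
axis = np.arange(0.0, 4.5, 0.5)
grid = np.stack(np.meshgrid(axis, axis, indexing="ij"), axis=-1).reshape(-1, 2)

power = srp_phat(signals, mics, grid, fs)
estimate = grid[np.argmax(power)]
print("estimated source position:", estimate)
```

The cost of this exhaustive search grows with the number of grid points times the number of microphone pairs, which is the computational burden that many of the reviewed extensions aim to reduce. In reverberant or noisy conditions the peak would be less pronounced than in this idealized scene.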
