Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection
Xinyuan Qian, Xianghu Yue, Jiadong Wang, Huiping Zhuang, Haizhou Li
TL;DR
The paper tackles sound source localization (SSL) in a Class Incremental Learning (CIL) setting with privacy protection. It introduces SSL-CIL, which combines an analytic re-alignment step with a recursive, exemplar-free least-squares update, so the model adapts to new DoA classes without revisiting past data. The model operates on GCC-PHAT–based features ${\bf X}$ and a Gaussian-like posterior $p(\theta)=\exp(-|{\theta}-\dot{\theta}|^2/\sigma^2_{\theta})$ that peaks at $\dot{\theta}$; the DoA estimate is $\hat{\theta}=\arg\max_\theta \hat{p}(\theta)$. The approach applies a feature expansion ${\bf X}^{(fe)}$ and derives a closed-form update for the FCN weights (Theorem 1), which allows recursive updates across phases with no data replay. On the SSLR dataset, SSL-CIL achieves $MAE=5.04^{\circ}$ and $ACC=90.9\%$, closely approaching joint training and remaining robust at adverse SNRs.
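The posterior encoding and the $\arg\max$ decoding can be illustrated with a minimal NumPy sketch. It assumes a 360-bin azimuth grid at 1° resolution, a width $\sigma_{\theta}=8^{\circ}$, wrap-around angular distance, and that $\dot{\theta}$ is the reference DoA; none of these values are given in this summary, and the function names are hypothetical.

```python
import numpy as np

def encode_doa(theta_ref_deg, sigma_deg=8.0):
    """Soft label p(theta) = exp(-|theta - theta_ref|^2 / sigma^2).

    Assumptions (not from the paper summary): 360 bins at 1 degree,
    sigma of 8 degrees, and circular (wrap-around) angular distance.
    """
    grid = np.arange(360, dtype=float)       # candidate DoAs in degrees
    diff = np.abs(grid - theta_ref_deg)
    diff = np.minimum(diff, 360.0 - diff)    # shortest distance on the circle
    return np.exp(-(diff ** 2) / sigma_deg ** 2)

def decode_doa(p_hat):
    """Estimate the DoA as the grid index with the highest posterior."""
    return int(np.argmax(p_hat))
```

Encoding a reference angle and decoding the result returns the same grid bin, which is the behavior the $\arg\max$ rule relies on when the predicted posterior $\hat{p}(\theta)$ is close to its target.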
Abstract
Sound Source Localization (SSL) is an enabling technology for applications such as surveillance and robotics. While traditional Signal Processing (SP)-based SSL methods provide analytic solutions under specific signal and noise assumptions, recent Deep Learning (DL)-based methods have significantly outperformed them. However, their success depends on extensive training data and substantial computational resources. Moreover, they often rely on large-scale annotated spatial data and may struggle to adapt to evolving sound classes. To mitigate these challenges, we propose a novel Class Incremental Learning (CIL) approach, termed SSL-CIL, which avoids the serious accuracy degradation caused by catastrophic forgetting by incrementally updating the DL-based SSL model through a closed-form analytic solution. In particular, data privacy is ensured because the learning process does not revisit any historical data (it is exemplar-free), which makes the approach well suited to smart home scenarios. Empirical results on the public SSLR dataset demonstrate the superior performance of our proposal, which achieves a localization accuracy of 90.9%, surpassing other competitive methods.
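The idea of an exemplar-free, closed-form incremental update can be sketched with generic recursive ridge regression. This is not the paper's Theorem 1, only a standard recursive least-squares form under stated assumptions: features are already extracted, targets for classes not present in a phase are zero, and the class name `RecursiveRidge` and regularizer `gamma` are hypothetical.

```python
import numpy as np

class RecursiveRidge:
    """Ridge-regression classifier updated phase by phase without old data."""

    def __init__(self, dim, gamma=1.0):
        # R is the inverse of the regularized feature autocorrelation matrix.
        self.R = np.eye(dim) / gamma
        self.W = np.zeros((dim, 0))          # no classes seen yet

    def update(self, X, Y):
        """X: (n, dim) features; Y: (n, classes seen so far) targets."""
        n_new = Y.shape[1] - self.W.shape[1]
        # Zero-pad the weights with one column per newly introduced class.
        W = np.hstack([self.W, np.zeros((self.W.shape[0], n_new))])
        # Woodbury identity: update R using only the current phase's features.
        G = np.eye(X.shape[0]) + X @ self.R @ X.T
        self.R = self.R - self.R @ X.T @ np.linalg.solve(G, X @ self.R)
        # Correct the weights by the residual on the current phase.
        self.W = W + self.R @ X.T @ (Y - X @ W)
        return self.W
```

Only `R` and `W` are stored between phases, so no past samples need to be kept. In this sketch, the weights after the last phase match the ridge solution computed jointly on all phases, which is the property that lets an analytic approach avoid catastrophic forgetting.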
