
Analytic Class Incremental Learning for Sound Source Localization with Privacy Protection

Xinyuan Qian, Xianghu Yue, Jiadong Wang, Huiping Zhuang, Haizhou Li

TL;DR

The paper tackles sound source localization (SSL) in a Class Incremental Learning (CIL) setting with privacy protection. It introduces SSL-CIL, which combines an analytic re-alignment step with a recursive, exemplar-free least-squares update to adapt to new DoA classes without revisiting past data. The model operates on GCC-PHAT-based features ${\bf X}$ and encodes the target as a posterior $p(\theta)=\exp(-|{\theta}-\dot{\theta}|^2/\sigma^2_{\theta})$, from which the DoA estimate is decoded as $\hat{\theta}=\arg\max_\theta \hat{p}(\theta)$. The approach leverages a feature expansion ${\bf X}^{(fe)}$ and derives a closed-form update for the FCN weights via Theorem 1, enabling recursive updates across phases with no data replay. On the SSLR dataset, SSL-CIL achieves an MAE of $5.04^{\circ}$ and an ACC of $90.9\%$, closely approaching joint training and showing robustness to adverse SNRs.
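The posterior encoding and arg-max decoding described above can be illustrated with a minimal sketch. It assumes that $\dot{\theta}$ denotes the ground-truth DoA, that the classes are the 360 integer azimuth degrees, and that the angular distance wraps around at 360°; the function names and the value of `sigma_deg` are hypothetical and are not taken from the paper.

```python
import numpy as np

def encode_doa(theta_true_deg, sigma_deg=8.0, n_classes=360):
    """Gaussian-like posterior p(theta) over integer-degree DoA classes.

    sigma_deg and n_classes are illustrative assumptions.
    """
    grid = np.arange(n_classes, dtype=float)
    # Assumption: circular distance, so that 359 deg and 0 deg are neighbours.
    diff = np.abs(grid - theta_true_deg)
    diff = np.minimum(diff, 360.0 - diff)
    # p(theta) = exp(-|theta - theta_true|^2 / sigma^2)
    return np.exp(-(diff ** 2) / sigma_deg ** 2)

def decode_doa(p_hat):
    """DoA estimate: the class index with the highest predicted posterior."""
    return int(np.argmax(p_hat))

# Round trip: encoding a DoA and decoding the resulting posterior.
target = encode_doa(45.0)
estimate = decode_doa(target)
```

In a trained system `p_hat` would be the network output rather than the encoded target; the round trip here only shows that the two definitions are consistent with each other.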

Abstract

Sound Source Localization (SSL) is an enabling technology for applications such as surveillance and robotics. While traditional Signal Processing (SP)-based SSL methods provide analytic solutions under specific signal and noise assumptions, recent Deep Learning (DL)-based methods have significantly outperformed them. However, their success depends on extensive training data and substantial computational resources. Moreover, they often rely on large-scale annotated spatial data and may struggle when adapting to evolving sound classes. To mitigate these challenges, we propose a novel Class Incremental Learning (CIL) approach, termed SSL-CIL, which avoids serious accuracy degradation due to catastrophic forgetting by incrementally updating the DL-based SSL model through a closed-form analytic solution. In particular, data privacy is ensured since the learning process does not revisit any historical data (exemplar-free), which is more suitable for smart home scenarios. Empirical results on the public SSLR dataset demonstrate the superior performance of our proposal, achieving a localization accuracy of 90.9%, surpassing other competitive methods.
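To make the idea of an exemplar-free, closed-form incremental update concrete, the following is a minimal sketch of standard recursive ridge regression using the Woodbury identity. It is not claimed to be the paper's Theorem 1 or its re-alignment step. The class name `RecursiveRidge`, the regularizer `gamma`, and the fixed output width (labels of classes not yet seen are simply zero) are assumptions made for illustration.

```python
import numpy as np

class RecursiveRidge:
    """Recursive least squares for a linear classifier head.

    Only the weights W and the inverse regularized autocorrelation
    matrix R are stored between phases; no past samples are kept.
    """

    def __init__(self, n_features, n_outputs, gamma=1.0):
        # R starts as (gamma * I)^-1, i.e. the ridge prior with no data yet.
        self.R = np.eye(n_features) / gamma
        self.W = np.zeros((n_features, n_outputs))

    def update(self, X, Y):
        """Absorb one phase of features X (n x d) and targets Y (n x c)."""
        n = X.shape[0]
        # Woodbury identity: R_new = R - R X^T (I + X R X^T)^-1 X R
        K = np.linalg.solve(np.eye(n) + X @ self.R @ X.T, X @ self.R)
        self.R = self.R - self.R @ X.T @ K
        # Correct the weights using only the residual on the new data.
        self.W = self.W + self.R @ X.T @ (Y - X @ self.W)

    def predict(self, X):
        return X @ self.W

# Usage: three phases seen one after another, never revisited.
rng = np.random.default_rng(0)
n_features, n_outputs, gamma = 6, 4, 0.5
phases = [(rng.normal(size=(20, n_features)), rng.normal(size=(20, n_outputs)))
          for _ in range(3)]

model = RecursiveRidge(n_features, n_outputs, gamma=gamma)
for X_k, Y_k in phases:
    model.update(X_k, Y_k)

# Reference: ridge regression trained jointly on all phases at once.
X_all = np.vstack([X for X, _ in phases])
Y_all = np.vstack([Y for _, Y in phases])
W_joint = np.linalg.solve(X_all.T @ X_all + gamma * np.eye(n_features),
                          X_all.T @ Y_all)
```

Under these assumptions the recursively updated weights coincide with the jointly trained ridge solution up to numerical precision, which is the property that lets an analytic update of this kind sidestep catastrophic forgetting without storing historical data.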

Paper Structure

This paper contains 8 sections, 23 equations, 1 figure, 4 tables, 1 algorithm.

Figures (1)

  • Figure 1: The averaged DoA estimation accuracy at various phases.