Online Domain-Incremental Learning Approach to Classify Acoustic Scenes in All Locations

Manjunath Mulimani, Annamaria Mesaros

TL;DR

This work tackles catastrophic forgetting in acoustic scene classification under sequential location-based domain shifts. It introduces online Domain-Incremental Learning (ODIL), which updates only Batch Normalization statistics with an adaptive momentum schedule using a small set of unlabeled samples per new domain, avoiding backpropagation and retraining. ODIL demonstrates resilience to large domain shifts (including a Korean domain) across 11 locations, achieving an online average accuracy of $48.8\%$ after the final domain and outperforming baselines such as Fine-Tuning and Disjoint models. The approach offers a lightweight, data-efficient path for continual ASC in realistic, location-diverse deployments, with BN-statistics adaptation as the key mechanism for domain transfer without forgetting.

Abstract

In this paper, we propose a method for online domain-incremental learning of acoustic scene classification from a sequence of different locations. Simply training a deep learning model on a sequence of different locations leads to forgetting of previously learned knowledge. In this work, we only correct the statistics of the Batch Normalization layers of a model using a few samples to learn the acoustic scenes from a new location without any excessive training. Experiments are performed on acoustic scenes from 11 different locations, with an initial task containing acoustic scenes from 6 locations and the remaining 5 incremental tasks each representing the acoustic scenes from a different location. The proposed approach outperforms fine-tuning-based methods and achieves an average accuracy of 48.8% after learning the last task in the sequence, without forgetting acoustic scenes from the previously learned locations.
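The mechanism described above (correcting only Batch Normalization statistics from a few unlabeled samples, then selecting domain-specific statistics by task ID at inference) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the names `BNStats`, `adapt_stats`, and `batch_norm`, the toy two-channel features, and in particular the `1 / (step + 1)` momentum schedule are all assumptions chosen for illustration, since the summary does not specify the adaptive schedule.

```python
from dataclasses import dataclass


@dataclass
class BNStats:
    """Running mean and variance of one BN layer (one value per channel)."""
    mean: list
    var: list


def batch_moments(batch):
    """Per-channel mean and biased variance of a batch of feature vectors."""
    n = len(batch)
    channels = len(batch[0])
    mean = [sum(x[c] for x in batch) / n for c in range(channels)]
    var = [sum((x[c] - mean[c]) ** 2 for x in batch) / n for c in range(channels)]
    return mean, var


def adapt_stats(base, batches):
    """Derive statistics for a new domain, starting from the base statistics.

    Only the running moments change: no labels, no gradients, no weight updates.
    The 1 / (step + 1) momentum is a hypothetical schedule, not the paper's.
    """
    mean, var = list(base.mean), list(base.var)  # copy, so the base stays intact
    for step, batch in enumerate(batches, start=1):
        momentum = 1.0 / (step + 1)
        b_mean, b_var = batch_moments(batch)
        mean = [(1 - momentum) * m + momentum * b for m, b in zip(mean, b_mean)]
        var = [(1 - momentum) * v + momentum * b for v, b in zip(var, b_var)]
    return BNStats(mean, var)


def batch_norm(x, stats, gamma, beta, eps=1e-5):
    """Normalize one feature vector using the given statistics and frozen affine parameters."""
    return [
        g * (xi - m) / (v + eps) ** 0.5 + b
        for xi, m, v, g, b in zip(x, stats.mean, stats.var, gamma, beta)
    ]


# Frozen affine parameters, shared by every domain (toy values).
gamma, beta = [1.0, 1.0], [0.0, 0.0]

# Task ID -> BN statistics. Task 0 stands in for the initial task.
stats_by_task = {0: BNStats(mean=[0.0, 0.0], var=[1.0, 1.0])}

# A few unlabeled feature vectors from a new, shifted domain (toy data).
new_domain_batches = [
    [[4.0, -2.0], [6.0, -4.0]],
    [[5.0, -3.0], [5.0, -3.0]],
]
stats_by_task[1] = adapt_stats(stats_by_task[0], new_domain_batches)

# At inference, the task ID selects which statistics the frozen model uses.
sample = [5.0, -3.0]
out_old = batch_norm(sample, stats_by_task[0], gamma, beta)
out_new = batch_norm(sample, stats_by_task[1], gamma, beta)
```

Because each domain keeps its own copy of the statistics and the weights never change, adapting to a new location in this sketch cannot alter the model's behavior on earlier locations, which is the intuition behind the "without forgetting" claim.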

Paper Structure

This paper contains 11 sections, 3 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Overview of the proposed online Domain-Incremental Learning approach. Inputs to the DIL model are the test sample and the task ID. The frozen model $\mathcal{M}$ uses domain-specific statistics to classify the acoustic scenes from a particular domain.
  • Figure 2: Performance of the methods in online setting: accuracy at the current domain and average forgetting over previous domains.
  • Figure 3: Performance of the methods in offline setting: accuracy at the current domain and average forgetting over previous domains.
  • Figure 4: Accuracy of the base model and ODIL at different domains.