Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift

Jisheng Bai; Mou Wang; Haohe Liu; Han Yin; Yafei Jia; Siwei Huang; Yutong Du; Dongzhe Zhang; Dongyuan Shi; Woon-Seng Gan; Mark D. Plumbley; Susanto Rahardja; Bin Xiang; Jianfeng Chen

TL;DR

The paper addresses domain shift in acoustic scene classification (ASC) by proposing the IEEE ICME 2024 Grand Challenge on Semi-supervised ASC under Domain Shift. It introduces the CAS 2023 dataset (10 scenes across 22 Chinese cities), with a development set containing 4.8 hours of labeled data and 19.3 hours of unlabeled data, plus an evaluation set (about 3 hours) that includes unseen cities to stress cross-region generalization. A baseline semi-supervised pipeline, built on SE-Trans with pseudo-labeling and pre-training on the TAU Urban Acoustic Scenes (TAU UAS) 2020 Mobile dataset, achieves a macro-average accuracy of 59% on the evaluation set, with notable per-class variability (e.g., 90% for Metro vs. 29% for Public square). These resources aim to drive robust ASC models that generalize across devices and geographic regions, and to encourage innovative semi-supervised methods for real-world deployment.
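The gap between the 59% macro-average and the per-class figures is easier to read with the metric written out. The sketch below assumes the standard definition of macro-average accuracy (the unweighted mean of per-class accuracies); the function name and the toy labels are hypothetical and are not taken from the challenge's scoring code or data.

```python
from collections import defaultdict


def macro_average_accuracy(y_true, y_pred):
    """Return (macro average, per-class accuracy dict).

    Assumes the standard definition: accuracy is computed separately for
    each ground-truth class, then averaged with equal weight per class.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    per_class = {label: correct[label] / total[label] for label in total}
    macro = sum(per_class.values()) / len(per_class)
    return macro, per_class


# Toy labels for illustration only (not challenge data).
y_true = ["Metro"] * 4 + ["Public square"] * 2
y_pred = ["Metro"] * 4 + ["Public square", "Metro"]

macro, per_class = macro_average_accuracy(y_true, y_pred)
overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(per_class, macro, overall)
```

In this toy case the overall accuracy (5 of 6 clips correct) is higher than the macro average, because the weaker class has fewer clips. Weighting every scene equally is what lets a poorly recognized class pull the reported score down regardless of how many clips it has.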

Abstract

Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis that aims to recognize the unique acoustic characteristics of an environment. One of the challenges of the ASC task is the domain shift between training and testing data. Since 2018, ASC challenges have focused on the generalization of ASC models across different recording devices. Although substantial progress has been made on device generalization in recent years, domain shift between geographical regions, involving discrepancies in time, space, culture, and language, remains insufficiently explored. In addition, given the abundance of unlabeled acoustic scene data in the real world, it is important to study ways to utilize these unlabeled data. Therefore, we introduce the task Semi-supervised Acoustic Scene Classification under Domain Shift in the ICME 2024 Grand Challenge. We encourage participants to innovate with semi-supervised learning techniques, aiming to develop more robust ASC models under domain shift.

Paper Structure (15 sections, 1 equation, 3 figures, 2 tables)

Introduction
Dataset
The Chinese Acoustic Scene Dataset
Recording Device
Recording Procedure
Recording Annotation
Challenge Datasets
Baseline
Overview
Baseline Model Architecture
Experimental Setups
Evaluation
Baseline Results
Conclusion
Acknowledgement

Figures (3)

Figure 1: The domain shift problem in acoustic scene classification.
Figure 2: The recording device of the CAS dataset. The left side of the figure shows the physical representation of the device, and the right side displays the relevant dimensional parameters of the device, measured in millimeters.
Figure 3: The pipeline of the challenge baseline.
