Description on IEEE ICME 2024 Grand Challenge: Semi-supervised Acoustic Scene Classification under Domain Shift
Jisheng Bai, Mou Wang, Haohe Liu, Han Yin, Yafei Jia, Siwei Huang, Yutong Du, Dongzhe Zhang, Dongyuan Shi, Woon-Seng Gan, Mark D. Plumbley, Susanto Rahardja, Bin Xiang, Jianfeng Chen
TL;DR
The paper addresses domain shift in acoustic scene classification (ASC) by proposing the IEEE ICME 2024 Grand Challenge on Semi-supervised ASC under Domain Shift. It introduces the CAS 2023 dataset (10 scenes recorded across 22 Chinese cities), with a development set containing $4.8$ hours of labeled data and $19.3$ hours of unlabeled data, plus an evaluation set (~$3$ hours) that includes cities not seen in the development set, to stress cross-region generalization. A baseline semi-supervised pipeline, built on SE-Trans with pseudo-labeling and pre-training on the TAU Urban Acoustic Scenes (UAS) 2020 Mobile dataset, achieves a macro-average accuracy of $59\%$ on the evaluation set, with notable per-class variability (e.g., $90\%$ for Metro vs. $29\%$ for Public square). These resources are intended to drive the development of robust ASC models that generalize across devices and geographic regions, and to encourage innovative semi-supervised methods for real-world deployment.
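To make the reported metric concrete, the following is a minimal sketch of macro-average accuracy, assuming the standard definition (the unweighted mean of per-class accuracies). The function name and the toy labels are illustrative assumptions and are not taken from the challenge code.

```python
from collections import defaultdict


def macro_average_accuracy(y_true, y_pred):
    """Return (macro accuracy, per-class accuracy dict).

    Assumes the standard definition: compute accuracy separately for each
    ground-truth class, then average with equal weight per class.
    """
    correct = defaultdict(int)  # correctly classified clips per true class
    total = defaultdict(int)    # number of clips per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    per_class = {c: correct[c] / total[c] for c in total}
    macro = sum(per_class.values()) / len(per_class)
    return macro, per_class


# Toy example (hypothetical labels): a frequent class and a rarer class.
y_true = ["metro"] * 4 + ["public_square"] * 2
y_pred = ["metro", "metro", "metro", "bus", "public_square", "metro"]

macro, per_class = macro_average_accuracy(y_true, y_pred)
print(per_class)  # per-class accuracies
print(macro)      # unweighted mean over the classes present in y_true
```

Because each class contributes equally, a weak class such as Public square lowers the macro average as much as a strong class such as Metro raises it, regardless of how many clips each class has.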
Abstract
Acoustic scene classification (ASC) is a crucial research problem in computational auditory scene analysis that aims to recognize the unique acoustic characteristics of an environment. One of the challenges of the ASC task is the domain shift between training and testing data. Since 2018, ASC challenges have focused on the generalization of ASC models across different recording devices. Although substantial progress has been made on device generalization in recent years, the domain shift between different geographical regions, which involves discrepancies in aspects such as time, space, culture, and language, remains insufficiently explored. In addition, given the abundance of unlabeled acoustic scene data in the real world, it is important to study ways to utilize these unlabeled data. Therefore, we introduce the task Semi-supervised Acoustic Scene Classification under Domain Shift in the ICME 2024 Grand Challenge. We encourage participants to innovate with semi-supervised learning techniques, aiming to develop more robust ASC models under domain shift.
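As an illustration of one common semi-supervised technique, the sketch below shows confidence-thresholded pseudo-labeling: a model trained on the labeled clips predicts class probabilities for unlabeled clips, and only confident predictions are kept as training labels. The function name, the threshold of 0.9, and the toy probabilities are assumptions for illustration; the selection rule used by the challenge baseline is not specified in this text.

```python
def select_pseudo_labels(probabilities, threshold=0.9):
    """Pick confident predictions on unlabeled clips.

    probabilities: list of per-clip class-probability lists.
    threshold: hypothetical confidence cutoff (an assumption, not a
               value from the challenge baseline).
    Returns a list of (clip_index, predicted_class_index) pairs.
    """
    selected = []
    for i, probs in enumerate(probabilities):
        # Index of the most probable class for this clip.
        best = max(range(len(probs)), key=probs.__getitem__)
        if probs[best] >= threshold:
            selected.append((i, best))
    return selected


# Toy model outputs for three unlabeled clips over three classes.
unlabeled_probs = [
    [0.95, 0.03, 0.02],  # confident: kept with class 0
    [0.40, 0.35, 0.25],  # ambiguous: discarded
    [0.05, 0.92, 0.03],  # confident: kept with class 1
]

print(select_pseudo_labels(unlabeled_probs))
```

The selected clips and their pseudo-labels would then be added to the labeled pool for another round of training. A stricter threshold keeps fewer but cleaner labels, which can matter when the unlabeled data come from regions the model has not seen.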
