Adaptive Learning via a Negative Selection Strategy for Few-Shot Bioacoustic Event Detection

Yaxiong Chen, Xueping Zhang, Yunfei Zi, Shengwu Xiong

TL;DR

This work tackles few-shot bioacoustic event detection (BED) with ProtoNet by identifying two core issues: unrepresentative negative prototypes and target durations that vary across tasks. It introduces a negative selection strategy to augment negative samples and a teacher–student adaptive learning framework with a duration-aware loss that modulates knowledge transfer, using BEATs as the teacher and a CNN as the student. The approach builds a task-specific classifier through multiple prototype updates and a negative prototype augmentation step, and it employs a duration-weighted KL-divergence loss plus mutual information to guide learning without labels. Evaluated on the DCASE 2023 Task 5 dataset, the method achieves an F-measure of $0.703$, a substantial gain over baselines and competing methods, and is robust across tasks with short and long vocalisations. Overall, combining negative selection with adaptive learning improves ProtoNet for variable-duration, few-shot BED, with practical value for rapid acoustic event detection.

Abstract

Although the Prototypical Network (ProtoNet) has demonstrated effectiveness in few-shot biological event detection, two persistent issues remain. Firstly, it is difficult to construct a representative negative prototype because there are no explicitly annotated negative samples. Secondly, the durations of the target biological vocalisations vary across tasks, making it challenging for the model to consistently yield optimal results across all tasks. To address these issues, we propose a novel adaptive learning framework with an adaptive learning loss to guide classifier updates. Additionally, we propose a negative selection strategy to construct a more representative negative prototype for ProtoNet. All experiments were performed on the DCASE 2023 Task 5 few-shot bioacoustic event detection dataset. The results show that our proposed method achieves an F-measure of 0.703, an improvement of 12.84%.
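
The role of the negative prototype can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`prototype`, `positive_probability`, `select_negatives`), the toy 2-D embeddings, and above all the selection rule (keep unlabelled frames whose positive probability falls below an assumed `threshold`) are hypothetical, since the abstract does not specify how negatives are selected.

```python
import numpy as np


def prototype(embeddings: np.ndarray) -> np.ndarray:
    """Class prototype: mean of the support embeddings, shape (dim,)."""
    return embeddings.mean(axis=0)


def positive_probability(queries, pos_proto, neg_proto):
    """Softmax over negative squared Euclidean distances to both prototypes."""
    d_pos = ((queries - pos_proto) ** 2).sum(axis=1)
    d_neg = ((queries - neg_proto) ** 2).sum(axis=1)
    logits = np.stack([-d_pos, -d_neg], axis=1)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=1, keepdims=True)
    return probs[:, 0]  # probability of the positive (target event) class


def select_negatives(unlabelled, pos_proto, neg_proto, threshold=0.1):
    """ASSUMED rule (not from the paper): keep unlabelled frames that the
    current prototypes score as confidently negative."""
    p_pos = positive_probability(unlabelled, pos_proto, neg_proto)
    return unlabelled[p_pos < threshold]


# Toy 2-D embeddings standing in for encoder outputs (illustrative values).
pos_support = np.array([[2.0, 2.0], [2.2, 1.8]])
neg_support = np.array([[-2.0, -2.0], [-1.8, -2.2]])
unlabelled = np.array([[-2.1, -1.9], [0.0, 0.0], [2.0, 2.1]])

pos_proto = prototype(pos_support)
neg_proto = prototype(neg_support)

# Augment the negative set with selected frames, then rebuild the prototype.
extra_negatives = select_negatives(unlabelled, pos_proto, neg_proto)
neg_proto = prototype(np.concatenate([neg_support, extra_negatives], axis=0))

print(positive_probability(unlabelled, pos_proto, neg_proto))
```

In this toy setting only the frame close to the initial negative prototype is added to the negative set; the ambiguous frame at the origin and the frame near the positive prototype are left out.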

Paper Structure (11 sections, 10 equations, 2 figures, 4 tables)

Figures (2)

  • Figure 1: The architecture of the proposed adaptive learning framework for few-shot bioacoustic event detection.
  • Figure 2: Model prediction performance visualisation.