Mixture of Mixups for Multi-label Classification of Rare Anuran Sounds

Ilyass Moummad, Nicolas Farrugia, Romain Serizel, Jeremy Froidevaux, Vincent Lostanlen

TL;DR

This work tackles multi-label imbalanced classification in bioacoustics, focusing on rare anuran calls using the AnuraSet dataset. It introduces Mix2, a probabilistic ensemble of Mixup, Manifold Mixup, and MultiMix that selects among these regularizers per training step, with formalized mixing at input, embedding, and batch levels. The approach yields higher macro F-scores, particularly for rare classes, outperforming single mixing methods and baselines, with robust performance across polyphony levels. By enabling more diverse augmented samples, Mix2 supports more reliable ecological monitoring and sets the stage for future self-supervised, few-shot, and open-set extensions in wildlife sound classification.

Abstract

Multi-label imbalanced classification poses a significant challenge in machine learning, particularly in bioacoustics, where animal sounds often co-occur and some sounds are far less frequent than others. This paper focuses on classifying anuran species sounds using the AnuraSet dataset, which contains both class imbalance and multi-label examples. To address these challenges, we introduce Mixture of Mixups (Mix2), a framework that leverages three mixing regularization methods: Mixup, Manifold Mixup, and MultiMix. Experimental results show that these methods, applied individually, may lead to suboptimal results; however, when one of them is selected at random at each training iteration, they prove effective in addressing these challenges, particularly for rare classes with few occurrences. Further analysis reveals that Mix2 also classifies sounds well across various levels of class co-occurrence.
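
To make the per-iteration selection concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes the three methods are drawn with equal probability (the selection probabilities are not stated here), and it shows only input-level Mixup, in its standard form: a convex combination of two examples and their multi-hot label vectors with a weight drawn from a Beta(alpha, alpha) distribution. The names `pick_method`, `mix_pair`, and `mixup_batch`, and the default `alpha=1.0`, are hypothetical choices for illustration.

```python
import random

# The three mixing regularizers named in the abstract.
METHODS = ("mixup", "manifold_mixup", "multimix")


def pick_method(rng):
    """Draw one regularizer for the current training step.

    Uniform sampling is an assumption made for this sketch.
    """
    return rng.choice(METHODS)


def mix_pair(a, b, lam):
    """Element-wise convex combination lam * a + (1 - lam) * b."""
    return [lam * x + (1.0 - lam) * y for x, y in zip(a, b)]


def mixup_batch(inputs, labels, rng, alpha=1.0):
    """Input-level Mixup over a batch of feature and multi-hot label vectors.

    Each example is mixed with a randomly chosen partner from the same
    batch, using a single weight lam ~ Beta(alpha, alpha) for the batch.
    """
    lam = rng.betavariate(alpha, alpha)
    partners = list(range(len(inputs)))
    rng.shuffle(partners)
    mixed_x = [mix_pair(inputs[i], inputs[j], lam) for i, j in enumerate(partners)]
    mixed_y = [mix_pair(labels[i], labels[j], lam) for i, j in enumerate(partners)]
    return mixed_x, mixed_y, lam


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy batch: three 2-dimensional inputs with multi-hot labels over 3 classes.
    x = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
    y = [[1, 0, 0], [0, 1, 1], [0, 0, 1]]
    method = pick_method(rng)
    if method == "mixup":
        x, y, lam = mixup_batch(x, y, rng)
    # Manifold Mixup and MultiMix act on embeddings and are omitted here.
    print(method, y)
```

Because the labels are multi-hot, the mixed targets are soft values in [0, 1] for each class, which remains compatible with a binary cross-entropy loss.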

Paper Structure (16 sections, 6 equations, 3 figures, 2 tables)

Figures (3)

  • Figure 1: Overview of our system: Mix2. MBN, FC and BCE stand for MobileNetV3-Large, Fully Connected, and Binary Cross-Entropy, respectively.
  • Figure 2: Histogram of class imbalance in AnuraSet. Dashed lines separate "frequent" species (more than 10k instances), "common" species (5k to 10k instances), and "rare" species (fewer than 5k instances).
  • Figure 3: In AnuraSet, Mixup leads to a performance decline across various polyphony levels compared to the absence of mixing. Conversely, Mix2 demonstrates an improvement in performance across different polyphony levels.