Acoustic identification of individual animals with hierarchical contrastive learning

Ines Nolasco, Ilyass Moummad, Dan Stowell, Emmanouil Benetos

TL;DR

This work frames the acoustic identification of individual animals (AIID) as a hierarchical multi-label classification task and proposes hierarchy-aware loss functions to learn robust representations of individual identities that maintain the hierarchical relationships among species and taxa, demonstrating the potential of this method in open-set classification scenarios.

Abstract

Acoustic identification of individual animals (AIID) is closely related to audio-based species classification but requires a finer level of detail to distinguish between individual animals within the same species. In this work, we frame AIID as a hierarchical multi-label classification task and propose the use of hierarchy-aware loss functions to learn robust representations of individual identities that maintain the hierarchical relationships among species and taxa. Our results demonstrate that hierarchical embeddings not only enhance identification accuracy at the individual level but also at higher taxonomic levels, effectively preserving the hierarchical structure in the learned representations. By comparing our approach with non-hierarchical models, we highlight the advantage of enforcing this structure in the embedding space. Additionally, we extend the evaluation to the classification of novel individual classes, demonstrating the potential of our method in open-set classification scenarios.
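
To make the hierarchical multi-label framing concrete, the following is a minimal sketch in which every recording carries one label per level (individual, species, taxon). The class `HierLabel`, the helper functions, and the toy hierarchy are all hypothetical illustrations and are not taken from the paper or its datasets.

```python
from typing import NamedTuple


class HierLabel(NamedTuple):
    """One label per level of the hierarchy, finest level first."""
    individual: str
    species: str
    taxon: str


# Hypothetical toy hierarchy (illustrative names, not the paper's data).
SPECIES_OF = {
    "chiffchaff_01": "chiffchaff",
    "chiffchaff_02": "chiffchaff",
    "pipit_01": "tree_pipit",
    "bat_01": "pipistrelle",
}
TAXON_OF = {"chiffchaff": "bird", "tree_pipit": "bird", "pipistrelle": "bat"}


def expand(individual: str) -> HierLabel:
    """Turn an individual ID into its full multi-level label."""
    species = SPECIES_OF[individual]
    return HierLabel(individual, species, TAXON_OF[species])


def shared_levels(a: HierLabel, b: HierLabel) -> int:
    """Count the levels two labels share, walking down from the taxon."""
    shared = 0
    for x, y in zip(reversed(a), reversed(b)):  # taxon, species, individual
        if x != y:
            break
        shared += 1
    return shared
```

Under this assumed encoding, two different chiffchaffs share two levels (taxon and species), a chiffchaff and a tree pipit share one, and a chiffchaff and a bat share none. A hierarchy-aware loss can use exactly this kind of graded relationship rather than treating all non-matching individuals as equally different.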

Paper Structure (14 sections, 3 equations, 1 figure, 3 tables)

Figures (1)

  • Figure 1: Overview of our pretraining pipeline. Audio recordings of animal calls are first processed through a frozen, pretrained OpenL3 model to extract high-level representations for each 25 ms segment of the audio file; the final OpenL3 embedding is the average of these across the whole call. This embedding is then passed through an MLP to adapt the features specifically for bioacoustic sounds. The adapted features are subsequently fed into a projector to perform supervised contrastive learning for individual identification (ID). For hierarchical contrastive learning, two additional projectors are included for species (SP) and taxa (T) classification.
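
The pooling and loss stages described in the Figure 1 caption can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' implementation: the per-level objective is assumed to be the standard supervised contrastive (SupCon) loss, the three levels are assumed to be combined by a weighted sum, and the OpenL3 model, the MLP, and the projectors are omitted (the functions take their outputs as plain lists of floats). All function names, the temperature, and the weights are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_pool(frames: Sequence[Vector]) -> List[float]:
    """Average per-segment embeddings into one call-level embedding."""
    n = len(frames)
    return [sum(column) / n for column in zip(*frames)]


def l2_normalize(v: Vector) -> List[float]:
    norm = math.sqrt(sum(x * x for x in v)) or 1.0  # guard against zero vectors
    return [x / norm for x in v]


def supcon_loss(embeddings: Sequence[Vector], labels: Sequence[str],
                temperature: float = 0.1) -> float:
    """Supervised contrastive loss over one batch of projected embeddings."""
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor with no positive contributes nothing
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        peak = max(sims.values())  # log-sum-exp shift for numerical stability
        log_denom = peak + math.log(sum(math.exp(s - peak) for s in sims.values()))
        total -= sum(sims[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def hierarchical_loss(z_id, z_sp, z_t, ids, species, taxa,
                      weights=(1.0, 1.0, 1.0)) -> float:
    """Assumed combination: weighted sum of one SupCon term per level,
    each computed on the output of that level's projector."""
    return (weights[0] * supcon_loss(z_id, ids)
            + weights[1] * supcon_loss(z_sp, species)
            + weights[2] * supcon_loss(z_t, taxa))
```

In this sketch, dropping the species and taxon terms (weights `(1.0, 0.0, 0.0)`) corresponds to the non-hierarchical, ID-only setup from the caption, while the full sum corresponds to the variant with the two additional projectors.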