An Automated Pipeline for Few-Shot Bird Call Classification: A Case Study with the Tooth-Billed Pigeon

Abhishek Jana, Moeumu Uili, James Atherton, Mark O'Brien, Joe Wood, Leandra Brickson

TL;DR

This work tackles the challenge of detecting extremely rare bird species with minimal labeled data by combining embeddings from large, publicly trained classifiers with a cosine-similarity-based one-shot classifier. It introduces an automated pipeline with filtering and denoising preprocessing, embedding-space model selection via clustering metrics, and recall-focused thresholding. The pipeline is validated on a simulated five-species Eastern Towhee dataset and on real-world Tooth-Billed Pigeon (TBP) recordings, achieving recall 1.0 and accuracy 0.951 on the TBP test. The approach is practical for conservation: it enables detection from as little as a single recording, and it is released as open-source tooling for field deployment. Future work includes integrating source separation techniques and exploring alternative embedding spaces and distance metrics to further improve precision and generalization.

Abstract

This paper presents an automated one-shot bird call classification pipeline designed for rare species absent from large publicly available classifiers like BirdNET and Perch. While these models excel at detecting common birds with abundant training data, they offer no option for species with only 1-3 known recordings, a critical limitation for conservationists monitoring the last remaining individuals of endangered birds. To address this, we leverage the embedding space of large bird classification networks and develop a classifier using cosine similarity, combined with filtering and denoising preprocessing techniques, to optimize detection with minimal training data. We evaluate various embedding spaces using clustering metrics and validate our approach in both a simulated scenario with Xeno-Canto recordings and a real-world test on the critically endangered tooth-billed pigeon (Didunculus strigirostris), which has no existing classifiers and only three confirmed recordings. The final model achieved 1.0 recall and 0.95 accuracy in detecting tooth-billed pigeon calls, making it practical for use in the field. This open-source system gives conservationists a tool for detecting and monitoring rare species on the brink of extinction.
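
The cosine-similarity classification step described in the abstract can be illustrated with a minimal sketch. It assumes that embeddings have already been extracted from a pretrained network as fixed-length vectors. The function names (`cosine_similarity`, `recall_focused_threshold`, `classify`) are hypothetical, and the threshold rule shown here (the lowest similarity among known positive examples) is just one simple recall-oriented choice, not necessarily the rule used by the authors.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return float(np.dot(a, b) / denom)


def recall_focused_threshold(positive_scores, margin=0.0):
    """Assumed rule: accept every known positive by using the lowest
    positive similarity (minus an optional safety margin) as the cutoff."""
    return float(np.min(positive_scores)) - margin


def classify(embedding, reference, threshold):
    """Flag a clip as the target species if it is similar enough
    to the reference (species) embedding."""
    return cosine_similarity(embedding, reference) >= threshold


if __name__ == "__main__":
    # Toy 3-dimensional vectors standing in for real network embeddings.
    reference = [1.0, 0.0, 0.0]
    known_positives = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]
    scores = [cosine_similarity(p, reference) for p in known_positives]
    threshold = recall_focused_threshold(scores)

    candidate = [0.1, 0.9, 0.2]
    print("threshold:", threshold)
    print("candidate detected:", classify(candidate, reference, threshold))
```

Choosing the cutoff from the weakest known positive favors recall over precision, which matches the stated priority of not missing calls from a species with very few individuals; any false positives would then need manual review.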

Paper Structure

This paper contains 18 sections, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Overview of the classification pipeline. The upper section illustrates the development process, where key parameters such as duration, frequency range, and threshold are defined, along with the species embedding for the target species. The lower section demonstrates how the finalized model is applied on field data to detect calls from the species of interest.
  • Figure 2: Comparison of raw (left), bandpass filtered (center), and denoised (right) spectrograms of a TBP bird call, shown on a (0-80) dB scale. The bandpassed spectrogram isolates the target frequency band, while the denoised spectrogram highlights essential call features.
  • Figure 3: Average spectrograms of vocalizations from the tooth-billed pigeon (left) and the Pacific imperial pigeon (right), the species in our dataset whose call is most similar to that of the tooth-billed pigeon.
  • Figure 4: Scatter plot matrix of the top 5 principal components of the Perch model embeddings.
  • Figure 5: Spectrogram of the test dataset after preprocessing and applying the event detector. The red-highlighted regions indicate the detected vocalization events.