WildlifeReID-10k: Wildlife re-identification dataset with 10k individual animals

Lukáš Adam, Vojtěch Čermák, Kostas Papafitsoros, Lukas Picek

TL;DR

WildlifeReID-10k addresses the re-identification of individual animals in the wild by compiling a large, diverse benchmark and introducing leakage-resistant evaluation protocols. It combines time-aware splits, used when timestamps exist, with similarity-aware clustering that defines encounters, thereby preventing training-to-test data leakage. The paper provides closed-set and open-set baselines across CNNs and transformers, evaluates them with multiple metrics, including a geometric mean of known-class and unknown-class performance, and uses foundation-model experiments to demonstrate the need for robust splits. By releasing the dataset on Kaggle and extending the WildlifeDatasets framework, WildlifeReID-10k offers a standardized, fair platform for measuring progress in animal re-identification and its ecological applications.

Abstract

This paper introduces WildlifeReID-10k, a new large-scale re-identification benchmark with more than 10k animal identities of around 33 species across more than 140k images, re-sampled from 37 existing datasets. WildlifeReID-10k covers diverse animal species and poses significant challenges for state-of-the-art methods, ensuring fair and robust evaluation through its time-aware and similarity-aware split protocol. The latter is designed to address the common issue of training-to-test data leakage caused by visually similar images appearing in both training and test sets. The WildlifeReID-10k dataset and benchmark are publicly available on Kaggle, along with strong baselines for both closed-set and open-set evaluation, enabling fair, transparent, and standardized evaluation of animal re-identification models, not only multi-species ones.
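
The similarity-aware split can be illustrated with a minimal sketch. It assumes that each image is already represented by a feature vector, that "visually similar" means cosine similarity of at least a threshold `theta`, and that clusters are the connected components of the resulting similarity graph. The function names, the clustering rule, and the greedy cluster assignment are hypothetical choices made for illustration, not the procedure from the paper.

```python
import numpy as np


def similarity_clusters(features: np.ndarray, theta: float) -> np.ndarray:
    """Group images whose cosine similarity is at least `theta`.

    Clusters are connected components of the thresholded similarity graph
    (an assumption of this sketch). Feature vectors must be non-zero.
    """
    # L2-normalise so that the dot product equals the cosine similarity.
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    similar = unit @ unit.T >= theta

    parent = list(range(len(features)))

    def find(i: int) -> int:
        # Walk up to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Merge every pair of images that passes the threshold (upper triangle only).
    for i, j in zip(*np.nonzero(np.triu(similar, k=1))):
        parent[find(int(i))] = find(int(j))

    return np.array([find(i) for i in range(len(features))])


def split_by_cluster(clusters: np.ndarray, test_fraction: float, seed: int = 0) -> np.ndarray:
    """Move whole clusters into the test set until it holds about `test_fraction` of the images."""
    rng = np.random.default_rng(seed)
    is_test = np.zeros(len(clusters), dtype=bool)
    target = test_fraction * len(clusters)
    for cluster_id in rng.permutation(np.unique(clusters)):
        if is_test.sum() >= target:
            break
        is_test[clusters == cluster_id] = True
    return is_test


# Toy features: images 0 and 1 are near-duplicates, as are images 2 and 3.
features = np.array([[1.0, 0.0], [0.99, 0.05], [0.0, 1.0], [0.02, 1.0], [-1.0, 0.0]])
clusters = similarity_clusters(features, theta=0.95)
is_test = split_by_cluster(clusters, test_fraction=0.4)
```

Because a cluster is never divided, near-duplicate images always end up on the same side of the split, which is the property the protocol relies on. In practice such a split would likely be computed separately for each individual so that every identity can still appear in both sets; that detail is omitted here.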

Paper Structure (11 sections, 5 equations, 6 figures, 6 tables, 1 algorithm)

Figures (6)

  • Figure 1: WildlifeReID-10k splitting methodology. Data collected in a single encounter is usually split randomly and might be corrupted by unwanted training-to-test set leakage. Therefore, we employ (i) a time-aware split if timestamps are available, and (ii) a similarity-aware split, where visually similar images are treated as a single observation which is assigned to either the training, validation, or test set.
  • Figure 2: Number of images per individual in WildlifeReID-10k.
  • Figure 3: Size of clusters and their impurity based on $\theta$.
  • Figure 4: Clusters found (columns) in the LeopardID2022 dataset. Most clusters consist of almost identical images with only small differences in size or the leopard's head position.
  • Figure 5: Performance difference between the proposed time-aware or similarity-aware split (orange) and the random split (blue) for datasets with timestamps (solid lines) and without timestamps (dotted lines). Each point on a curve corresponds to a different threshold $t_{\rm new}$. This figure suggests that the random split artificially inflates the measured performance of methods.
  • ...and 1 more figure
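
The role of the threshold $t_{\rm new}$ in Figure 5, together with the geometric mean of known-class and unknown-class performance mentioned in the TL;DR, can be sketched as follows. The sketch assumes nearest-neighbour matching on a query-by-gallery similarity matrix, rejection of a query as a new individual when its best similarity falls below `t_new`, and plain (unbalanced) accuracy on each group. The function names and the `"new"` label are hypothetical, and the paper's exact metric definitions may differ.

```python
import numpy as np


def open_set_predict(similarity, gallery_labels, t_new, new_label="new"):
    """Predict the label of the most similar gallery image, or `new_label`
    when the best similarity is below the threshold `t_new`."""
    similarity = np.asarray(similarity, dtype=float)
    best = similarity.argmax(axis=1)
    best_score = similarity[np.arange(len(similarity)), best]
    # Fancy indexing returns a copy, so the gallery labels are not modified.
    predictions = np.asarray(gallery_labels, dtype=object)[best]
    predictions[best_score < t_new] = new_label
    return predictions


def known_unknown_geometric_mean(y_true, y_pred, new_label="new"):
    """Geometric mean of accuracy on known and on unknown individuals.

    Assumes both groups are non-empty; otherwise the result is NaN.
    """
    y_true = np.asarray(y_true, dtype=object)
    y_pred = np.asarray(y_pred, dtype=object)
    unknown = y_true == new_label
    acc_known = np.mean(y_pred[~unknown] == y_true[~unknown])
    acc_unknown = np.mean(y_pred[unknown] == new_label)
    return float(np.sqrt(acc_known * acc_unknown))


# Four query images compared against three gallery images.
gallery_labels = ["a", "b", "c"]
similarity = [
    [0.90, 0.20, 0.10],  # confident match to "a"
    [0.30, 0.80, 0.20],  # confident match to "b"
    [0.40, 0.30, 0.35],  # no strong match, rejected as new
    [0.20, 0.60, 0.10],  # new individual that resembles "b"
]
y_true = ["a", "b", "new", "new"]

y_pred = open_set_predict(similarity, gallery_labels, t_new=0.5)
score = known_unknown_geometric_mean(y_true, y_pred)
```

Raising `t_new` rejects more queries as new individuals, which helps unknown-class accuracy at the expense of known-class accuracy. The geometric mean rewards a balance between the two, since it drops to zero if either accuracy is zero.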