Adaptive High-Frequency Transformer for Diverse Wildlife Re-Identification

Chenyue Li, Shuoyi Chen, Mang Ye

TL;DR

This work proposes the Adaptive High-Frequency Transformer, a unified, general-purpose framework for multi-species wildlife ReID, and introduces an object-aware high-frequency selection strategy that adaptively captures the more valuable high-frequency components.

Abstract

Wildlife ReID involves utilizing visual technology to identify specific individuals of wild animals in different scenarios, holding significant importance for wildlife conservation, ecological research, and environmental monitoring. Existing wildlife ReID methods are predominantly tailored to specific species, exhibiting limited applicability. Although some approaches leverage extensively studied person ReID techniques, they struggle to address the unique challenges posed by wildlife. Therefore, in this paper, we present a unified, multi-species general framework for wildlife ReID. Given that high-frequency information is a consistent representation of unique features in various species, significantly aiding in identifying contours and details such as fur textures, we propose the Adaptive High-Frequency Transformer model with the goal of enhancing high-frequency information learning. To mitigate the inevitable high-frequency interference in the wilderness environment, we introduce an object-aware high-frequency selection strategy to adaptively capture more valuable high-frequency components. Notably, we unify the experimental settings of multiple wildlife datasets for ReID, achieving superior performance over state-of-the-art ReID methods. In domain generalization scenarios, our approach demonstrates robust generalization to unknown species.
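
To make the notion of "high-frequency information" concrete, the following is a minimal sketch of a generic Fourier-domain high-pass filter. It is not the paper's Adaptive High-Frequency Transformer or its object-aware selection strategy, neither of which is specified in this summary; the function name `high_frequency_component` and the cutoff parameter `radius` are illustrative assumptions.

```python
import numpy as np


def high_frequency_component(image, radius):
    """Keep only the frequencies farther than `radius` from the spectrum centre.

    `image` is a 2-D grayscale array. `radius` is a hypothetical cutoff in
    frequency bins, chosen here for illustration only.
    """
    h, w = image.shape
    # Move the zero-frequency (DC) term to index (h // 2, w // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    yy, xx = np.ogrid[:h, :w]
    dist = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    # Zero out the low-frequency disk; what remains is edges and fine texture.
    filtered = spectrum * (dist > radius)
    # The mask is symmetric about the centre, so the inverse is real up to rounding.
    return np.real(np.fft.ifft2(np.fft.ifftshift(filtered)))
```

Under this sketch, a uniform image maps to zeros because it contains only the DC term, while fine alternating patterns such as fur-like texture pass through largely unchanged.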

Paper Structure

This paper contains 12 sections, 10 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: By capturing discriminative features such as texture, contour, and fine details, high-frequency information displays unique features specific to each wild species, playing a crucial role in universal wildlife re-identification.
  • Figure 2: The architecture of our proposed method, consisting of (a) Frequency-Domain Mixed Augmentation (described in Sec. \ref{FMA}), (b) Object-Aware Dynamic Selection (described in Sec. \ref{OMS}), and (c) Feature Equilibrium Loss (described in Sec. \ref{dgc}).
  • Figure 3: Parameter evaluation. mAP results for varying $\mu$ are compared across several datasets. Different weights $\lambda$ are analyzed on the Panda dataset.
  • Figure 4: Attention maps for the class token from the last self-attention layer. [Base] denotes the baseline. [Pure HF] denotes the Pure High-Frequency Augmentation detailed in Table \ref{ablation}. [Our] denotes the method proposed in this paper.