Feature-Aware Noise Contrastive Learning for Unsupervised Red Panda Re-Identification

Jincheng Zhang, Qijun Zhao, Tie Liu

TL;DR

This work tackles unsupervised re-identification (re-ID) of red pandas by introducing FANCL, a dual-branch framework that processes both original images and feature-aware noised images. A Feature-Aware Noise Addition module generates challenging perturbed views, and cluster-level and consistency-based contrastive losses, backed by a memory bank, drive the learning of robust, discriminative representations without labels. The stated contributions are the first application of unsupervised learning (USL) to animal re-ID, a noise-augmentation strategy guided by feature activations, and a joint loss that combines cluster-level and instance-level consistency to approach supervised-level performance. The approach performs strongly on a red panda dataset covering indoor and outdoor settings, which suggests practical potential for scalable wildlife monitoring and identification under challenging real-world conditions.

Abstract

To facilitate the re-identification (re-ID) of individual animals, existing methods primarily focus on maximizing feature similarity within the same individual and enhancing distinctiveness between different individuals. However, most of them still rely on supervised learning and require substantial labeled data, which is challenging to obtain. To avoid this issue, we propose the Feature-Aware Noise Contrastive Learning (FANCL) method to explore an unsupervised learning solution, which is then validated on the task of red panda re-ID. FANCL uses a Feature-Aware Noise Addition module to produce noised images that conceal critical features, and employs two contrastive learning modules to calculate the losses. First, a feature consistency module bridges the gap between the original and noised features. Second, the network is trained through a cluster contrastive learning module. Through these more challenging learning tasks, FANCL can adaptively extract deeper representations of red pandas. Experimental results on a set of red panda images collected in both indoor and outdoor environments show that FANCL outperforms several related state-of-the-art unsupervised methods and achieves performance comparable to supervised learning methods.
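
The noise-addition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `feature_aware_noise`, the top-`ratio` selection rule, the Gaussian noise, and the assumption that the activation map has the same resolution as the image are all illustrative choices that the abstract does not specify.

```python
import random


def feature_aware_noise(image, activation, ratio=0.3, sigma=1.0, seed=0):
    """Replace the most strongly activated pixels of `image` with Gaussian noise.

    image, activation: 2D lists of floats with identical shape (assumption).
    ratio: fraction of pixels to conceal (hypothetical parameter).
    Returns (noised_image, mask); mask[i][j] is True where noise was written.
    """
    h, w = len(image), len(image[0])
    rng = random.Random(seed)

    # Rank pixel coordinates by activation, highest first.
    coords = sorted(
        ((i, j) for i in range(h) for j in range(w)),
        key=lambda ij: activation[ij[0]][ij[1]],
        reverse=True,
    )
    k = int(round(ratio * h * w))

    noised = [row[:] for row in image]  # copy so the input stays untouched
    mask = [[False] * w for _ in range(h)]
    for i, j in coords[:k]:
        noised[i][j] = rng.gauss(0.0, sigma)  # conceal the salient pixel
        mask[i][j] = True
    return noised, mask


if __name__ == "__main__":
    img = [[0.5] * 4 for _ in range(4)]
    act = [[0.0] * 4 for _ in range(4)]
    act[0][0], act[0][1], act[1][0], act[1][1] = 0.9, 0.8, 0.7, 0.6
    noised, mask = feature_aware_noise(img, act, ratio=0.25)
    for row in mask:
        print(row)
```

In this toy call the four high-activation pixels in the top-left corner are the ones replaced, so a network fed the noised view has to rely on less salient regions.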

Paper Structure (15 sections, 10 equations, 5 figures, 3 tables, 1 algorithm)

This paper contains 15 sections, 10 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: The top row shows one red panda in different poses and backgrounds. The bottom row visualizes the corresponding Grad-CAM (Selvaraju et al., 2017) attention maps obtained from a ResNet50 (He et al., 2016) baseline model trained in a supervised manner.
  • Figure 2: The proposed framework consists of four modules: Feature-Aware Noise Addition, Forward Initialization, Clustering, and Contrastive Loss Calculation. (a) The Feature-Aware Noise Addition module produces a noised image by selecting feature-aware regions of the input image according to the activation map. (b) The Clustering module assigns pseudo-labels by clustering the original features of the unlabeled dataset. The Forward Initialization module extracts features for each input as well as the fused features. Finally, the model is trained with the cluster contrastive learning loss and the consistency contrastive learning loss.
  • Figure 3: Panels (a) and (b) show how the model's mAP and Rank-1 accuracy, respectively, change as the proportion of the noised region increases.
  • Figure 4: Visualization of the Feature-Aware Noise Addition (FANA) module. (a) Original image. (b) and (e) Activation maps of (a) at the convolution layer and the batch normalization layer, respectively. (c) and (f) Feature-aware regions selected based on the activation maps. (d) and (g) Noised images obtained by adding noise to the corresponding regions in (a).
  • Figure 5: Visualization of the retrieval results for example query red panda images by our proposed method. The green and red bounding boxes indicate correct and incorrect matches, respectively.
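
The training objective outlined in the Figure 2 caption can be sketched roughly as follows. This is an illustration under stated assumptions rather than the paper's formulation: both losses are written as InfoNCE terms, the memory bank is assumed to hold one L2-normalized centroid per pseudo-label cluster, and the names `info_nce`, `cluster_contrastive_loss`, `consistency_contrastive_loss`, `update_memory`, as well as the `temperature` and `momentum` values, are hypothetical.

```python
import math


def l2_normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(query, keys, positive_index, temperature=0.05):
    """Negative log-softmax of the positive key's similarity over all keys."""
    logits = [dot(query, k) / temperature for k in keys]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[positive_index]


def cluster_contrastive_loss(feature, pseudo_label, memory_bank, temperature=0.05):
    """Pull a feature toward its own cluster centroid, away from the others."""
    return info_nce(l2_normalize(feature), memory_bank, pseudo_label, temperature)


def consistency_contrastive_loss(orig_feats, noised_feats, temperature=0.05):
    """Match each original feature to the noised feature of the same image."""
    keys = [l2_normalize(f) for f in noised_feats]
    losses = [
        info_nce(l2_normalize(f), keys, i, temperature)
        for i, f in enumerate(orig_feats)
    ]
    return sum(losses) / len(losses)


def update_memory(memory_bank, pseudo_label, feature, momentum=0.2):
    """Momentum update of one centroid in place, followed by re-normalization."""
    q = l2_normalize(feature)
    old = memory_bank[pseudo_label]
    memory_bank[pseudo_label] = l2_normalize(
        [momentum * o + (1.0 - momentum) * x for o, x in zip(old, q)]
    )


if __name__ == "__main__":
    bank = [[1.0, 0.0], [0.0, 1.0]]  # two toy cluster centroids
    print(cluster_contrastive_loss([2.0, 0.0], 0, bank))  # close to zero
    print(cluster_contrastive_loss([2.0, 0.0], 1, bank))  # large
```

A combined objective would presumably sum the two terms, possibly with a weighting factor; the weighting is not stated in the text shown here, so it is left out of the sketch.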