Frequency Domain Nuances Mining for Visible-Infrared Person Re-identification

Yukang Zhang, Yang Lu, Yan Yan, Hanzi Wang, Xuelong Li

TL;DR

This work tackles the visible-infrared person re-identification problem by exploiting frequency-domain information to bridge VIS and IR modalities. It introduces Frequency Domain Nuances Mining (FDNM), which comprises an Amplitude Guided Phase (AGP) module and an Amplitude Nuances Mining (ANM) module, plus a center-guided nuances mining loss to preserve discriminative identity information while discovering cross-modality nuances. The approach achieves state-of-the-art results on SYSU-MM01, RegDB, and LLCM, and demonstrates strong generalization to VIS-IR face recognition, highlighting the practical impact of frequency-domain representations for cross-modality matching. Overall, FDNM establishes a principled framework for jointly leveraging amplitude and phase information to reduce modality gaps and improve re-identification performance.

Abstract

The key to visible-infrared person re-identification (VIReID) lies in minimizing the modality discrepancy between visible and infrared images. Existing methods mainly exploit spatial information while ignoring discriminative frequency information. To address this issue, this paper aims to reduce the modality discrepancy from the frequency domain perspective. Specifically, we propose a novel Frequency Domain Nuances Mining (FDNM) method to explore cross-modality frequency domain information, which mainly includes an amplitude guided phase (AGP) module and an amplitude nuances mining (ANM) module. These two modules are mutually beneficial and jointly explore frequency domain visible-infrared nuances, thereby effectively reducing the modality discrepancy in the frequency domain. In addition, we propose a center-guided nuances mining loss to encourage the ANM module to preserve discriminative identity information while discovering diverse cross-modality nuances. Extensive experiments show that the proposed FDNM has significant advantages in improving the performance of VIReID. Specifically, our method outperforms the second-best method by 5.2% in Rank-1 accuracy and 5.8% in mAP on the SYSU-MM01 dataset under the indoor search mode. We also validate the effectiveness and generalization of our method on the challenging visible-infrared face recognition task. The code will be available.

Paper Structure

This paper contains 19 sections, 9 equations, 7 figures, and 5 tables.

Figures (7)

  • Figure 1: Decomposition and reconstruction of the VIS and IR images in the frequency domain. (a) The VIS and IR images; (b) the amplitude and phase components of the VIS and IR images in the frequency domain; (c) the images reconstructed after swapping the amplitude and phase components of the VIS and IR images; (d) the VIS and IR images reconstructed from the phase component only; (e) the VIS and IR images reconstructed from the amplitude component only.
  • Figure 2: t-SNE [laurens2008Visualizing] visualization of the VIS and IR features in the spatial domain and the frequency domain. Samples with the same color are from the same person. It is apparent that the spatial features and the features of the amplitude component are discriminative while the features of the phase component contain key missing information for the VIReID task.
  • Figure 3: Overview of the proposed FDNM framework for the VIReID task. The proposed FDNM mainly includes an amplitude guided phase (AGP) module and an amplitude nuances mining (ANM) module to reduce the modality discrepancy between VIS and IR images in the frequency domain. After the GAP layer, we propose a center-guided nuances mining loss $\mathcal{L}_{cnm}$ to encourage the proposed ANM module to preserve discriminative identity information while discovering diverse cross-modality nuances.
  • Figure 4: Illustration of the proposed center-guided nuances mining loss, which is used to preserve discriminative identity information while discovering diverse cross-modality nuances.
  • Figure 5: Comparisons of different $\lambda_2$ and $m$ values on the SYSU-MM01 dataset.
  • ...and 2 more figures
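
The decomposition and reconstruction described in the Figure 1 caption can be sketched with a 2-D FFT. The snippet below is a minimal illustration, not the authors' code: the helper names `decompose` and `reconstruct` are hypothetical, random arrays stand in for real VIS and IR images, and the "phase only" and "amplitude only" variants assume a common convention (constant amplitude and zero phase, respectively) that the summary does not confirm.

```python
import numpy as np

def decompose(img):
    """Return (amplitude, phase) of the 2-D FFT of a single-channel image."""
    spectrum = np.fft.fft2(img)
    return np.abs(spectrum), np.angle(spectrum)

def reconstruct(amplitude, phase):
    """Rebuild a spatial image from amplitude * exp(i * phase)."""
    spectrum = amplitude * np.exp(1j * phase)
    # The result is real up to rounding error, so keep only the real part.
    return np.real(np.fft.ifft2(spectrum))

# Random stand-ins for grayscale VIS and IR images (assumption: 64x32 pixels).
rng = np.random.default_rng(0)
vis = rng.random((64, 32))
ir = rng.random((64, 32))

amp_vis, pha_vis = decompose(vis)
amp_ir, pha_ir = decompose(ir)

# Panel (c): swap amplitude and phase between the two modalities.
vis_amp_ir_pha = reconstruct(amp_vis, pha_ir)
ir_amp_vis_pha = reconstruct(amp_ir, pha_vis)

# Panel (d): phase only, assuming the amplitude is replaced by a constant.
phase_only_vis = reconstruct(np.ones_like(amp_vis), pha_vis)

# Panel (e): amplitude only, assuming the phase is set to zero.
amp_only_vis = reconstruct(amp_vis, np.zeros_like(pha_vis))

# Sanity check: using an image's own amplitude and phase recovers it.
roundtrip_error = np.max(np.abs(reconstruct(amp_vis, pha_vis) - vis))
```

With real images, the swapped reconstructions typically keep the spatial layout of whichever image supplied the phase, which is why the two components are treated separately.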