Spectral Enhancement and Pseudo-Anchor Guidance for Infrared-Visible Person Re-Identification

Yiyuan Ge, Zhihao Chen, Ziyang Wang, Jiaju Kang, Mingya Zhang

TL;DR

This paper tackles visible-infrared person re-identification (VI-ReID) by bridging the large spectral gap between infrared and visible images. It introduces SEPG-Net, which combines a spectral enhancement pipeline that converts visible RGB into Semantically Enhanced Grey (SEG) images using greyscale transformation and frequency-domain phase cues, with a weight-shareable dual-stream network and a Pseudo Anchor-Guided Bidirectional Aggregation (PABA) loss to align cross-modality features while preserving discriminative identity information. The PABA loss enables fine-grained, bidirectional cross-modality aggregation by leveraging pseudo-anchors within modality-specific feature chunks. Empirical results on RegDB and SYSU-MM01 show that SEPG-Net achieves state-of-the-art performance, with ablations confirming the beneficial impact of both spectral enhancement and the PABA loss. The approach provides a simple, effective alternative to GAN-based transformations for cross-spectral alignment, with practical implications for robust 24-hour VI-ReID systems.

Abstract

The development of deep learning has facilitated the application of person re-identification (ReID) technology in intelligent security. Visible-infrared person re-identification (VI-ReID) aims to match pedestrians across infrared and visible images, enabling 24-hour surveillance. Current studies rely on unsupervised modality transformations and inefficient embedding constraints to bridge the spectral differences between infrared and visible images, which limits their potential performance. To address these limitations, this paper introduces a simple yet effective Spectral Enhancement and Pseudo-anchor Guidance Network, named SEPG-Net. Specifically, we propose a more homogeneous spectral enhancement scheme based on frequency-domain information and greyscale space, which avoids the information loss typically caused by inefficient modality transformations. Further, a Pseudo Anchor-guided Bidirectional Aggregation (PABA) loss is introduced to bridge local modality discrepancies while better preserving discriminative identity embeddings. Experimental results on two public benchmark datasets demonstrate the superior performance of SEPG-Net over other state-of-the-art methods. The code is available at https://github.com/1024AILab/ReID-SEPG.
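The idea of pulling each modality towards pseudo-anchors from the other modality, in both directions, can be illustrated with a small NumPy sketch. This is not the PABA loss as defined in the paper: the batch layout (visible half followed by infrared half), the use of per-identity mean features as pseudo-anchors, the squared Euclidean distance, and all function names are assumptions made purely for illustration.

```python
import numpy as np

def pseudo_anchors(features, labels):
    """Assumed pseudo-anchor: the mean feature of each identity, as {label: (D,) vector}."""
    return {int(l): features[labels == l].mean(axis=0) for l in np.unique(labels)}

def one_way_aggregation(features, labels, anchors):
    """Mean squared distance from each sample to the given anchor of its own identity."""
    # Raises KeyError if an identity has no anchor, i.e. it is missing in the other modality.
    target = np.stack([anchors[int(l)] for l in labels])
    return float(np.mean(np.sum((features - target) ** 2, axis=1)))

def bidirectional_anchor_loss(batch_features, batch_labels):
    """Hypothetical layout: first half of the batch is visible, second half is infrared."""
    vis, ir = np.split(batch_features, 2)      # modality-specific chunks
    vis_y, ir_y = np.split(batch_labels, 2)
    # Visible samples are pulled towards infrared anchors, and vice versa.
    vis_to_ir = one_way_aggregation(vis, vis_y, pseudo_anchors(ir, ir_y))
    ir_to_vis = one_way_aggregation(ir, ir_y, pseudo_anchors(vis, vis_y))
    return vis_to_ir + ir_to_vis
```

In this simplified form the loss is zero only when, for every identity, the features of one modality coincide with the mean feature of the other. The paper's formulation operates on local discrepancies and may differ in distance measure, weighting, and chunking.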

Paper Structure

This paper contains 11 sections, 9 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: (a) Visible and infrared images are captured by visible cameras and dedicated infrared cameras, respectively (example images from the RegDB dataset [b20]). (b) A brief illustration of the VI-ReID task (example images taken from the SYSU-MM01 dataset [b21]).
  • Figure 2: The overall architecture of the proposed SEPG-Net. First, we reduce the spectral discrepancies between the RGB and IR images through channel transformation and frequency-domain enhancement, which yields SEG images. Then, we utilise a dual-stream weight-shareable network to extract features. Finally, identity loss and PABA loss are employed to supervise the whole learning process.
  • Figure 3: (a) We perform spectral feature enhancement by converting a visible (RGB) image into an SEG image, whose style is similar to that of an infrared (IR) image. (b) Details of SEG image generation.
  • Figure 4: t-SNE feature visualisation of baseline (a) and SEPG-Net (b) on SYSU-MM01. Blue and orange represent the infrared and visible modalities, respectively.
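
The spectral enhancement step described for Figures 2 and 3 (greyscale conversion combined with frequency-domain phase cues) can be sketched roughly as follows. The exact SEG formulation is not given in this summary, so the BT.601 luma weights, the phase-only reconstruction, the blending factor `alpha`, and the function names are all assumptions; this is a minimal illustration rather than the authors' pipeline.

```python
import numpy as np

def rgb_to_grey(rgb):
    """Greyscale via ITU-R BT.601 luma weights; rgb is a float array (H, W, 3) in [0, 1]."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights

def phase_only(grey):
    """Rebuild an image from its Fourier phase alone (amplitude forced to 1), rescaled to [0, 1]."""
    spectrum = np.fft.fft2(grey)
    phase = np.angle(spectrum)
    recon = np.fft.ifft2(np.exp(1j * phase)).real
    span = recon.max() - recon.min()
    return (recon - recon.min()) / span if span > 0 else np.zeros_like(recon)

def seg_like(rgb, alpha=0.5):
    """Assumed blend of the greyscale image with its phase-only structure map."""
    grey = rgb_to_grey(rgb)
    enhanced = (1.0 - alpha) * grey + alpha * phase_only(grey)
    # Replicate to three channels so a standard RGB backbone could consume it (an assumption).
    return np.repeat(enhanced[..., None], 3, axis=-1)
```

The phase spectrum carries most of an image's edge and contour structure, which is largely shared between visible and infrared views of the same person. That is the motivation for blending it into a colour-free greyscale image in this sketch.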