Multi-Memory Matching for Unsupervised Visible-Infrared Person Re-Identification

Jiangming Shi, Xiangbo Yin, Yeyun Chen, Yachao Zhang, Zhizhong Zhang, Yuan Xie, Yanyun Qu

TL;DR

This work tackles unsupervised visible–infrared person re-identification (USL-VI-ReID) by exposing reliability gaps in cross-modality pseudo-labels and correspondences. It proposes Multi-Memory Matching (MMM), which comprises three modules: Cross-Modality Clustering (CMC), which generates joint intra- and inter-modality pseudo-labels; Multi-Memory Learning and Matching (MMLM), which exploits multi-memory representations and a bipartite matching process; and Soft Cluster-level Alignment (SCA), which narrows the modality gap with noise-robust, soft alignments. The approach introduces the Adjusted Rand Index (ARI) as a reliability metric and demonstrates state-of-the-art performance on SYSU-MM01 and RegDB, along with extensive ablations and hyper-parameter analyses. The work advances practical USL-VI-ReID by enabling more faithful cross-modality correspondences; the authors state that the source code will be released.

Abstract

Unsupervised visible-infrared person re-identification (USL-VI-ReID) is a promising yet challenging retrieval task. The key challenges in USL-VI-ReID are to effectively generate pseudo-labels and to establish pseudo-label correspondences across modalities without relying on any prior annotations. Recently, clustered pseudo-label methods have gained more attention in USL-VI-ReID. However, previous methods fell short of fully exploiting individual nuances: they used a single memory to represent each identity when establishing cross-modality correspondences, which made those correspondences ambiguous. To address the problem, we propose a Multi-Memory Matching (MMM) framework for USL-VI-ReID. We first design a Cross-Modality Clustering (CMC) module to generate the pseudo-labels by clustering samples from both modalities together. To associate cross-modality clustered pseudo-labels, we design a Multi-Memory Learning and Matching (MMLM) module, ensuring that optimization explicitly focuses on the nuances of individual perspectives and establishes reliable cross-modality correspondences. Finally, we design a Soft Cluster-level Alignment (SCA) module to narrow the modality gap while mitigating the effect of noisy pseudo-labels through a soft many-to-many alignment strategy. Extensive experiments on the public SYSU-MM01 and RegDB datasets demonstrate the reliability of the established cross-modality correspondences and the effectiveness of our MMM. The source code will be released.
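
To make the multi-memory matching idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes each cluster is summarized by a small list of memory vectors, that the cost between a visible cluster and an infrared cluster is the smallest Euclidean distance between any pair of their memories (an assumption; the summary above does not specify the cost), and that correspondences come from a minimum-cost one-to-one assignment. The names `cluster_cost` and `match_clusters` are hypothetical. The assignment is found by brute force over permutations, which is only viable for a handful of clusters; a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice at realistic scale.

```python
from itertools import permutations
from math import dist, inf


def cluster_cost(memories_a, memories_b):
    """Assumed cost: smallest Euclidean distance between any two memories."""
    return min(dist(a, b) for a in memories_a for b in memories_b)


def match_clusters(visible, infrared):
    """Match each visible cluster to a distinct infrared cluster.

    Each argument is a list of clusters; each cluster is a list of memory
    vectors (tuples of floats). Returns (matches, cost), where matches is a
    list of (visible_index, infrared_index) pairs with minimum total cost.
    """
    if len(visible) > len(infrared):
        raise ValueError("expected no more visible than infrared clusters")
    cost = [[cluster_cost(v, r) for r in infrared] for v in visible]
    best_total, best_perm = inf, ()
    # Brute-force search over all one-to-one assignments (tiny inputs only).
    for perm in permutations(range(len(infrared)), len(visible)):
        total = sum(cost[i][j] for i, j in enumerate(perm))
        if total < best_total:
            best_total, best_perm = total, perm
    return list(enumerate(best_perm)), cost


# Toy data: two memories per cluster, 2-D features for readability.
visible_clusters = [
    [(0.0, 0.0), (1.0, 0.0)],    # visible cluster 0
    [(5.0, 5.0), (6.0, 5.0)],    # visible cluster 1
]
infrared_clusters = [
    [(5.2, 5.1), (9.0, 9.0)],    # infrared cluster 0
    [(0.9, 0.1), (-3.0, -3.0)],  # infrared cluster 1
]
matches, cost_matrix = match_clusters(visible_clusters, infrared_clusters)
print(matches)
```

In this toy case, visible cluster 0 is paired with infrared cluster 1 and visible cluster 1 with infrared cluster 0, because one memory in each pair lies close to a memory of its partner even though the other memories are far apart. Keeping several memories per cluster is what lets such a partial overlap drive the match.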

Paper Structure (18 sections, 26 equations, 5 figures, 2 tables)

Figures (5)

  • Figure 1: Comparison of different methods on ARI. ARI denotes the Adjusted Rand Index, a similarity measure between two clusterings. The ALL category reports the ARI of the overall pseudo-labels, composed of visible and infrared pseudo-labels, and serves as a metric for evaluating the reliability of cross-modality correspondences.
  • Figure 2: The pipeline of MMM. Different colors indicate different persons; $\bigcirc$ and $\bigtriangleup$ indicate visible and infrared features, respectively. It contains the Cross-Modality Clustering module (Baseline, described in Sec. \ref{CMC}) and two key novel components: Multi-Memory Learning and Matching (MMLM, described in Sec. \ref{MMLM}) and Soft Cluster-level Alignment (SCA, described in Sec. \ref{SCA}).
  • Figure 3: The effect of the hyper-parameters $n$, $\lambda_{Intra}$, and $\lambda_{Inter}$ at different values on SYSU-MM01.
  • Figure 4: The intra-identity and inter-identity distances on SYSU-MM01, where $\delta_{i}$ denotes the gap between the intra-identity distance mean and the inter-identity distance mean.
  • Figure 5: Visualization of the pseudo-labels of the same identity in different modalities.
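
The ARI used in Figure 1 can be computed from pair counts alone. The sketch below uses the standard Adjusted Rand Index formula; the function name `adjusted_rand_index` and the toy labels are illustrative assumptions, not taken from the paper. It shows why the ALL category matters: each modality can be clustered perfectly on its own while the cross-modality correspondence is still wrong.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard Adjusted Rand Index between two labelings of the same items."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0
    # Pairs of items that share a cluster in both, true only, or pred only.
    both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    in_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    in_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = in_true * in_pred / total_pairs
    max_index = (in_true + in_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings put every item in one cluster.
        return 1.0
    return (both - expected) / (max_index - expected)


# Toy example: two identities, two samples each per modality.
true_visible, pseudo_visible = [0, 0, 1, 1], [0, 0, 1, 1]
true_infrared, pseudo_infrared = [0, 0, 1, 1], [1, 1, 0, 0]  # IDs swapped

ari_visible = adjusted_rand_index(true_visible, pseudo_visible)
ari_infrared = adjusted_rand_index(true_infrared, pseudo_infrared)
# ALL: concatenate both modalities so cross-modality mismatches are counted.
ari_all = adjusted_rand_index(true_visible + true_infrared,
                              pseudo_visible + pseudo_infrared)
print(ari_visible, ari_infrared, ari_all)
```

Here both per-modality scores are 1.0, since ARI ignores how clusters are named, but the ALL score drops to -1/6 because the swapped infrared labels merge different identities across modalities.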