Exploring Homogeneous and Heterogeneous Consistent Label Associations for Unsupervised Visible-Infrared Person ReID

Lingfeng He, De Cheng, Nannan Wang, Xinbo Gao

TL;DR

The paper tackles unsupervised visible-infrared person re-identification by introducing Modality-Unified Label Transfer (MULT), which enforces both within-modality homogeneous and cross-modality heterogeneous instance-level consistency via learned affinities. It is complemented by Online Cross-memory Label Refinement (OCLR) and Alternative Modality-Invariant Representation Learning (AMIRL), which together stabilize training and enhance cross-modality alignment through memory-bank contrastive learning and online label refinement. The approach derives pseudo-labels that respect fine-grained intra- and inter-modality structure and demonstrates state-of-the-art results on SYSU-MM01 and RegDB without requiring camera labels. The method’s significance lies in its robust cross-modality label learning framework, which can be adapted to other unsupervised cross-domain tasks and improves cross-modality recognition under noisy labeling conditions.

Abstract

Unsupervised visible-infrared person re-identification (USL-VI-ReID) aims to retrieve pedestrian images of the same identity across modalities without annotations. Prior work focuses on establishing cross-modality pseudo-label associations to bridge the modality gap, but it neglects instance-level homogeneous and heterogeneous consistency between the feature space and the pseudo-label space, which results in coarse associations. In response, we introduce a Modality-Unified Label Transfer (MULT) module that simultaneously accounts for both homogeneous and heterogeneous fine-grained instance-level structures, yielding high-quality cross-modality label associations. It models both homogeneous and heterogeneous affinities and uses them to quantify the inconsistency between the pseudo-label space and the feature space, which it then minimizes. MULT ensures that the generated pseudo-labels remain aligned across modalities while preserving structural consistency within each modality. Additionally, a plug-and-play Online Cross-memory Label Refinement (OCLR) module further mitigates the side effects of noisy pseudo-labels while aligning the modalities, and it is coupled with an Alternative Modality-Invariant Representation Learning (AMIRL) framework. Experiments demonstrate that our proposed method outperforms existing state-of-the-art USL-VI-ReID methods, highlighting the advantage of MULT over other cross-modality association methods. Code is available at https://github.com/FranklinLingfeng/code_for_MULT.
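
The idea of quantifying inconsistency between the pseudo-label space and the feature space can be illustrated with a small sketch. This is not the MULT formulation from the paper. It is a generic affinity-weighted label-disagreement cost plus a single affinity-weighted transfer step, and the softmax-over-cosine affinity, the temperature, the toy features, and all function names are assumptions made for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def affinity_matrix(feats_a, feats_b, temperature=0.1):
    """Row-wise softmax over cosine similarities (an assumed affinity choice)."""
    rows = []
    for u in feats_a:
        logits = [cosine(u, v) / temperature for v in feats_b]
        top = max(logits)
        exps = [math.exp(x - top) for x in logits]
        total = sum(exps)
        rows.append([e / total for e in exps])
    return rows

def inconsistency(affinity, labels_a, labels_b):
    """Sum of A[i][j] * ||y_i - y_j||^2: large when similar instances disagree."""
    cost = 0.0
    for i, row in enumerate(affinity):
        for j, a in enumerate(row):
            cost += a * sum((p - q) ** 2 for p, q in zip(labels_a[i], labels_b[j]))
    return cost

def transfer_labels(affinity, labels_b):
    """Each row instance receives the affinity-weighted average of labels_b."""
    n_classes = len(labels_b[0])
    return [[sum(a * y[c] for a, y in zip(row, labels_b)) for c in range(n_classes)]
            for row in affinity]

def one_hot(index, n_classes):
    return [1.0 if c == index else 0.0 for c in range(n_classes)]

# Toy features (hypothetical): three visible and two infrared instances.
visible = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0]]
infrared = [[0.8, 0.0], [0.0, 0.9]]
infrared_labels = [one_hot(0, 2), one_hot(1, 2)]

hetero = affinity_matrix(visible, infrared)  # cross-modality affinities
homo = affinity_matrix(visible, visible)     # within-modality affinities

soft = transfer_labels(hetero, infrared_labels)
predicted = [row.index(max(row)) for row in soft]

good = [one_hot(p, 2) for p in predicted]
bad = [one_hot(1 - p, 2) for p in predicted]  # deliberately flipped labels

def total_cost(labels):
    # Heterogeneous term plus homogeneous term.
    return inconsistency(hetero, labels, infrared_labels) + inconsistency(homo, labels, labels)

good_cost, bad_cost = total_cost(good), total_cost(bad)
```

In this toy setup the transferred labels give a lower combined cost than the flipped ones, which is the kind of signal a method that minimizes such an inconsistency would exploit.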

Paper Structure (17 sections, 31 equations, 9 figures, 10 tables, 2 algorithms)

Figures (9)

  • Figure 1: Illustration of our idea. Different colors denote different modalities, and different shapes denote different identities. The red lines represent higher affinities and the gray lines represent lower affinities. Our Modality-Unified Label Transfer takes into account instance-level structures to establish homogeneous and heterogeneous structurally consistent label associations and generate reliable modality-unified pseudo-labels for network training.
  • Figure 2: Framework of our proposed method. Different colors indicate different modalities. Our method alternates between pseudo-label generation (Modality-Unified Label Transfer, MULT, panel (a), described in the MULT section) and network training, which includes Alternative Modality-Invariant Representation Learning (AMIRL, panel (b), described in the AMIRL section) and Online Cross-memory Label Refinement (OCLR, panel (c), described in the OCLR section). MULT provides homogeneous and heterogeneous consistent pseudo-labels as supervision signals. During training, AMIRL leverages memory banks to perform contrastive learning with an alternative scheme, and OCLR utilizes predictions from different memories to alleviate the effect of noisy labels.
  • Figure 3: Illustration of the role of the auxiliary memory. Different shapes denote different identities and different edge colors denote different cross-modality labels.
  • Figure 4: Parameter analysis of $\alpha$ and $\beta$ on SYSU-MM01.
  • Figure 5: Accuracy of intra-modality and cross-modality positive pairs found by pseudo-labels on SYSU-MM01.
  • ...and 4 more figures
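
The Figure 2 caption mentions contrastive learning against memory banks. A minimal sketch of that general technique follows, assuming a prototype memory with one L2-normalized vector per pseudo-label, an InfoNCE-style loss, and a momentum update. The temperature, momentum value, and function names are hypothetical, and the paper's AMIRL losses and alternating scheme may differ from this.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def memory_contrastive_loss(query, memory, label, temperature=0.05):
    """InfoNCE over memory prototypes: -log softmax(q . m_k / t)[label]."""
    q = l2_normalize(query)
    logits = [sum(a * b for a, b in zip(q, m)) / temperature for m in memory]
    top = max(logits)
    # Numerically stable log-sum-exp of the logits.
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return log_z - logits[label]

def momentum_update(memory, query, label, momentum=0.9):
    """Return a new memory whose `label` entry moved toward the query."""
    q = l2_normalize(query)
    blended = [momentum * m + (1.0 - momentum) * x for m, x in zip(memory[label], q)]
    updated = [list(m) for m in memory]
    updated[label] = l2_normalize(blended)  # keep prototypes on the unit sphere
    return updated

# Toy memory with two prototypes and one query feature (hypothetical values).
memory = [l2_normalize([1.0, 0.0]), l2_normalize([0.0, 1.0])]
query = [0.9, 0.3]

loss_correct = memory_contrastive_loss(query, memory, label=0)
loss_wrong = memory_contrastive_loss(query, memory, label=1)
new_memory = momentum_update(memory, query, label=0)
```

The loss is small when the query is assigned to its nearest prototype and large otherwise, so a noisy pseudo-label shows up as a high loss. That is one reason label refinement and memory-based training are often paired.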