Domain-Shared Learning and Gradual Alignment for Unsupervised Domain Adaptation Visible-Infrared Person Re-Identification

Nianchang Huang, Yi Xu, Ruida Xi, Ruida Xi, Qiang Zhang

TL;DR

This work addresses unsupervised domain adaptation for VI-ReID by proposing DSLGA, a two-stage framework. The pre-training stage reduces inter-domain modality discrepancies through a Domain-Shared Learning Strategy (DSLS) with a Domain-Shared Adversarial Loss (DSAL), and uses Cluster Refinement with Multiple Results (CRMR) to generate intra-modality pseudo labels for the target domain. The fine-tuning stage mitigates large intra-domain modality discrepancies with a Gradual Alignment Strategy (GAS) built on Supplementary Graph Matching (SGM) and Cross-Modality Consistency Constraining (CMCC). A new CMDA-XD testing protocol evaluates cross-dataset VI-ReID transfer across SYSU-MM01, RegDB, and LLCM; the results show that DSLGA outperforms existing unsupervised and many UDA methods, and is competitive with some supervised approaches. The paper provides detailed ablations on DSLS, DSAL, CRMR, SGM, and CMCC, and analyzes key hyperparameters to guide practical deployment. Overall, DSLGA targets real-world VI-ReID by enabling cross-modality re-identification without target-domain annotations, which could benefit surveillance and monitoring applications in heterogeneous environments.

Abstract

Recently, Visible-Infrared person Re-Identification (VI-ReID) has achieved remarkable performance on public datasets. However, due to the discrepancies between public datasets and real-world data, most existing VI-ReID algorithms struggle in real-life applications. To address this, we take the initiative to investigate Unsupervised Domain Adaptation Visible-Infrared person Re-Identification (UDA-VI-ReID), aiming to transfer the knowledge learned from public data to real-world data without compromising accuracy or requiring the annotation of new samples. Specifically, we first analyze two basic challenges in UDA-VI-ReID, i.e., inter-domain modality discrepancies and intra-domain modality discrepancies. Then, we design a novel two-stage model, i.e., Domain-Shared Learning and Gradual Alignment (DSLGA), to handle these discrepancies. In the first, pre-training stage, DSLGA introduces a Domain-Shared Learning Strategy (DSLS) that mitigates the ineffective pre-training caused by inter-domain modality discrepancies by exploiting information shared between the source and target domains. In the second, fine-tuning stage, DSLGA designs a Gradual Alignment Strategy (GAS) that handles the cross-modality alignment challenges between visible and infrared data, caused by the large intra-domain modality discrepancies, in a cluster-to-holistic manner. Finally, a new UDA-VI-ReID testing method, i.e., CMDA-XD, is constructed for training and testing different UDA-VI-ReID models. Extensive experiments demonstrate that our method significantly outperforms existing domain adaptation methods for VI-ReID, and even some supervised methods, under various settings.
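
The abstract does not spell out how shared information between the source and target domains is exploited. One common way to encourage domain-shared features is an adversarial objective with a domain discriminator, and the sketch below illustrates that general idea only. It is not the paper's loss: the function names, the source=1/target=0 labeling, and the "push predictions toward 0.5" confusion objective are all assumptions made for illustration.

```python
import math

def bce(p, y, eps=1e-12):
    """Binary cross-entropy for one predicted probability p and a label y in {0, 1}."""
    p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))

def discriminator_loss(src_probs, tgt_probs):
    """Discriminator objective (assumed labeling): source features -> 1, target features -> 0."""
    losses = [bce(p, 1) for p in src_probs] + [bce(p, 0) for p in tgt_probs]
    return sum(losses) / len(losses)

def confusion_loss(src_probs, tgt_probs):
    """Feature-extractor objective (assumed): minimized when every prediction is 0.5,
    i.e. when the discriminator cannot tell the two domains apart."""
    probs = list(src_probs) + list(tgt_probs)
    return sum(0.5 * (bce(p, 1) + bce(p, 0)) for p in probs) / len(probs)

# Hypothetical discriminator outputs: probability that a feature comes from the source domain.
separable = ([0.9, 0.8], [0.1, 0.2])   # domains are easy to tell apart
confused = ([0.5, 0.5], [0.5, 0.5])    # domains are indistinguishable

d_sep, d_conf = discriminator_loss(*separable), discriminator_loss(*confused)
g_sep, g_conf = confusion_loss(*separable), confusion_loss(*confused)
```

In this toy setting the discriminator loss is lower on the separable features, while the confusion loss is lower on the indistinguishable ones, which is the tension an adversarial training loop alternates over.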

Paper Structure

This paper contains 30 sections, 37 equations, 9 figures, 5 tables.

Figures (9)

  • Figure 1: Comparisons between UDA-ReID and UDA-VI-ReID. (a) The intra-modality domain discrepancies in UDA-ReID. (b) The inter-domain modality discrepancies and intra-domain modality discrepancies in UDA-VI-ReID. Arrows in different colors indicate different types of discrepancies.
  • Figure 2: Framework of our proposed DSLGA model. It contains two main stages, i.e., pre-training and fine-tuning. In the pre-training stage, a VI-ReID network is trained to transfer knowledge from the source domain to the target domain. In particular, a domain-shared learning strategy (DSLS) is designed to mitigate the inter-domain modality discrepancies with the aid of the proposed DSAL. In addition, a CRMR module is proposed to generate intra-modality pseudo labels for the target domain. In the fine-tuning stage, the model pre-trained in the first stage is further optimized by deeply exploring the target domain data. In particular, a gradual alignment strategy (GAS) is designed to mitigate the intra-domain modality discrepancies: it first generates cross-modality pseudo labels at the cluster level with an SGM module and then suppresses incorrect pseudo labels at the holistic level with a CMCC module.
  • Figure 3: Structures of the VI-ReID network and the discriminator. (a) The VI-ReID network. (b) The discriminator. Items in different colors represent different person identities.
  • Figure 4: Illustration of the proposed CRMR module. Samples in different colors represent different person identities.
  • Figure 5: Illustration of the proposed SGM module. It contains three steps, i.e., intra-modality alignment, inter-modality alignment, and intra-modality supplementary alignment.
  • ...and 4 more figures
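
The captions above describe generating cross-modality pseudo labels at the cluster level. As a rough illustration of what cluster-level matching can look like, the sketch below pairs visible clusters with infrared clusters by minimizing the total cosine distance between cluster centroids. This is a plain one-to-one assignment, not the paper's SGM module (which the Figure 5 caption says also includes intra-modality and supplementary alignment steps); the function names, the centroid representation, and the toy features are all hypothetical.

```python
import math
from itertools import permutations

def centroid(vectors):
    """Mean feature vector of one cluster."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine_distance(a, b):
    """1 - cosine similarity; assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def match_clusters(vis_clusters, ir_clusters):
    """Return {visible_label: infrared_label} with minimal total centroid distance.

    Brute force over permutations, so only suitable for toy sizes.
    Assumes there are at least as many infrared clusters as visible ones.
    """
    vis_labels = sorted(vis_clusters)
    ir_labels = sorted(ir_clusters)
    vis_c = {k: centroid(v) for k, v in vis_clusters.items()}
    ir_c = {k: centroid(v) for k, v in ir_clusters.items()}
    best_cost, best = math.inf, None
    for perm in permutations(ir_labels, len(vis_labels)):
        cost = sum(cosine_distance(vis_c[v], ir_c[i]) for v, i in zip(vis_labels, perm))
        if cost < best_cost:
            best_cost, best = cost, dict(zip(vis_labels, perm))
    return best

# Toy 2-D features grouped by intra-modality pseudo label (hypothetical data).
vis = {0: [[1.0, 0.1], [0.9, 0.0]], 1: [[0.0, 1.0], [0.1, 0.9]]}
ir = {0: [[0.1, 1.0], [0.0, 0.8]], 1: [[0.8, 0.1], [1.0, 0.2]]}
mapping = match_clusters(vis, ir)
print(mapping)  # {0: 1, 1: 0}
```

For realistic cluster counts, the brute-force search would be replaced by a polynomial-time assignment solver; matched pairs could then share a single cross-modality pseudo label.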