Nighttime Person Re-Identification via Collaborative Enhancement Network with Multi-domain Learning

Andong Lu, Chenglong Li, Tianrui Zha, Jin Tang, Xiaofeng Wang, Bin Luo

TL;DR

This paper tackles nighttime person ReID by proposing CENet, a parallel architecture that jointly performs image relighting and ReID through a shared encoder, a ReID transformer backbone, and a Transformer-based relighting branch. It introduces multilevel feature interaction with low-level parameter sharing and a high-level feature distillation loss $L_{FD}$ to align semantic representations across tasks, while employing a multi-domain learning strategy that leverages a large synthetic dataset Syn_Dark and real nighttime data Night600. The approach yields state-of-the-art results on Night600, RGBNT201_rgb, and Syn_Dark, with notable gains in mAP and Rank-1, and demonstrates efficiency by discarding the relighting branch at inference. The work also provides a large synthetic nighttime ReID dataset and extensive ablations, validating the effectiveness of parallel design, domain-aware training, and cross-task collaboration for robust nighttime ReID.
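To make the high-level feature distillation idea concrete, here is a minimal sketch. The summary above does not state the exact form of $L_{FD}$, so the mean squared error used here is an assumption, and the function name `feature_distillation_loss` and its arguments are hypothetical. In a real training setup the relighting features would typically be treated as a fixed target (no gradient), which this plain-Python illustration does not model.

```python
from typing import Sequence


def feature_distillation_loss(
    reid_feat: Sequence[float], relight_feat: Sequence[float]
) -> float:
    """Mean squared error between ReID and relighting features.

    Assumption: MSE stands in for L_FD; the paper's actual loss may differ.
    """
    if len(reid_feat) != len(relight_feat):
        raise ValueError("feature vectors must have the same length")
    if len(reid_feat) == 0:
        raise ValueError("feature vectors must be non-empty")
    # Average the squared per-dimension gap between the two feature vectors.
    squared_gaps = ((s - t) ** 2 for s, t in zip(reid_feat, relight_feat))
    return sum(squared_gaps) / len(reid_feat)
```

Minimizing a term of this kind pulls the ReID branch's high-level features toward those of the relighting branch, which is one way the relighting branch can be dropped at inference while its effect on the ReID features remains.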

Abstract

Prevalent nighttime person re-identification (ReID) methods typically combine image relighting and ReID networks in a sequential manner. However, their performance (recognition accuracy) is limited by the quality of relighting images and insufficient collaboration between image relighting and ReID tasks. To handle these problems, we propose a novel Collaborative Enhancement Network called CENet, which performs the multilevel feature interactions in a parallel framework, for nighttime person ReID. In particular, the designed parallel structure of CENet can not only avoid the impact of the quality of relighting images on ReID performance, but also allow us to mine the collaborative relations between image relighting and person ReID tasks. To this end, we integrate the multilevel feature interactions in CENet, where we first share the Transformer encoder to build the low-level feature interaction, and then perform the feature distillation that transfers the high-level features from image relighting to ReID, thereby alleviating the severe image degradation issue caused by the nighttime scenario while avoiding the impact of relighting images. In addition, the sizes of existing real-world nighttime person ReID datasets are limited, and large-scale synthetic ones exhibit substantial domain gaps with real-world data. To leverage both small-scale real-world and large-scale synthetic training data, we develop a multi-domain learning algorithm, which alternately utilizes both kinds of data to reduce the inter-domain difference in training procedure. Extensive experiments on two real nighttime datasets, Night600 and RGBNT201$_{rgb}$, and a synthetic nighttime ReID dataset are conducted to validate the effectiveness of CENet. We release the code and synthetic dataset at: https://github.com/Alexadlu/CENet.
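The alternating use of real and synthetic data can be sketched as a simple batch schedule. The abstract only says that the two kinds of data are used alternately, so the strict one-to-one alternation, the choice to start with the synthetic domain, and the name `alternate_domains` are all assumptions made for illustration. The batches are placeholders, and no domain-specific model components are modeled here.

```python
from itertools import cycle
from typing import Iterator, Sequence, Tuple, TypeVar

Batch = TypeVar("Batch")


def alternate_domains(
    real_batches: Sequence[Batch],
    synthetic_batches: Sequence[Batch],
    num_steps: int,
) -> Iterator[Tuple[str, Batch]]:
    """Yield (domain, batch) pairs, switching domain at every step.

    Assumption: even steps draw from the synthetic set, odd steps from
    the real set. Each set is cycled, so the smaller one simply repeats.
    """
    if not real_batches or not synthetic_batches:
        raise ValueError("both domains need at least one batch")
    real_iter = cycle(real_batches)
    synthetic_iter = cycle(synthetic_batches)
    for step in range(num_steps):
        if step % 2 == 0:
            yield "synthetic", next(synthetic_iter)
        else:
            yield "real", next(real_iter)


if __name__ == "__main__":
    for domain, batch in alternate_domains(["r0"], ["s0", "s1", "s2"], 5):
        print(domain, batch)
```

Because the real set is much smaller than the synthetic one, cycling it means real samples are revisited more often; whether the authors balance the two domains this way is not stated in the abstract.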

Paper Structure (16 sections, 15 equations, 8 figures, 10 tables)

Figures (8)

  • Figure 1: Comparison of three nighttime person ReID strategies. (a) uses Zero-DCE [DCE_2020] for low-light image pre-processing and TransReID [TransReID_2021] for recognition. (b) represents the current state-of-the-art sequential end-to-end nighttime ReID method IDF [Lu2023night600] with. (c) shows our proposed parallel end-to-end method, which outperforms the other strategies. (b) and (c) use the same training strategy and training data, so the comparison reflects the advantages of the parallel framework. The Rank values of the three strategies are evaluated on the same nighttime ReID dataset, Night600 [Lu2023night600].
  • Figure 2: The architecture of the Collaborative Enhancement Network (CENet) for nighttime person ReID. CENet consists of three parts: the shared encoder, the ReID subnet and the Relighting subnet. In addition, the dashed part will be removed during the inference phase.
  • Figure 3: Details of the transformer decoder, where $\bigoplus$ represents feature summation.
  • Figure 4: Comparison of real and synthetic domain data samples.
  • Figure 5: Overview of the proposed multi-domain learning approach.
  • ...and 3 more figures