Learning to Balance: Diverse Normalization for Cloth-Changing Person Re-Identification

Hongjun Wang, Jiyuan Chen, Zhengwei Yin, Xuan Song, Yinqiang Zheng

TL;DR

This paper demonstrates empirically that either completely eliminating or fully retaining clothing features is detrimental to the CC-ReID task. It introduces a novel module called Diverse Norm, which expands personal features into orthogonal spaces and employs channel attention to separate clothing and identity features.

Abstract

Cloth-Changing Person Re-Identification (CC-ReID) involves recognizing individuals in images regardless of clothing status. In this paper, we demonstrate empirically that either completely eliminating or fully retaining clothing features is detrimental to the task. Existing work, whether it relies on clothing labels, silhouettes, or other auxiliary data, fundamentally aims to balance the learning of clothing and identity features. However, we find in practice that achieving this balance is challenging and nuanced. In this study, we introduce a novel module called Diverse Norm, which expands personal features into orthogonal spaces and employs channel attention to separate clothing and identity features. A sample re-weighting optimization strategy is also introduced to guarantee the opposite optimization direction. Diverse Norm is a simple yet effective approach that requires no additional data. Furthermore, Diverse Norm can be seamlessly integrated into ResNet50 and significantly outperforms state-of-the-art methods.
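
The two ingredients named in the abstract, expanding features into orthogonal spaces and separating them with channel attention, can be illustrated with a small NumPy sketch. This is not the authors' implementation. It assumes ZCA whitening as the way to decorrelate channels and a complementary sigmoid gate as the channel attention; the function names `whiten` and `channel_attention_split`, the toy data, and the gate parameters are all hypothetical.

```python
import numpy as np

def whiten(features, eps=1e-5):
    """ZCA-whiten a (batch, channels) matrix: zero mean, ~identity covariance.

    Assumption: ZCA is one possible whitening; the paper may use another.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (features.shape[0] - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # cov is symmetric
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + eps)) @ eigvecs.T
    return centered @ inv_sqrt

def channel_attention_split(features, gate_logits):
    """Split channels into two complementary parts with a sigmoid gate.

    Assumption: gate_logits stands in for learned attention parameters.
    """
    gate = 1.0 / (1.0 + np.exp(-gate_logits))  # per-channel weight in (0, 1)
    identity_part = features * gate
    clothing_part = features * (1.0 - gate)
    return identity_part, clothing_part

# Toy data: 256 samples, 8 channels that are deliberately correlated.
rng = np.random.default_rng(0)
mixing = np.eye(8) + 0.5 * np.ones((8, 8))
raw = rng.normal(size=(256, 8)) @ mixing

white = whiten(raw)
gate_logits = rng.normal(size=8)  # hypothetical, untrained
id_feat, cloth_feat = channel_attention_split(white, gate_logits)
```

After whitening, the channel covariance is close to the identity matrix, so each channel carries decorrelated information that the gate can then assign mostly to one branch or the other. In a real model the gate would be learned jointly with the two classifiers rather than sampled at random.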

Paper Structure (7 sections, 4 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: A trade-off exists in CC-ReID between features that capture consistent clothing and features that are robust to clothing changes. Traditionally, the model must acquire two distinct sets of features, one focused on clothing (such as garments and trousers) and another on clothing-irrelevant attributes (like face and body shape), to handle galleries in which the same person appears with consistent clothing and with changed clothing, respectively. Finally, either clothing or ID-invariant features are used to match gallery images of the same person with different clothing status.
  • Figure 2: The relation between accuracy and the strength of clothing-feature removal in CAL (Gu et al., 2022). We found that completely removing clothing features is not beneficial; it can also hurt scenarios where clothing is unchanged. However, retaining too many clothing features hurts clothes-changing scenes.
  • Figure 3: Architecture overview of our method. In panel (a), we first apply whitening to the features extracted by the backbone network and then use channel attention to separate clothing and identity features. To effectively distinguish between clothing and identity features, we construct two classifiers and employ sample reweighting to achieve concept selection. Specifically, as shown in panel (b), when an individual is in a dimly lit area wearing dark clothing, so that recognition based on clothing alone is difficult, the identity branch increases its weight. Conversely, as depicted in panel (c), when a person is moving quickly, causing facial motion blur, but is wearing distinctive plaid clothing, the clothing branch increases its weight.
  • Figure 4: Comparing the effects of training ResNet50 with and without Diverse Norm on the LTCC dataset.
  • Figure 5: The connection between model performance and the number of clothes on LTCC.
  • ...and 1 more figures
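
The weight-shifting behaviour described in the Figure 3 caption, where the identity branch gains weight when clothing is uninformative and the clothing branch gains weight when identity cues are blurred, can be sketched as a per-sample weighting over two branch losses. The rule below (a softmax over negative losses with a hypothetical `temperature` parameter) is an assumption chosen for illustration; it is not the paper's re-weighting formula and does not model its opposite-direction optimization.

```python
import math

def branch_weights(identity_loss, clothing_loss, temperature=1.0):
    """Return (w_identity, w_clothing), summing to 1.

    Assumption: the branch with the lower loss on this sample is treated
    as the more reliable one and receives the larger weight.
    """
    a = -identity_loss / temperature
    b = -clothing_loss / temperature
    m = max(a, b)  # subtract the max for numerical stability
    ea, eb = math.exp(a - m), math.exp(b - m)
    total = ea + eb
    return ea / total, eb / total

def reweighted_loss(identity_losses, clothing_losses, temperature=1.0):
    """Average the per-sample weighted sum of the two branch losses."""
    total = 0.0
    for li, lc in zip(identity_losses, clothing_losses):
        wi, wc = branch_weights(li, lc, temperature)
        total += wi * li + wc * lc
    return total / len(identity_losses)

# Sample 1: dark clothing, clothing branch struggles (high clothing loss).
# Sample 2: blurred face, identity branch struggles (high identity loss).
w_dark = branch_weights(identity_loss=0.2, clothing_loss=2.0)
w_blur = branch_weights(identity_loss=2.0, clothing_loss=0.2)
batch_loss = reweighted_loss([0.2, 2.0], [2.0, 0.2])
```

Here `w_dark` favours the identity branch and `w_blur` favours the clothing branch, mirroring the two cases in panels (b) and (c). A lower `temperature` makes the selection sharper; a higher one moves the weights toward an even split.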

Theorems & Definitions (1)

  • Definition 1