Exploring Fine-Grained Representation and Recomposition for Cloth-Changing Person Re-Identification

Qizao Wang, Xuelin Qian, Bin Li, Xiangyang Xue, Yanwei Fu

TL;DR

This work tackles cloth-changing person Re-ID by learning robust identity representations without relying on auxiliary annotations. It introduces FIRe$^{2}$, a two-module framework. Fine-grained Feature Mining (FFM) discovers fine-grained attributes by clustering the images of each identity separately; the resulting cluster assignments serve as pseudo labels for an attribute-aware classification loss. Fine-grained Attribute Recomposition (FAR) augments latent features by recomposing attributes using instance normalization and target attribute statistics. The training objective combines identity, triplet, and attribute-driven losses, while inference can omit the extra modules for efficiency. Empirical results on five cloth-changing benchmarks show state-of-the-art performance, and ablations validate the contribution of both FFM and FAR to robust, attribute-aware representation learning. Because the approach needs neither clothing labels nor auxiliary modalities, it is applicable to a broader range of real-world surveillance settings.

Abstract

Cloth-changing person Re-IDentification (Re-ID) is a particularly challenging task, suffering from two limitations of inferior discriminative features and limited training samples. Existing methods mainly leverage auxiliary information to facilitate identity-relevant feature learning, including soft-biometrics features of shapes or gaits, and additional labels of clothing. However, this information may be unavailable in real-world applications. In this paper, we propose a novel FIne-grained Representation and Recomposition (FIRe$^{2}$) framework to tackle both limitations without any auxiliary annotation or data. Specifically, we first design a Fine-grained Feature Mining (FFM) module to separately cluster images of each person. Images with similar so-called fine-grained attributes (e.g., clothes and viewpoints) are encouraged to cluster together. An attribute-aware classification loss is introduced to perform fine-grained learning based on cluster labels, which are not shared among different people, promoting the model to learn identity-relevant features. Furthermore, to take full advantage of fine-grained attributes, we present a Fine-grained Attribute Recomposition (FAR) module by recomposing image features with different attributes in the latent space. It significantly enhances robust feature learning. Extensive experiments demonstrate that FIRe$^{2}$ can achieve state-of-the-art performance on five widely-used cloth-changing person Re-ID benchmarks. The code is available at https://github.com/QizaoWang/FIRe-CCReID.
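
The per-identity clustering described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the clustering algorithm is not named in this summary, so a tiny k-means stands in for it, and the names `kmeans_labels`, `mine_attribute_labels`, and the parameter `k` are hypothetical. The point it shows is that each person is clustered separately and that the resulting cluster labels are offset so they are never shared between people.

```python
import numpy as np


def kmeans_labels(x, k, iters=20):
    """Tiny k-means used only as a stand-in clustering step (assumption)."""
    k = min(k, len(x))
    # Farthest-point initialisation keeps the sketch deterministic.
    centers = [x[0]]
    for _ in range(1, k):
        d = np.min([np.sum((x - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(x[np.argmax(d)])
    centers = np.stack(centers)
    for _ in range(iters):
        # Squared distance from every sample to every center.
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(0)
    return labels


def mine_attribute_labels(features, identities, k=2):
    """Cluster each identity separately and return globally unique pseudo labels."""
    pseudo = np.full(len(features), -1, dtype=int)
    offset = 0
    for pid in np.unique(identities):
        idx = np.where(identities == pid)[0]
        local = kmeans_labels(features[idx], k)
        # Re-index so the labels of this identity are contiguous: 0..m-1.
        _, local = np.unique(local, return_inverse=True)
        # The offset guarantees that no cluster label is shared across people.
        pseudo[idx] = local + offset
        offset += local.max() + 1
    return pseudo


# Toy usage: two identities, each with two visually distinct groups of images.
features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0],
                     [0.0, 0.0], [0.0, 0.1], [9.0, 9.0], [9.0, 9.1]])
identities = np.array([0, 0, 0, 0, 1, 1, 1, 1])
pseudo_labels = mine_attribute_labels(features, identities, k=2)
```

In this sketch the pseudo labels would then act as targets for a classifier head, which is one plausible reading of the attribute-aware classification loss; the actual loss formulation is given in the paper, not here.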

Paper Structure

This paper contains 16 sections, 7 equations, 9 figures, 6 tables, and 1 algorithm.

Figures (9)

  • Figure 1: Pilot studies to support our motivation. (a) We train a vanilla ResNet-50 with identity labels or clothing labels on the LTCC dataset. "S" and "C" denote the standard and the cloth-changing settings, respectively. Clothing labels bring more improvement, especially in the standard setting. However, it is difficult to define and annotate labels of various clothing styles. (b) When pedestrians change clothes, more fine-grained clues than clothing are needed to determine their identity. (c) We show clustering results of images from the same person. Shared fine-grained attributes (e.g., clothes, viewpoint, and occlusion) can be easily found in each cluster (C1 $\sim$ C9). Images in the green region, while having the same ground-truth clothing labels, can be further divided according to other different fine-grained attributes.
  • Figure 2: Overview of our proposed method. We first perform Fine-grained Feature Mining (FFM) to mine fine-grained attributes of all pedestrians. Then, the attribute-aware classification loss $\mathcal{L}_{attr}$ is introduced to encourage fine-grained learning for discriminative features. By taking full advantage of the explored fine-grained attributes, we further present the Fine-grained Attribute Recomposition (FAR) module as an augmentation to recompose features of each image with various attributes, and an identity classification loss $\mathcal{L}_{r}$ is applied to facilitate the learning of robust features.
  • Figure 3: Illustration of fine-grained attribute recomposition. Taking two parts as an example, it first normalizes the input feature to remove its original attribute at the part level, and then restitutes it with new attributes $\left(\mu_{j}^{1} , \sigma_{j}^{1}\right)$ and $\left(\mu_{k}^{2} , \sigma_{k}^{2}\right)$ from different pedestrians.
  • Figure 4: Ablation studies of hyper-parameters. We report mAP results of our method with different values of (a) $\epsilon$, (b) $1 / \tau$, (c) $\lambda_{4}$ on the LTCC dataset. "Standard" and "Cloth-Changing" mean the standard and cloth-changing settings, respectively.
  • Figure 5: Ablation studies of fine-grained attribute recomposition. We report mAP results with different values of (a) body parts $P$, and (b) attribute recomposition times $K$ on LTCC. "Standard" and "Cloth-Changing" mean the standard and cloth-changing settings, respectively.
  • ...and 4 more figures
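
The recomposition illustrated in Figure 3 can be sketched as follows. This is a simplified reading of the caption, not the authors' code: it assumes a feature map of shape (C, H, W), splits it into horizontal parts along the height, instance-normalizes each part per channel, and re-scales it with a target mean and standard deviation. The function name `recompose_parts`, the `eps` constant, and the way target statistics are passed in are all assumptions made for illustration.

```python
import numpy as np


def recompose_parts(feat, target_stats, eps=1e-5):
    """Swap per-part channel statistics of a feature map (illustrative sketch).

    feat:         array of shape (C, H, W).
    target_stats: list with one (mu, sigma) pair per part, each of shape (C,),
                  assumed to come from attributes of other pedestrians.
    """
    # Split along the height axis into as many parts as there are targets.
    parts = np.array_split(feat, len(target_stats), axis=1)
    out = []
    for part, (mu_t, sigma_t) in zip(parts, target_stats):
        # Instance normalization: remove the part's own per-channel statistics.
        mu = part.mean(axis=(1, 2), keepdims=True)
        sigma = part.std(axis=(1, 2), keepdims=True)
        normalized = (part - mu) / (sigma + eps)
        # Restitution: apply the target attribute statistics instead.
        out.append(normalized * sigma_t[:, None, None] + mu_t[:, None, None])
    return np.concatenate(out, axis=1)


# Toy usage with two parts, mirroring the two-part example in the caption.
rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8, 3))
stats = [(np.full(4, 2.0), np.full(4, 0.5)),   # statistics for the upper part
         (np.full(4, -1.0), np.full(4, 3.0))]  # statistics for the lower part
recomposed = recompose_parts(feat, stats)
```

After the call, each part of `recomposed` carries the per-channel mean and (up to `eps`) standard deviation of its target, while the normalized spatial pattern of the original feature is preserved.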