Masked Attribute Description Embedding for Cloth-Changing Person Re-identification
Chunlei Peng, Boyu Wang, Decheng Liu, Nannan Wang, Ruimin Hu, Xinbo Gao
TL;DR
This paper tackles cloth-changing person re-identification by leveraging cloth-insensitive information derived from editable attribute descriptions. It introduces Masked Attribute Description Embedding (MADE), which masks cloth-related attributes extracted by SOLIDER and embeds the resulting masked descriptor into a Transformer-based backbone to fuse with image features across multiple levels. The method is trained with a combination of cross-entropy and triplet losses and evaluated on four benchmarks (PRCC, LTCC, Celeb-reID-light, LaST), where it achieves state-of-the-art results and shows robustness to attribute-detection noise. Overall, MADE demonstrates that editable, cloth-irrelevant attribute information can significantly enhance cloth-changing ReID while avoiding complex multi-modal encoders, with potential for extension to other cross-modality tasks.
Abstract
Cloth-changing person re-identification (CC-ReID) aims to match persons who change clothes over long periods. The key challenge in CC-ReID is to extract clothing-independent features, such as face, hairstyle, body shape, and gait. Current research mainly focuses on modeling body shape using multi-modal biological features (such as silhouettes and sketches). However, this line of work does not fully exploit the personal description information hidden in the original RGB image. Because certain attribute descriptions remain unchanged when a person changes clothes, we propose a Masked Attribute Description Embedding (MADE) method that unifies personal visual appearance and attribute description for CC-ReID. Clothing-sensitive information, such as clothing color and type, varies from one capture to the next and is therefore difficult to model effectively. To address this, we mask the clothing and color information in the personal attribute description extracted through an attribute detection model. The masked attribute description is then connected and embedded into Transformer blocks at various levels, fusing it with the low-level to high-level features of the image. This approach compels the model to discard clothing information. Experiments are conducted on several CC-ReID benchmarks, including PRCC, LTCC, Celeb-reID-light, and LaST. Results demonstrate that MADE effectively utilizes the attribute description, improves cloth-changing person re-identification performance, and compares favorably with state-of-the-art methods. The code is available at https://github.com/moon-wh/MADE.
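The masking and embedding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (see the repository linked above for that). The attribute names, the keyword list, the zero-valued mask, the fixed projection matrix, and the helper names `mask_cloth_attributes`, `embed_description`, and `fuse` are all assumptions made for illustration, and fusion is shown at a single level rather than across multiple Transformer blocks.

```python
from typing import Dict, List, Sequence

# Hypothetical keywords marking clothing-related attributes; the real label
# set depends on the attribute detection model that is used.
CLOTH_KEYWORDS = ("upper", "lower", "color", "dress", "skirt", "coat")


def mask_cloth_attributes(
    attrs: Dict[str, float], keywords: Sequence[str] = CLOTH_KEYWORDS
) -> List[float]:
    """Return scores in sorted-name order, zeroing clothing-related entries."""
    return [
        0.0 if any(k in name.lower() for k in keywords) else score
        for name, score in sorted(attrs.items())
    ]


def embed_description(
    masked: Sequence[float], weight: Sequence[Sequence[float]]
) -> List[float]:
    """Linearly project the masked vector; weight has shape (dim, n_attrs)."""
    return [sum(w * a for w, a in zip(row, masked)) for row in weight]


def fuse(
    image_tokens: Sequence[Sequence[float]], attr_token: Sequence[float]
) -> List[List[float]]:
    """Append the attribute token to the image token sequence (one level)."""
    return [list(t) for t in image_tokens] + [list(attr_token)]


# Toy attribute description (assumed detector output, one score per attribute).
attrs = {
    "age_adult": 1.0,
    "female": 1.0,
    "long_hair": 1.0,
    "lower_type_jeans": 1.0,
    "upper_color_red": 1.0,
}
masked = mask_cloth_attributes(attrs)

# Toy projection to a 2-dimensional token; a trained model would learn this.
weight = [[1.0, 1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 1.0, 1.0]]
attr_token = embed_description(masked, weight)

# Two toy image tokens stand in for patch embeddings from the backbone.
image_tokens = [[0.5, 0.5], [0.25, 0.75]]
fused = fuse(image_tokens, attr_token)
```

In this toy setting the second row of `weight` reads only the clothing-related entries, so its output is zero after masking: the clothing attributes no longer contribute to the token that is fused with the image features.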
