Enhancing Long-Term Person Re-Identification Using Global, Local Body Part, and Head Streams
Duy Tran Thanh, Yeejin Lee, Byeongkeun Kang
TL;DR
This paper tackles long-term person re-identification, covering both clothes-changing and clothes-consistent scenarios. It introduces the Parts-Aligned and Head network (PAH-Net), a three-stream architecture consisting of global, local body-part, and head streams, each encoding distinct identity cues. The network is trained with a weighted combination of the identity classification loss $L_{id}$, the pair-based loss $L_{pair}$, and the pseudo body-part segmentation loss $L_{psd}$. A body-part segmentation head trained on clustering-derived pseudo-labels enables implicit part alignment without external annotations, while an explicit head stream leverages facial and head information. On Celeb-reID, PRCC, and VC-Clothes, PAH-Net achieves state-of-the-art performance, with ablations confirming the complementary value of each stream and the effectiveness of adversarial erasing. The work advances practical long-term re-identification for surveillance and autonomous service robots by integrating global and localized cues.
Abstract
This work addresses the task of long-term person re-identification. Typically, person re-identification assumes that people do not change their clothes, which limits its application to short-term scenarios. To overcome this limitation, we investigate long-term person re-identification, which considers both clothes-changing and clothes-consistent scenarios. In this paper, we propose a novel framework that effectively learns and utilizes both global and local information. The proposed framework consists of three streams: global, local body part, and head streams. The global and head streams encode identity-relevant information from an entire image and a cropped image of the head region, respectively. Both streams encode the most distinct, less distinct, and average features using combinations of adversarial erasing, max pooling, and average pooling. The local body part stream extracts identity-related information for each body part, allowing it to be compared with the same body part from another image. Since body part annotations are not available in re-identification datasets, pseudo-labels are generated using clustering. These labels are then used to train a body part segmentation head in the local body part stream. The proposed framework is trained by backpropagating the weighted sum of the identity classification loss, the pair-based loss, and the pseudo body part segmentation loss. To demonstrate the effectiveness of the proposed method, we conducted experiments on three publicly available datasets (Celeb-reID, PRCC, and VC-Clothes). The experimental results demonstrate that the proposed method outperforms previous state-of-the-art methods.
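As a rough illustration of two ideas in the abstract above, the sketch below shows one way that "most distinct, less distinct, and average" descriptors could be derived from a feature map, and how three loss terms could be combined into a weighted sum. It is not the authors' implementation. The erasing rule (hiding each channel's single peak activation and max-pooling again), the function names `pooled_descriptors` and `total_loss`, and the default weights are all assumptions made for illustration; the abstract does not specify them.

```python
from typing import Dict, List


def pooled_descriptors(channels: List[List[float]]) -> Dict[str, List[float]]:
    """Compute three per-channel descriptors from a flattened feature map.

    `channels` holds one list of spatial activations per channel.
    The erasing rule used here is an assumption: hide the single peak
    activation in each channel, then max-pool over what remains.
    """
    most_distinct, less_distinct, average = [], [], []
    for acts in channels:
        if len(acts) < 2:
            raise ValueError("each channel needs at least two activations")
        # Index of the strongest activation (first one on ties).
        peak = max(range(len(acts)), key=acts.__getitem__)
        # Global max pooling: the most distinct response.
        most_distinct.append(acts[peak])
        # Max pooling after erasing the peak: a less distinct response.
        less_distinct.append(max(a for i, a in enumerate(acts) if i != peak))
        # Global average pooling.
        average.append(sum(acts) / len(acts))
    return {
        "most_distinct": most_distinct,
        "less_distinct": less_distinct,
        "average": average,
    }


def total_loss(l_id: float, l_pair: float, l_psd: float,
               w_id: float = 1.0, w_pair: float = 1.0,
               w_psd: float = 1.0) -> float:
    """Weighted sum of the three loss terms; the weights are hypothetical."""
    return w_id * l_id + w_pair * l_pair + w_psd * l_psd


if __name__ == "__main__":
    feats = pooled_descriptors([[1.0, 5.0, 3.0], [2.0, 2.0, 4.0]])
    print(feats["most_distinct"], feats["less_distinct"])
    print(total_loss(1.0, 2.0, 3.0, w_pair=0.5, w_psd=2.0))
```

In a real model these operations would run on tensors inside a deep learning framework, and the erased region would more plausibly be a spatial neighbourhood around the peak than a single position. The sketch only shows the relationship between the three descriptors and the loss terms.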
