PGDS: Pose-Guidance Deep Supervision for Mitigating Clothes-Changing in Person Re-Identification
Quoc-Huy Trinh, Nhat-Tan Bui, Dinh-Hieu Hoang, Phuoc-Thao Vo Thi, Hai-Dang Nguyen, Debesh Jha, Ulas Bagci, Ngan Le, Minh-Triet Tran
TL;DR
Pose-Guidance Deep Supervision (PGDS) tackles clothes-changing in person Re-Identification by training a Re-ID backbone under the guidance of a frozen pose encoder through a Pose-to-Human Projection (PHP) module. The approach uses multi-scale projectors to transfer pose knowledge to a SOLIDER-based human encoder, optimized with a triplet loss and a KL-divergence-based guide loss whose contribution is weighted by $\lambda=0.8$. Empirical results across clothes-changing and clothes-consistent datasets show state-of-the-art gains in clothes-changing scenarios and competitive performance elsewhere, with robust cross-domain transfer. The method preserves inference efficiency because the pose guidance is applied during training and adds no computation at inference, making PGDS practical for real-world surveillance applications. This work provides a foundation for further exploring pose-informed supervision in Re-ID and related biometric tasks.
Abstract
The Person Re-Identification (Re-ID) task seeks to enhance the tracking of multiple individuals by surveillance cameras. It supports multimodal tasks, including text-based person retrieval and human matching. One of the most significant challenges in Re-ID is clothes-changing, where the same person may appear in different outfits. While previous methods have made notable progress on both clothes-consistent and clothes-changing data, they still rely excessively on clothing information, which can limit performance because human appearance changes over time. To mitigate this challenge, we propose Pose-Guidance Deep Supervision (PGDS), an effective framework for learning pose guidance within the Re-ID task. It consists of three modules: a human encoder, a pose encoder, and a Pose-to-Human Projection (PHP) module. Our framework guides the human encoder, i.e., the main re-identification model, with pose information from the pose encoder at multiple layers via the knowledge transfer mechanism of the PHP module, helping the human encoder learn body-part information without increasing computational cost at the inference stage. Through extensive experiments, our method surpasses the performance of current state-of-the-art methods, demonstrating its robustness and effectiveness for real-world applications. Our code is available at https://github.com/huyquoctrinh/PGDS.
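
To make the training objective summarized in the TL;DR more concrete, the following is a minimal sketch of how a triplet loss and a KL-divergence-based guide loss might be combined with $\lambda=0.8$. It is not the authors' implementation. The additive form (triplet plus $\lambda$ times guide), the KL direction (pose as target), the margin of 0.3, the averaging over scales, and all function names are assumptions made for illustration; features are plain Python lists rather than tensors.

```python
import math

LAMBDA = 0.8  # value reported in the TL;DR; how it enters the sum is an assumption


def softmax(xs):
    """Turn a feature vector into a probability distribution."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss; the margin value is an assumption."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def guide_loss(pose_feats, human_feats):
    """Mean KL(pose || human) over scales (hypothetical aggregation).

    Each argument is a list with one feature vector per scale, assumed to be
    already projected to a common dimension.
    """
    terms = [kl_divergence(softmax(p), softmax(h))
             for p, h in zip(pose_feats, human_feats)]
    return sum(terms) / len(terms)


def total_loss(anchor, positive, negative, pose_feats, human_feats, lam=LAMBDA):
    # Assumed combination: L = L_triplet + lam * L_guide
    return (triplet_loss(anchor, positive, negative)
            + lam * guide_loss(pose_feats, human_feats))
```

In this sketch only the human-encoder features would receive gradients; the pose features act as fixed targets, which matches the description of a frozen pose encoder that is not needed at inference.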
