Clothes-Changing Person Re-identification Based On Skeleton Dynamics
Asaf Joseph, Shmuel Peleg
TL;DR
This work tackles Clothes-Changing ReID by discarding appearance information and relying solely on skeleton dynamics. It introduces a spatio-temporal Graph Convolutional Network that processes two parallel skeleton streams (joints and bones) to generate segment-level descriptors, trained with a triplet loss and cosine distance. At test time, it leverages Re-Ranking and Re-Voting across multiple video segments to boost accuracy, achieving state-of-the-art results on the CCVID dataset while preserving privacy. The approach demonstrates that skeletal motion cues provide robust, clothing-invariant identity signatures with practical implications for privacy-aware surveillance and re-identification tasks.
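To make the training objective concrete, the following is a minimal sketch of a triplet loss computed with cosine distance on segment-level descriptors. It is an illustration only, not the authors' implementation: the function names, the plain-list descriptors, and the margin value of 0.3 are all assumptions introduced here.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge-style triplet loss under cosine distance.

    anchor and positive are descriptors of the same identity, negative is a
    descriptor of a different identity. The margin of 0.3 is a hypothetical
    value chosen for this sketch.
    """
    d_pos = cosine_distance(anchor, positive)
    d_neg = cosine_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# A triplet that is already well separated contributes zero loss,
# while a triplet in the wrong order is penalized.
easy = triplet_loss([1.0, 0.0], [1.0, 0.0], [0.0, 1.0])
hard = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])
```

The loss is zero once the positive is closer to the anchor than the negative by at least the margin, which pushes descriptors of the same person together regardless of what they wear.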
Abstract
Clothes-Changing Person Re-Identification (ReID) aims to recognize the same individual across different videos captured at various times and locations. This task is particularly challenging due to changes in appearance, such as clothing, hairstyle, and accessories. Traditional ReID methods often depend on appearance features, so their accuracy drops when clothing changes. We propose a Clothes-Changing ReID method that uses only skeleton data and no appearance features. Our approach uses a spatio-temporal Graph Convolutional Network (GCN) encoder to generate a skeleton-based descriptor for each individual. During testing, we improve accuracy by aggregating predictions from multiple segments of a video clip. Evaluated on the CCVID dataset with several different pose estimation models, our method achieves state-of-the-art performance, offering a robust and efficient solution for Clothes-Changing ReID.
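The test-time aggregation over multiple segments can be sketched as a simple majority vote. The abstract does not specify the exact aggregation scheme, so this is an assumption: each query segment votes for the gallery identity whose descriptor is nearest in cosine distance, and the clip receives the most-voted identity. The function names and the toy two-dimensional descriptors are hypothetical.

```python
import math
from collections import Counter


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero descriptor vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def vote_identity(query_segments, gallery):
    """Predict one identity for a clip from its per-segment descriptors.

    query_segments: list of descriptors, one per segment of the query clip.
    gallery: list of (identity, descriptor) pairs.
    Each segment votes for its nearest gallery identity; ties are resolved
    in favor of the identity that received its first vote earliest.
    """
    votes = Counter()
    for segment in query_segments:
        nearest_id, _ = min(
            gallery, key=lambda item: cosine_distance(segment, item[1])
        )
        votes[nearest_id] += 1
    return votes.most_common(1)[0][0]


# Toy example: two of the three segments are closest to identity "A",
# so the single noisy segment is outvoted.
gallery = [("A", [1.0, 0.0]), ("B", [0.0, 1.0])]
segments = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
predicted = vote_identity(segments, gallery)
```

Voting across segments makes the clip-level decision less sensitive to a single segment with poor pose estimates.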
