Motif Guided Graph Transformer with Combinatorial Skeleton Prototype Learning for Skeleton-Based Person Re-Identification

Haocong Rao; Chunyan Miao

TL;DR

This work tackles skeleton-based person re-identification, targeting two limitations of existing methods: they learn relations between all joints indiscriminately, and they rely only on averaged features. It introduces MoCos, which comprises a Motif Guided Graph Transformer (MGT) that uses Hierarchical Structural Motifs and Gait Collaborative Motifs to capture multi-order structural and gait-related joint relations, and Combinatorial Skeleton Prototype Learning (CSP), which forms diverse sub-skeleton and sub-tracklet representations and contrasts them with identity prototypes. The authors report that MoCos outperforms state-of-the-art methods on multiple benchmarks and generalizes to RGB-estimated skeletons and unsupervised scenarios. Overall, MoCos combines structure-aware and gait-aware relation modeling with combinatorial sub-patterns to learn more discriminative skeleton representations.

Abstract

Person re-identification (re-ID) via 3D skeleton data is a challenging task with significant value in many scenarios. Existing skeleton-based methods typically assume virtual motion relations between all joints, and adopt average joint or sequence representations for learning. However, they rarely explore key body structure and motion such as gait to focus on more important body joints or limbs, while lacking the ability to fully mine valuable spatial-temporal sub-patterns of skeletons to enhance model learning. This paper presents a generic Motif guided graph transformer with Combinatorial skeleton prototype learning (MoCos) that exploits structure-specific and gait-related body relations as well as combinatorial features of skeleton graphs to learn effective skeleton representations for person re-ID. In particular, motivated by the locality within joints' structure and the body-component collaboration in gait, we first propose the motif guided graph transformer (MGT) that incorporates hierarchical structural motifs and gait collaborative motifs, which simultaneously focuses on multi-order local joint correlations and key cooperative body parts to enhance skeleton relation learning. Then, we devise the combinatorial skeleton prototype learning (CSP) that leverages random spatial-temporal combinations of joint nodes and skeleton graphs to generate diverse sub-skeleton and sub-tracklet representations, which are contrasted with the most representative features (prototypes) of each identity to learn class-related semantics and discriminative skeleton representations. Extensive experiments validate the superior performance of MoCos over existing state-of-the-art models. We further show its generality under RGB-estimated skeletons, different graph modeling, and unsupervised scenarios.
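
The CSP idea described in the abstract (random spatial-temporal combinations contrasted with identity prototypes) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of simple averaging to pool features, cosine similarity, and the temperature-scaled softmax loss are all assumptions made for illustration, and joint features are assumed to be already computed (for example by an encoder such as MGT).

```python
import math
import random


def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity with a small epsilon to avoid division by zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def sub_skeleton(frame, num_joints, rng):
    """Spatial combination: average the features of a random joint subset.

    `frame` is a list of per-joint feature vectors for one skeleton.
    """
    joints = rng.sample(range(len(frame)), num_joints)
    return mean_vectors([frame[j] for j in joints])


def sub_tracklet(tracklet, num_frames, num_joints, rng):
    """Temporal combination: average sub-skeletons over a random frame subset.

    `tracklet` is a list of frames, each a list of per-joint feature vectors.
    """
    frames = rng.sample(range(len(tracklet)), num_frames)
    return mean_vectors(
        [sub_skeleton(tracklet[t], num_joints, rng) for t in frames]
    )


def prototype_contrast_loss(feature, prototypes, label, temperature=0.1):
    """Softmax cross-entropy over cosine similarities to identity prototypes.

    Assumed loss form: pulls `feature` toward prototypes[label] and pushes it
    away from the other prototypes. The paper's exact loss may differ.
    """
    logits = [cosine(feature, p) / temperature for p in prototypes]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


# Toy usage: 6 frames, 20 joints, 4-dimensional joint features (all made up).
rng = random.Random(0)
tracklet = [
    [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(20)]
    for _ in range(6)
]
prototypes = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]  # one per identity
feature = sub_tracklet(tracklet, num_frames=3, num_joints=10, rng=rng)
loss = prototype_contrast_loss(feature, prototypes, label=0)
```

Each call to `sub_tracklet` draws a different joint and frame subset, so one tracklet yields many distinct representations of the same identity. In this simplified reading, that diversity is what lets the model learn from spatial-temporal sub-patterns rather than from a single averaged feature.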
