Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition

Hongda Liu; Yunfan Liu; Min Ren; Hao Wang; Yunlong Wang; Zhenan Sun

TL;DR

This paper addresses the difficulty of distinguishing similar actions in skeleton data with ProtoGCN, a graph-based model that decomposes each action into a combination of learnable motion prototypes. Within a GCN framework, it combines three components: a Prototype Reconstruction Network (PRN) backed by a memory of prototypes, a Motion Topology Enhancement (MTE) module that enriches joint relationships, and a class-specific contrastive learning (CSCL) objective that sharpens inter-class separability. ProtoGCN achieves state-of-the-art performance on NTU RGB+D, NTU RGB+D 120, Kinetics-Skeleton, and FineGYM, and ablations support the effectiveness of PRN, MTE, and CSCL in producing compact, discriminative representations. In practice, this improves fine-grained action recognition on real-world skeleton datasets, and the code is released for reproducibility.
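The idea of expressing features as a mixture of prototypes held in a memory can be illustrated with a minimal NumPy sketch. It is not the released ProtoGCN implementation: the function name `reconstruct_from_prototypes`, the projection `query_proj`, the array `memory`, and all shapes are hypothetical, and in a real model both the projection and the memory would be learned rather than randomly initialized.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def reconstruct_from_prototypes(features, query_proj, memory):
    """Rebuild each feature vector as a weighted mix of prototypes.

    features:   (N, C) per-joint features (assumed input layout)
    query_proj: (C, K) hypothetical projection onto K prototype scores
    memory:     (K, D) hypothetical prototype memory
    """
    logits = features @ query_proj       # (N, K) response to each prototype
    weights = softmax(logits, axis=-1)   # (N, K) mixture weights, rows sum to 1
    return weights @ memory, weights     # (N, D) reconstruction

# Illustrative sizes only; random values stand in for learned parameters.
rng = np.random.default_rng(0)
num_joints, channels, num_prototypes, proto_dim = 25, 16, 8, 16
features = rng.normal(size=(num_joints, channels))
query_proj = rng.normal(size=(channels, num_prototypes)) * 0.1
memory = rng.normal(size=(num_prototypes, proto_dim))

recon, weights = reconstruct_from_prototypes(features, query_proj, memory)
```

Because the weights are a softmax, every reconstructed vector is a convex combination of the prototypes, so two similar actions can only differ through which prototypes they activate and how strongly.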

Abstract

In skeleton-based action recognition, a key challenge is distinguishing between actions with similar trajectories of joints due to the lack of image-level details in skeletal representations. Recognizing that the differentiation of similar actions relies on subtle motion details in specific body parts, we direct our approach to focus on the fine-grained motion of local skeleton components. To this end, we introduce ProtoGCN, a Graph Convolutional Network (GCN)-based model that breaks down the dynamics of entire skeleton sequences into a combination of learnable prototypes representing core motion patterns of action units. By contrasting the reconstruction of prototypes, ProtoGCN can effectively identify and enhance the discriminative representation of similar actions. Without bells and whistles, ProtoGCN achieves state-of-the-art performance on multiple benchmark datasets, including NTU RGB+D, NTU RGB+D 120, Kinetics-Skeleton, and FineGYM, which demonstrates the effectiveness of the proposed method. The code is available at https://github.com/firework8/ProtoGCN.
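As a rough illustration of how a contrastive objective can separate similar classes, the sketch below implements a generic supervised contrastive loss in NumPy. It is an assumption-laden stand-in, not the loss defined in the paper: the function name `class_contrastive_loss`, the `temperature` value, and the toy embeddings (which stand in for per-sample prototype responses) are all hypothetical.

```python
import numpy as np

def class_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (not the paper's exact objective).

    Samples sharing a label are pulled together; all others are pushed apart.
    """
    labels = np.asarray(labels)
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    self_mask = np.eye(len(labels), dtype=bool)
    sim = np.where(self_mask, -np.inf, sim)   # a sample is never its own pair

    # Row-wise log-softmax over all other samples.
    sim_max = sim.max(axis=1, keepdims=True)
    log_norm = np.log(np.exp(sim - sim_max).sum(axis=1, keepdims=True))
    log_prob = sim - sim_max - log_norm

    pos_mask = (labels[:, None] == labels[None, :]) & ~self_mask
    pos_counts = pos_mask.sum(axis=1)
    valid = pos_counts > 0                    # anchors with at least one positive
    if not valid.any():
        return 0.0
    pos_log_prob = np.where(pos_mask, log_prob, 0.0).sum(axis=1)
    return float(-(pos_log_prob[valid] / pos_counts[valid]).mean())

# Toy embeddings: two tight clusters in 2-D.
emb = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]])
loss_aligned = class_contrastive_loss(emb, [0, 0, 1, 1])  # labels match clusters
loss_mixed = class_contrastive_loss(emb, [0, 1, 0, 1])    # labels cut across clusters
```

The loss is small when same-class samples already sit close together and large when they do not, which is the pressure that pushes representations of similar actions apart.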
