GCN-DevLSTM: Path Development for Skeleton-Based Action Recognition

Lei Jiang; Weixin Yang; Xin Zhang; Hao Ni

TL;DR

The paper tackles skeleton-based action recognition (SAR) by strengthening temporal modeling in graph-based architectures. It introduces the G-Dev layer, which applies path development to temporal graphs, and integrates it into the GCN-DevLSTM network, which extracts temporal features while reducing the time dimension. Experiments on Chalearn2013, NTU60, and NTU120 show state-of-the-art accuracy and robustness to irregular sampling and missing data, and the module plugs into different GCN backbones. The G-Dev layer is a generic module for temporal graph data with publicly released code, so the approach applies to SAR and related tasks on sequential graph data.

Abstract

Skeleton-based action recognition (SAR) in videos is an important but challenging task in computer vision. The recent state-of-the-art (SOTA) models for SAR are primarily based on graph convolutional neural networks (GCNs), which are powerful in extracting the spatial information of skeleton data. However, it is not yet clear whether such GCN-based models can effectively capture the temporal dynamics of human action sequences. To this end, we propose the G-Dev layer, which exploits the path development, a principled and parsimonious representation for sequential data that leverages the Lie group structure. By integrating the G-Dev layer, the hybrid G-DevLSTM module enhances the traditional LSTM to reduce the time dimension while retaining high-frequency information. It can be conveniently applied to any temporal graph data, complementing existing advanced GCN-based models. Our empirical studies on the NTU60, NTU120 and Chalearn2013 datasets demonstrate that our proposed GCN-DevLSTM network consistently improves the strong GCN baseline models and achieves SOTA results with superior robustness in SAR tasks. The code is available at https://github.com/DeepIntoStreams/GCN-DevLSTM.
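To make the phrase "path development" concrete, the following is a minimal sketch, not the authors' implementation. It assumes the standard construction for a piecewise-linear path: each increment is mapped linearly into a matrix Lie algebra, exponentiated, and the results are multiplied in time order. The choice of so(m) (skew-symmetric matrices), the function names `expm`, `skew`, and `path_development`, and the Taylor-series matrix exponential are all illustrative assumptions; the released code may differ in every one of these details.

```python
import numpy as np


def expm(a, terms=20, squarings=8):
    """Matrix exponential via scaling and squaring with a truncated Taylor series.

    A simple stand-in for a library routine, adequate for small matrices.
    """
    scaled = a / (2 ** squarings)
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for k in range(1, terms + 1):
        term = term @ scaled / k          # scaled^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result          # undo the scaling
    return result


def skew(w):
    """Map a square matrix to so(m) by taking w - w^T (assumed Lie algebra)."""
    return w - w.T


def path_development(path, weights):
    """Development of a piecewise-linear path.

    path:    array of shape (T, d), one d-dimensional point per time step.
    weights: array of shape (d, m, m), one (hypothetical) trainable matrix
             per input channel; each is projected to a skew-symmetric matrix.
    Returns an (m, m) matrix: the ordered product of exp(M(dx_t)).
    """
    generators = np.stack([skew(w) for w in weights])   # (d, m, m)
    dev = np.eye(generators.shape[1])
    for dx in np.diff(path, axis=0):                    # increments x_{t+1} - x_t
        lie_element = np.tensordot(dx, generators, axes=1)  # sum_i dx_i * A_i
        dev = dev @ expm(lie_element)
    return dev


# Usage on random data (illustrative only).
rng = np.random.default_rng(0)
path = rng.normal(size=(10, 3))
weights = rng.normal(size=(3, 4, 4))
dev = path_development(path, weights)

# Insert the midpoint of the first segment: a resampled version of the same path.
midpoint = 0.5 * (path[0] + path[1])
resampled = np.vstack([path[:1], midpoint[None, :], path[1:]])
dev_resampled = path_development(resampled, weights)
```

Under these assumptions the output is an orthogonal matrix whose size does not depend on the number of time steps, and adding an extra sample along an existing linear segment leaves it unchanged. Both properties are consistent with the abstract's description of a compact representation that is robust to how the sequence is sampled.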

Paper Structure (26 sections, 11 equations, 5 figures, 9 tables)

Introduction
Related Work
Skeleton-based Action Recognition
Path signature & Path Development
G-Dev Layer: Development layer of a temporal graph
The development of a Path
Path Development Layer
G-Dev Layer: Apply path development layer to a temporal graph
GCN-DevLSTM Network for SAR
GCN Module
G-DevLSTM Module
Numerical Experiments
Fusion results from different data streams
Comparison with state-of-the-art methods
Comparison with signature-based models
...and 11 more sections

Figures (5)

Figure 1: The workflow of the G-Dev layer.
Figure 2: (a) The pipeline of our proposed approach consists of $N$ blocks, with each block containing a GCN module and a DevLSTM module. (b) The detail of the DevLSTM module.
Figure 3: Robustness analysis on NTU60 X-sub benchmark.
Figure 4: Dual graph. The left side is the original skeleton representation in the NTU dataset; the right side is its dual graph representation. Joint $V_{1-2}$ in the dual graph is the bone $B_{12}$ connecting joints $V_{1}$ and $V_{2}$ in the original graph.
Figure 5: The three GCN modules used in this paper. From left to right, the subfigures show the CTR-GC module, the adaptive graph convolution module, and the fixed graph.

Theorems & Definitions (4)

Definition 3.1: Path Development
Example 1: Linear path
Definition 3.2: Path development layer
Definition 3.3: G-Dev Sequence layer
