High-Performance Inference Graph Convolutional Networks for Skeleton-Based Action Recognition
Junyi Wang, Ziao Li, Bangli Liu, Haibin Cai, Mohamad Saada, Qinggang Meng
TL;DR
The paper tackles real-time skeleton-based action recognition with graph convolutions, where state-of-the-art models rely on complex multi-branch topologies that hinder inference speed. It introduces re-parameterization (HPI-GCN-RP) and over-parameterization (HPI-GCN-OP), plus Rep-TCN to preserve temporal modeling while enabling fast single-branch inference; adjacency matrix fusion further optimizes computation. On NTU-RGB+D 60/120 benchmarks, RP delivers up to $1.5\times$ faster inference with higher accuracy, while OP achieves SOTA-competitive performance with around $5\times$ faster inference, including $K=9$ variants matching or exceeding multi-stream baselines. The results demonstrate strong real-time performance gains with preserved or improved accuracy, and the methods generalize to other backbones, offering practical deployment potential; code is available at github.com/lizaowo/HPI-GCN.
Abstract
Recently, significant progress has been made in skeleton-based human action recognition with the emergence of graph convolutional networks (GCNs). However, state-of-the-art (SOTA) models for this task focus on constructing increasingly complex higher-order connections between joint nodes to describe skeleton information, which leads to complex inference processes and high computational costs. To address the slow inference speed caused by overly complex model structures, we introduce re-parameterization and over-parameterization techniques to GCNs and propose two novel high-performance inference GCNs, namely HPI-GCN-RP and HPI-GCN-OP. Once training is complete, the model parameters are fixed. HPI-GCN-RP uses re-parameterization to transform a high-performance training model into a fast inference model through linear transformations, achieving higher inference speed with competitive performance. HPI-GCN-OP additionally uses over-parameterization, introducing extra inference parameters to improve performance further at the cost of slightly lower inference speed. Experimental results on two skeleton-based action recognition datasets demonstrate the effectiveness of our approach. HPI-GCN-OP achieves performance comparable to current SOTA models while running inference five times faster. Specifically, HPI-GCN-OP achieves an accuracy of 93\% on the cross-subject split of the NTU-RGB+D 60 dataset and 90.1\% on the cross-subject split of the NTU-RGB+D 120 dataset. Code is available at github.com/lizaowo/HPI-GCN.
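The re-parameterization idea in the abstract, turning a multi-branch training model into a single-branch inference model through linear transformations, can be illustrated with a small sketch. This is not the HPI-GCN code: it is a generic structural re-parameterization example on a single-channel 1-D convolution, and the branch layout (kernel sizes 9, 3, 1 plus an identity shortcut) and all function names are assumptions chosen for illustration. Because convolution is linear, parallel branches whose outputs are summed can be replaced by one convolution whose kernel is the sum of the zero-padded branch kernels.

```python
import random

def conv1d_same(signal, kernel):
    """Cross-correlate a 1-D signal with an odd-length kernel, zero-padded to keep length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[t + i] for i, k in enumerate(kernel))
        for t in range(len(signal))
    ]

def pad_kernel(kernel, size):
    """Centre an odd-length kernel inside a zero kernel of the given odd size."""
    margin = (size - len(kernel)) // 2
    return [0.0] * margin + list(kernel) + [0.0] * margin

def merge_kernels(kernels, size):
    """Sum parallel kernels (after zero-padding) into one equivalent kernel."""
    padded = [pad_kernel(k, size) for k in kernels]
    return [sum(taps) for taps in zip(*padded)]

random.seed(0)
signal = [random.gauss(0.0, 1.0) for _ in range(32)]

# Hypothetical training-time branches: kernel sizes 9, 3, 1, plus an identity shortcut.
branch_kernels = [
    [random.gauss(0.0, 1.0) for _ in range(9)],
    [random.gauss(0.0, 1.0) for _ in range(3)],
    [random.gauss(0.0, 1.0)],
    [1.0],  # identity expressed as a size-1 kernel
]

# Multi-branch output: run every branch, then add the results.
branch_outputs = [conv1d_same(signal, k) for k in branch_kernels]
multi_branch_out = [sum(vals) for vals in zip(*branch_outputs)]

# Single-branch output: one convolution with the merged kernel.
merged_kernel = merge_kernels(branch_kernels, size=9)
single_branch_out = conv1d_same(signal, merged_kernel)

# The two outputs should agree up to floating-point rounding.
max_abs_diff = max(abs(a - b) for a, b in zip(multi_branch_out, single_branch_out))
```

In this sketch the merged model runs one convolution instead of four while producing the same output, which is the source of the inference speed-up that re-parameterization aims for. Real networks would also fold normalization layers into the kernels before merging; that step is omitted here.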
