HFGCN: Hypergraph Fusion Graph Convolutional Networks for Skeleton-Based Action Recognition
Pengcheng Dong, Wenbo Wan, Huaxiang Zhang, Shuai Li, Sujuan Hou, Jiande Sun
TL;DR
The paper addresses skeleton-based action recognition by moving beyond fixed pairwise joint graphs to higher-order relationships using hypergraphs. It introduces HFGCN, which builds three hypergraph topologies (body-part groups and distance-from-core) and employs a Hypergraph Attention Module (HAM) for temporal high-order relations and a Hypergraph Convolution Module (HGCM) for channel-wise topology refinement, all powered by a Multi-Scale Temporal Convolution backbone. Extensive experiments on NTU RGB+D and NTU RGB+D 120 demonstrate state-of-the-art performance, with ablations confirming the benefit of integrating multiple hypergraph topologies and modules. The work advances robust, real-time-capable skeleton-based action recognition and points to future optimization of hypergraph construction for even better results.
Abstract
In recent years, action recognition has received much attention and found wide application because of its important role in video understanding. Most research on action recognition has focused on improving performance through various deep learning methods rather than on the classification of skeleton points, and the topological modeling between skeleton points and body parts has seldom been considered. Although some studies have used a data-driven approach to classify the topology of skeleton points, the kinematic nature of those points has not been taken into account. In this paper, we therefore draw on the theory of kinematics to adapt the topological relations of the skeleton points and propose a classification of topological relations based on body parts and distance from the core of the body. To synthesize these topological relations for action recognition, we propose a novel Hypergraph Fusion Graph Convolutional Network (HFGCN). The proposed model attends to the human skeleton points and the different body parts simultaneously when constructing the topology, which clearly improves recognition accuracy. We use a hypergraph to represent the categorical relationships among these skeleton points and incorporate the hypergraph into a graph convolution network to model the higher-order relationships among the skeleton points and enhance the feature representation of the network. In addition, our proposed hypergraph attention module and hypergraph graph convolution module optimize topology modeling in the temporal and channel dimensions, respectively, to further enhance the feature representation of the network. We conducted extensive experiments on three widely used datasets. The results validate that our proposed method achieves the best performance when compared with state-of-the-art skeleton-based methods.
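To make the idea of grouping skeleton points into hyperedges more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical six-joint toy skeleton, hypothetical groupings by body part and by distance from the core, and the commonly used normalized hypergraph propagation operator G = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} with unit hyperedge weights. Fusing the two groupings by simply concatenating their hyperedges is also an assumption made only for illustration.

```python
from math import sqrt

def incidence_matrix(num_joints, hyperedges):
    """Return H (num_joints x num_hyperedges); H[v][e] = 1 if joint v is in hyperedge e."""
    H = [[0.0] * len(hyperedges) for _ in range(num_joints)]
    for e, members in enumerate(hyperedges):
        for v in members:
            H[v][e] = 1.0
    return H

def hypergraph_operator(H):
    """Compute G = Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} with unit hyperedge weights."""
    n, m = len(H), len(H[0])
    dv = [sum(H[v]) for v in range(n)]                       # vertex degrees
    de = [sum(H[v][e] for v in range(n)) for e in range(m)]  # hyperedge degrees
    G = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if dv[u] > 0 and dv[v] > 0:
                shared = sum(H[u][e] * H[v][e] / de[e] for e in range(m) if de[e] > 0)
                G[u][v] = shared / sqrt(dv[u] * dv[v])
    return G

def propagate(G, X):
    """One parameter-free propagation step X' = G X (no learned weights, no nonlinearity)."""
    return [[sum(G[u][v] * X[v][c] for v in range(len(X))) for c in range(len(X[0]))]
            for u in range(len(G))]

# Hypothetical toy skeleton: 0 spine, 1 neck, 2 shoulder, 3 hand, 4 hip, 5 foot.
body_parts = [[0, 1], [2, 3], [4, 5]]      # assumed grouping: torso, arm, leg
distance_rings = [[0, 1], [2, 4], [3, 5]]  # assumed grouping: core, mid, far

# Assumed fusion: treat hyperedges from both groupings as one hypergraph.
H = incidence_matrix(6, body_parts + distance_rings)
G = hypergraph_operator(H)

X = [[float(j)] for j in range(6)]  # one scalar feature per joint
X_next = propagate(G, X)
```

In this toy setup, joints that share more hyperedges (for example the spine and neck, which fall in both the torso and the core groups) exchange more information in one step than joints that share only one group. A learned layer would additionally multiply by a weight matrix and apply a nonlinearity.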
