Multi-Head Adaptive Graph Convolution Network for Sparse Point Cloud-Based Human Activity Recognition
Vincent Gbouna Zakka, Luis J. Manso, Zhuangzhuang Dai
TL;DR
This work tackles privacy-friendly human activity recognition from sparse mmWave radar point clouds by introducing MAK-GCN, which uses a Multi-Head Adaptive Kernel to generate multiple dynamic filters per neighbourhood. The architecture progressively refines local features while preserving global spatial context across five stages, combining a graph-feature extractor, dynamic filtering, and global descriptors to boost discriminative power. Empirical results on MMActivity and MiliPoint show state-of-the-art accuracy (up to 97.54% and 99.12% in ablations) and demonstrate robust performance under real-world robotic deployment, including dark conditions. The approach promises practical impact for privacy-preserving, real-time activity monitoring in home robotics and elder-care scenarios.
Abstract
Human activity recognition is increasingly vital for supporting independent living, particularly for the elderly and those in need of assistance. Domestic service robots with monitoring capabilities can enhance safety and provide essential support. Although image-based methods have advanced considerably in the past decade, their adoption remains limited by concerns over privacy and sensitivity to low-light or dark conditions. As an alternative, millimetre-wave (mmWave) radar can produce point cloud data that is privacy-preserving. However, processing the sparse and noisy point clouds remains a long-standing challenge. While graph-based methods and attention mechanisms show promise, they predominantly rely on "fixed" kernels, that is, kernels applied uniformly across all neighbourhoods. This highlights the need for adaptive approaches that can dynamically adjust their kernels to the specific geometry of each local neighbourhood in point cloud data. To overcome this limitation, we introduce an adaptive approach within the graph convolutional framework. Instead of a single shared weight function, our Multi-Head Adaptive Kernel (MAK) module generates multiple dynamic kernels, each capturing different aspects of the local feature space. By progressively refining local features while maintaining global spatial context, our method enables convolution kernels to adapt to varying local features. Experimental results on benchmark datasets confirm the effectiveness of our approach, achieving state-of-the-art performance in human activity recognition. Our source code is publicly available at: https://github.com/Gbouna/MAK-GCN
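
To make the contrast between a fixed kernel and per-neighbourhood dynamic kernels concrete, the following is a minimal NumPy sketch, not the authors' implementation (see the repository above for that). Everything in it is an assumption made for illustration: the function names `knn_indices` and `multi_head_adaptive_conv`, the k-nearest-neighbour graph, the edge descriptor `[f_i, f_j - f_i]`, the use of one linear map per head as the kernel generator, the application of each generated kernel to the relative position `p_j - p_i`, and max aggregation over neighbours.

```python
import numpy as np


def knn_indices(points, k):
    """Indices of the k nearest neighbours of each point, excluding the point itself."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.sum(diff ** 2, axis=-1)           # (N, N) squared distances
    np.fill_diagonal(dist2, np.inf)              # a point is not its own neighbour
    return np.argsort(dist2, axis=1)[:, :k]      # (N, k)


def multi_head_adaptive_conv(points, feats, k, generators):
    """Illustrative multi-head adaptive graph convolution (assumed design).

    points:     (N, 3) point coordinates
    feats:      (N, C) per-point features
    generators: list of (2C, C_out * 3) matrices, one hypothetical kernel
                generator per head
    returns:    (N, num_heads * C_out) features
    """
    n, c = feats.shape
    idx = knn_indices(points, k)                          # (N, k)
    rel_pos = points[idx] - points[:, None, :]            # (N, k, 3)
    centre = np.broadcast_to(feats[:, None, :], (n, k, c))
    # Edge descriptor built from the centre feature and the feature difference.
    edge = np.concatenate([centre, feats[idx] - centre], axis=-1)  # (N, k, 2C)

    outputs = []
    for w in generators:
        c_out = w.shape[1] // 3
        # A separate kernel is generated for every edge, so the filter
        # changes with the local features instead of being shared.
        kernels = (edge @ w).reshape(n, k, c_out, 3)
        responses = np.einsum("nkod,nkd->nko", kernels, rel_pos)  # (N, k, C_out)
        outputs.append(responses.max(axis=1))             # max over neighbours
    return np.concatenate(outputs, axis=-1)               # heads concatenated


# Usage with random data standing in for one sparse radar frame.
rng = np.random.default_rng(0)
points = rng.normal(size=(32, 3))
feats = rng.normal(size=(32, 4))
generators = [rng.normal(size=(8, 5 * 3)) * 0.1 for _ in range(3)]  # 3 heads
out = multi_head_adaptive_conv(points, feats, k=8, generators=generators)
print(out.shape)  # (32, 15)
```

A fixed-kernel layer would correspond to replacing `kernels` with a single learned array shared by every edge; here each head produces its own edge-dependent kernel, and the heads are concatenated so that each can respond to a different aspect of the neighbourhood.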
