SDFA: Structure Aware Discriminative Feature Aggregation for Efficient Human Fall Detection in Video

Sania Zahan; Ghulam Mubashar Hassan; Ajmal Mian

TL;DR

SDFA addresses privacy-conscious fall detection by using 2D skeletons extracted from low-resolution video and a lightweight graph-based architecture. The method combines joint and motion streams in a shared space and uses a Spatial Graph Convolutional Network with a learnable adjacency matrix plus Separable Temporal Convolutions, augmented by randomized spatio-temporal masking and early fusion. Across five large-scale datasets, SDFA delivers competitive accuracy with substantially fewer FLOPs and parameters than prior methods, enabling real-time deployment on low-cost cameras without sacrificing privacy. The work demonstrates strong generalization and practical value for smart healthcare monitoring in homes and care facilities. Key innovations include adaptive adjacency learning, efficient temporal modeling, and robust regularization to handle diverse real-world scenarios.
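To give a rough sense of why separable temporal convolutions reduce parameter counts, the sketch below compares a standard 1D temporal convolution with a depthwise-separable one. It assumes that "separable" means depthwise followed by pointwise convolution, and the channel width (64) and kernel size (9) are hypothetical values chosen for illustration, not the configuration reported for SDFA. Biases are ignored.

```python
def standard_temporal_conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights in a standard temporal convolution: every output channel
    has one length-`kernel` filter per input channel."""
    return c_in * c_out * kernel


def separable_temporal_conv_params(c_in: int, c_out: int, kernel: int) -> int:
    """Weights in a depthwise-separable temporal convolution (assumed form):
    one length-`kernel` filter per input channel, then a 1x1 channel mixer."""
    depthwise = c_in * kernel
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, for illustration only.
C_IN, C_OUT, KERNEL = 64, 64, 9
standard = standard_temporal_conv_params(C_IN, C_OUT, KERNEL)
separable = separable_temporal_conv_params(C_IN, C_OUT, KERNEL)
print(standard, separable, round(standard / separable, 2))
```

Under these assumed sizes the separable layer needs roughly an eighth of the weights of the standard one; the actual savings in SDFA depend on its real layer widths and kernel sizes.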

Abstract

Older people are susceptible to falls due to postural instability and deteriorating health. Immediate access to medical support can greatly reduce the repercussions. Hence, there is increasing interest in automated fall detection, often incorporated into a smart healthcare system to provide better monitoring. Existing systems focus on wearable devices, which are inconvenient, or video monitoring, which raises privacy concerns. Moreover, these systems give a limited view of their generalization ability, as they are tested on datasets containing only a few activities that differ widely in the action space and are easy to tell apart. Complex daily life scenarios pose much greater challenges, with activities that overlap in the action space due to similar posture or motion. To overcome these limitations, we propose a fall detection model, coined SDFA, based on human skeletons extracted from low-resolution videos. The use of skeleton data ensures privacy, and low-resolution videos ensure low hardware and computational cost. Our model captures discriminative structural displacements and motion trends using unified joint and motion features projected onto a shared high-dimensional space. In particular, the use of separable convolution combined with a powerful GCN architecture provides improved performance. Extensive experiments on five large-scale datasets with a wide range of evaluation settings show that our model achieves competitive performance with extremely low computational complexity and runs faster than existing models.
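The idea of unified joint and motion features can be illustrated with a minimal sketch. It assumes that the motion stream is the frame-to-frame displacement of each 2D joint and that early fusion is a per-joint concatenation; the abstract does not give the exact definitions, so the function names `motion_stream` and `early_fusion` and the zero-padding of the last frame are hypothetical choices. The learned projection onto the shared high-dimensional space is omitted.

```python
from typing import List, Tuple

Frame = List[Tuple[float, ...]]  # one coordinate tuple per joint


def motion_stream(joints: List[Frame]) -> List[Frame]:
    """Displacement of each joint to the next frame (assumed definition).
    The last frame has no successor, so it is padded with zeros."""
    motion: List[Frame] = []
    for t, current in enumerate(joints):
        if t + 1 < len(joints):
            following = joints[t + 1]
            motion.append([(nx - cx, ny - cy)
                           for (nx, ny), (cx, cy) in zip(following, current)])
        else:
            motion.append([(0.0, 0.0)] * len(current))
    return motion


def early_fusion(joints: List[Frame], motion: List[Frame]) -> List[Frame]:
    """Concatenate position and displacement per joint: (x, y, dx, dy)."""
    return [[j + m for j, m in zip(jf, mf)] for jf, mf in zip(joints, motion)]


# Toy sequence: 3 frames, 2 joints, both moving downward at increasing speed.
sequence = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, -0.5), (1.0, 0.5)],
    [(0.0, -1.5), (1.0, -0.5)],
]
fused = early_fusion(sequence, motion_stream(sequence))
print(fused[0][0])
```

In this toy sequence the vertical displacement grows from one frame to the next, which is the kind of motion trend a fall detector could exploit alongside the joint positions themselves.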
