MAMAF-Net: Motion-Aware and Multi-Attention Fusion Network for Stroke Diagnosis
Aysen Degerli, Pekka Jakala, Juha Pajula, Milla Immonen, Miguel Bordallo Lopez
TL;DR
This paper tackles the challenge of standardized, rapid pre-hospital stroke assessment by moving beyond single-test evaluations to an end-to-end video-based diagnosis using multiple NIHSS examination videos. The authors introduce MAMAF-Net, which combines motion-aware modules to capture mobility cues with a multi-attention fusion mechanism to integrate four NIHSS video streams, followed by 3D convolutions for final classification. On the Stroke-data dataset, which includes stroke, TIA, and healthy controls, MAMAF-Net achieves an AUC of up to 95.33% and a sensitivity of 93.62%, outperforming state-of-the-art baselines and demonstrating the value of non-face-dependent, video-based stroke detection. The work suggests practical deployment in smartphone-based pre-hospital settings and points to future extensions for multi-class discrimination and NIHSS score estimation, broadening the applicability to diverse neurological conditions.
Abstract
Stroke is a major cause of mortality and disability worldwide, and one in four people are at risk of having one in their lifetime. Pre-hospital stroke assessment plays a vital role in identifying stroke patients accurately, which accelerates further examination and treatment in hospitals. Accordingly, the National Institutes of Health Stroke Scale (NIHSS), Cincinnati Pre-hospital Stroke Scale (CPSS), and Face Arm Speech Time (F.A.S.T.) are globally known tests for stroke assessment. However, the validity of these tests is questionable in the absence of neurologists, and access to healthcare may be limited. Therefore, in this study, we propose a motion-aware and multi-attention fusion network (MAMAF-Net) that can detect stroke from multimodal examination videos. In contrast to other studies on stroke detection from video analysis, our study is the first to propose an end-to-end solution that takes multiple video recordings of each subject, using a dataset that covers stroke, transient ischemic attack (TIA), and healthy controls. The proposed MAMAF-Net consists of motion-aware modules to sense the mobility of patients, attention modules to fuse the multi-input video data, and 3D convolutional layers to perform diagnosis from the attention-based extracted features. Experimental results on the collected Stroke-data dataset show that the proposed MAMAF-Net detects stroke with 93.62% sensitivity and a 95.33% AUC score.
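For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how sensitivity and AUC are commonly computed for a binary detector. It is not the authors' evaluation code. The function names, the label convention (1 for the positive class, 0 for the negative class), the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration and do not come from the Stroke-data dataset.

```python
def sensitivity(y_true, y_pred):
    """Fraction of actual positives (label 1) that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without positive samples")
    return tp / (tp + fn)


def auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive receives a higher score than a randomly chosen negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(sensitivity(y_true, y_pred))  # 3 of 4 positives detected -> 0.75
print(auc(y_true, scores))          # 15 of 16 pairs ranked correctly -> 0.9375
```

Sensitivity depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the two figures are reported separately.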
