AbsoluteNet: A Deep Learning Neural Network to Classify Cerebral Hemodynamic Responses of Auditory Processing
Behtom Adeli, John Mclinden, Pankaj Pandey, Ming Shao, Yalda Shahriari
TL;DR
This work introduces AbsoluteNet, a dual-stream CNN designed to classify single-trial fNIRS hemodynamic responses to auditory stimuli. The model extracts spatial-temporal and temporal-spatial features in separate streams, fuses them through a separable convolution block, and employs symmetrical activation functions. On an auditory oddball dataset it achieves state-of-the-art performance, reaching $87.0\%$ accuracy, $84.81\%$ sensitivity, and $89.21\%$ specificity with concatenated oxygenated ($HbO_2$) and deoxygenated ($HbR$) hemoglobin signals. An extensive ablation study confirms the importance of the dual streams, the fusion blocks, and the activation strategy, while a hyperparameter search based on a genetic algorithm (GA) further improves performance. The results highlight the potential of spatio-temporal feature aggregation in fNIRS-based neural decoding and suggest avenues for multimodal extensions and larger-scale validation.
Abstract
In recent years, deep learning (DL) approaches have shown promising results in decoding hemodynamic responses captured by functional near-infrared spectroscopy (fNIRS), particularly in brain-computer interface (BCI) applications. This work introduces AbsoluteNet, a novel deep learning architecture designed to classify auditory event-related responses recorded using fNIRS. The proposed network is built on spatio-temporal convolution and customized activation functions. We compared AbsoluteNet against four existing models: fNIRSNET, MDNN, DeepConvNet, and ShallowConvNet. AbsoluteNet outperformed all of them, reaching 87.0% accuracy, 84.8% sensitivity, and 89.2% specificity in binary classification and surpassing fNIRSNET, the second-best model, by 3.8% in accuracy. These findings underscore the effectiveness of the proposed model in decoding hemodynamic responses related to auditory processing and highlight the importance of spatio-temporal feature aggregation and customized activation functions that better fit fNIRS dynamics.
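For readers less familiar with the three reported metrics, the following minimal sketch shows how accuracy, sensitivity, and specificity are conventionally computed for a binary classifier. It is not the authors' evaluation code: the function name `binary_metrics`, the choice of `1` as the positive class, and the toy labels are assumptions made purely for illustration.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels.

    Hypothetical helper for illustration; `positive` marks the class
    treated as the target (assumed here to be label 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one trial is required")

    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # positive trial correctly detected
            else:
                fn += 1  # positive trial missed
        else:
            if pred == positive:
                fp += 1  # negative trial flagged as positive
            else:
                tn += 1  # negative trial correctly rejected

    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Sensitivity (true positive rate): fraction of positive trials detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity (true negative rate): fraction of negative trials rejected.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity


# Toy labels (made up for this example): 4 positive and 6 negative trials.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
acc, sens, spec = binary_metrics(y_true, y_pred)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Reporting sensitivity and specificity alongside accuracy is useful when the two classes are not equally frequent, since accuracy alone can hide poor performance on the rarer class.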
