MSA-CNN: A Lightweight Multi-Scale CNN with Attention for Sleep Stage Classification
Stephan Goerttler, Yucheng Wang, Emadeldeen Eldele, Min Wu, Fei He
TL;DR
This work introduces MSA-CNN, a lightweight multi-scale CNN with attention for sleep stage classification on multivariate polysomnography signals. It combines a novel Multi-Scale Module (MSM) based on complementary pooling, a global spatial convolution, and a Temporal Context Module (TCM) based on multi-head self-attention, capturing contextual dependencies while keeping the parameter count small. Across three public datasets, the large MSA-CNN outperforms nine state-of-the-art baselines in accuracy and Cohen's kappa, often with an order of magnitude fewer parameters. Ablation studies confirm the contributions of the MSM, the TCM, and multivariate inputs. A visualization tool aids interpretability by illustrating incoming and outgoing attention. The results suggest practical potential for deployment in clinical and resource-constrained settings; future work will explore unsupervised learning and waveform-level explanations.
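Because Cohen's kappa is reported alongside accuracy, a minimal sketch of the standard definition, kappa = (p_o - p_e) / (1 - p_e), may help readers unfamiliar with the metric. The function name `cohens_kappa` and the toy stage labels below are illustrative assumptions and are not taken from the authors' evaluation code.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Chance-corrected agreement between two label sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(y_true)
    # Observed agreement p_o: fraction of epochs where both labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Expected agreement p_e: agreement if labels were drawn independently
    # from each sequence's own class frequencies.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        # Degenerate case (a single class everywhere); returning 1.0 is a
        # convention chosen for this sketch only.
        return 1.0
    return (p_o - p_e) / (1.0 - p_e)


# Toy hypnogram with made-up labels (not data from the paper).
y_true = ["W", "N1", "N2", "N2", "N3", "REM", "N2", "W"]
y_pred = ["W", "N2", "N2", "N2", "N3", "REM", "N1", "W"]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
kappa = cohens_kappa(y_true, y_pred)
print(f"accuracy={accuracy:.3f}, kappa={kappa:.3f}")
```

Kappa is lower than accuracy here because part of the observed agreement would be expected by chance given the class frequencies, which is why it is a useful companion metric for imbalanced sleep stage distributions.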
Abstract
Recent advancements in machine learning-based signal analysis, coupled with open data initiatives, have fuelled efforts in automatic sleep stage classification. Despite the proliferation of classification models, few have prioritised reducing model complexity, which is a crucial factor for practical applications. In this work, we introduce the Multi-Scale and Attention Convolutional Neural Network (MSA-CNN), a lightweight architecture featuring as few as ~10,000 parameters. MSA-CNN leverages a novel multi-scale module employing complementary pooling to eliminate redundant filter parameters and dense convolutions. Model complexity is further reduced by separating temporal and spatial feature extraction and using cost-effective global spatial convolutions. This separation of tasks also mirrors the approach used by human experts in sleep stage scoring. We evaluated both small and large configurations of MSA-CNN against nine state-of-the-art baseline models across three public datasets, treating univariate and multivariate models separately. Our evaluation, based on repeated cross-validation and re-evaluation of all baseline models, demonstrated that the large MSA-CNN outperformed all baseline models on all three datasets in terms of accuracy and Cohen's kappa, despite its significantly reduced parameter count. Lastly, we explored various model variants and conducted an in-depth analysis of the key modules and techniques, providing deeper insights into the underlying mechanisms. The code for our models, baselines, and evaluation procedures is available at https://github.com/sgoerttler/MSA-CNN.
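To illustrate why pooling can stand in for longer filters, the following back-of-the-envelope sketch compares the parameter counts of two ways to cover several temporal scales. It rests on one plausible reading of "complementary pooling": pool by a scale-specific factor before a short shared-size convolution, then pool by the complementary factor afterwards so that all branches end at the same length. The scales, filter count, kernel size, and epoch length are hypothetical values chosen for illustration; the actual configuration is in the authors' repository.

```python
def conv1d_params(in_channels, out_channels, kernel_size, bias=True):
    """Number of trainable parameters in a standard 1-D convolution."""
    weights = in_channels * out_channels * kernel_size
    return weights + (out_channels if bias else 0)


# Hypothetical settings (assumptions, not taken from the paper).
scales = [1, 2, 4, 8]   # temporal downsampling factor per branch
filters = 16            # output filters per branch
base_kernel = 8         # kernel length at the finest scale
epoch_len = 3000        # e.g. a 30 s epoch sampled at 100 Hz

# (a) Conventional multi-scale: lengthen the kernel to cover coarser scales.
wide_kernel_params = sum(
    conv1d_params(1, filters, base_kernel * s) for s in scales
)

# (b) Pool first, then reuse the short kernel at every scale. Average pooling
# has no trainable parameters, so only the short convolutions are counted.
pooled_params = sum(conv1d_params(1, filters, base_kernel) for _ in scales)

# Complementary pooling: pre-pool by s, post-pool by max(scales) // s, so the
# total downsampling is identical across branches (assuming "same" padding).
total_factor = max(scales)
out_lengths = [(epoch_len // s) // (total_factor // s) for s in scales]

print("wide kernels:", wide_kernel_params)
print("pooled branches:", pooled_params)
print("branch output lengths:", out_lengths)
```

Under these assumed settings the pooled variant needs fewer than a third of the filter parameters, and every branch produces an output of equal length, so the branches can be concatenated without further alignment.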
