Mamba-based Deep Learning Approach for Sleep Staging on a Wireless Multimodal Wearable System without Electroencephalography

Andrew H. Zhang, Alex He-Mo, Richard Fei Yin, Chunlin Li, Yuzhi Tang, Dharmendra Gurve, Veronique van der Horst, Aron S. Buchman, Nasim Montazeri Ghahjaverestan, Maged Goubran, Bo Wang, Andrew S. P. Lim

TL;DR

The study addresses EEG-free sleep staging using the chest-and-finger ANNE One wearable in a clinical cohort of 357 adults. It introduces a Mamba-based RNN ensemble that fuses multi-sensor cardiovascular and motion signals to predict 3-, 4-, and 5-class sleep stages. The ensemble reaches a balanced accuracy of about 84% for 3 classes, 75% for 4 classes, and 65% for 5 classes, and the results point to a pivotal role for chest accelerometry. The approach remains robust across age and comorbidity profiles, suggesting practical utility for routine clinical monitoring and home-based sleep diagnostics without EEG.

Abstract

Study Objectives: We investigate a Mamba-based deep learning approach for sleep staging on signals from ANNE One (Sibel Health, Evanston, IL), a non-intrusive dual-module wireless wearable system measuring chest electrocardiography (ECG), triaxial accelerometry, and chest temperature, as well as finger photoplethysmography and finger temperature. Methods: We obtained wearable sensor recordings from 357 adults undergoing concurrent polysomnography (PSG) at a tertiary care sleep lab. Each PSG recording was manually scored, and these annotations served as ground truth labels for training and evaluation of our models. PSG and wearable sensor data were automatically aligned using their ECG channels, with manual confirmation by visual inspection. We trained a Mamba-based recurrent neural network architecture on these recordings and then ensembled model variants with similar architectures. Results: After ensembling, the model attains a 3-class (wake, non-rapid eye movement [NREM] sleep, rapid eye movement [REM] sleep) balanced accuracy of 84.02%, F1 score of 84.23%, Cohen's $κ$ of 72.89%, and a Matthews correlation coefficient (MCC) score of 73.00%; a 4-class (wake, light NREM [N1/N2], deep NREM [N3], REM) balanced accuracy of 75.30%, F1 score of 74.10%, Cohen's $κ$ of 61.51%, and MCC score of 61.95%; and a 5-class (wake, N1, N2, N3, REM) balanced accuracy of 65.11%, F1 score of 66.15%, Cohen's $κ$ of 53.23%, and MCC score of 54.38%. Conclusions: Our Mamba-based deep learning model can successfully infer major sleep stages from the ANNE One, a wearable system without electroencephalography (EEG), and can be applied to data from adults attending a tertiary care sleep clinic.
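To make the reported quantities concrete, the following is a minimal sketch of how the four metrics in the abstract can be computed from per-epoch labels, and how 5-class labels collapse into the 4-class and 3-class groupings the abstract defines. The label strings (`"W"`, `"Light"`, `"Deep"`, and so on), the function names, and the choice of macro-averaged F1 are assumptions made for illustration; the source does not show the authors' evaluation code.

```python
from collections import Counter
from math import sqrt

# Groupings follow the abstract: light NREM = N1/N2, deep NREM = N3.
# The label strings themselves are hypothetical.
TO_4 = {"W": "W", "N1": "Light", "N2": "Light", "N3": "Deep", "REM": "REM"}
TO_3 = {"W": "W", "N1": "NREM", "N2": "NREM", "N3": "NREM", "REM": "REM"}


def collapse(stages, mapping):
    """Map 5-class stage labels onto a coarser label set."""
    return [mapping[s] for s in stages]


def staging_metrics(y_true, y_pred):
    """Balanced accuracy, macro F1, Cohen's kappa, and multiclass MCC."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    pairs = Counter(zip(y_true, y_pred))  # confusion-matrix cells
    t = Counter(y_true)                   # epochs per true class
    p = Counter(y_pred)                   # epochs per predicted class
    correct = sum(pairs[(k, k)] for k in labels)

    # Balanced accuracy: mean per-class recall over classes present in y_true.
    recalls = [pairs[(k, k)] / t[k] for k in labels if t[k] > 0]
    balanced_accuracy = sum(recalls) / len(recalls)

    # Macro F1 (assumed averaging): F1_k = 2*TP / (true_k + predicted_k).
    f1s = []
    for k in labels:
        denom = t[k] + p[k]
        f1s.append(2 * pairs[(k, k)] / denom if denom else 0.0)
    macro_f1 = sum(f1s) / len(f1s)

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_observed = correct / n
    p_chance = sum(t[k] * p[k] for k in labels) / n**2
    kappa = (p_observed - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Multiclass Matthews correlation coefficient.
    numerator = correct * n - sum(t[k] * p[k] for k in labels)
    denominator = sqrt(
        (n**2 - sum(p[k] ** 2 for k in labels))
        * (n**2 - sum(t[k] ** 2 for k in labels))
    )
    mcc = numerator / denominator if denominator else 0.0

    return {
        "balanced_accuracy": balanced_accuracy,
        "macro_f1": macro_f1,
        "kappa": kappa,
        "mcc": mcc,
    }


# Toy example: the only errors are N1/N2 confusions.
y_true = ["W", "N1", "N2", "N3", "REM", "N2"]
y_pred = ["W", "N2", "N2", "N3", "REM", "N1"]
m5 = staging_metrics(y_true, y_pred)
m3 = staging_metrics(collapse(y_true, TO_3), collapse(y_pred, TO_3))
```

In this toy example the N1/N2 confusions lower the 5-class scores but disappear once both stages are merged into NREM, which is one reason coarser staging schemes tend to score higher than finer ones.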

Paper Structure

This paper contains 38 sections, 2 equations, 27 figures, 8 tables.

Figures (27)

  • Figure 1: A. The finger and chest modules of ANNE One (image by Sibel Health). B. Raw physiological signals recorded by ANNE One during a full-night sleep recording and its accompanying hypnogram. C. Demographics and sleep characteristics of the subjects used in this study.
  • Figure 2: A. Overview of the deep learning pipeline of this study. B. Implementation of the RNN architecture in this study. Batch normalizations precede each of the five linear layers in the MLPs. LeakyReLU is the activation function (see \ref{sec:detailed-architecture} for details). C. The inference-time ensembling pipeline.
  • Figure 3: Predicted sleep stages (hypnograms) and class probabilities (hypnodensities) for a typical single full-night recording in the test set for 5-class regular and ensemble models. The hypnogram prediction at each epoch is the class with the highest probability. Probabilities are coloured by ground truth sleep stage: darker to lighter colours represent classes as ordered in the hypnograms from bottom to top. The methods of calculating both hypnodensities can be found in \ref{sec:detailed-architecture}. An alternative method of calculating hypnodensities is shown in \ref{sec:ensemble-mode} (Figure \ref{fig:alternate-hypnodensity}).
  • Figure 4: Test set confusion matrices of the ensembled RNN model for 3-, 4-, and 5-class sleep staging (top); macro-evaluation metrics for all $n$-class models on the test set (bottom), where the best-performing model for each metric is bolded.
  • Figure 5: Confusion matrices and macro-evaluation metrics of 3-class, 4-class, and 5-class RNN ensemble models on a healthy subset of the test set ($n$ = 11) defined as age < 40 and AHI < 5 and PLMI < 5.
  • ...and 22 more figures
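
The Figure 3 caption states that the hypnogram at each epoch is the class with the highest probability in the hypnodensity. A minimal sketch of that step is shown below, together with one simple way of combining several models' hypnodensities. The averaging rule is an assumption made for illustration only: the captions here do not specify how the authors' ensemble combines its members, and the function names and stage labels are hypothetical.

```python
def average_hypnodensity(member_probs):
    """Average per-epoch class probabilities across ensemble members.

    member_probs: one entry per model, each a list over epochs of
    per-class probability lists. Plain averaging is an assumed rule,
    not necessarily the one used in the paper.
    """
    n_models = len(member_probs)
    n_epochs = len(member_probs[0])
    averaged = []
    for epoch in range(n_epochs):
        rows = [model[epoch] for model in member_probs]
        averaged.append([sum(col) / n_models for col in zip(*rows)])
    return averaged


def hypnogram(hypnodensity, stages):
    """Pick the highest-probability stage at each epoch."""
    return [
        stages[max(range(len(row)), key=row.__getitem__)]
        for row in hypnodensity
    ]


# Toy example with two models, two epochs, and three classes.
stages = ["W", "NREM", "REM"]
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]
model_b = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]

single = hypnogram(model_a, stages)
ensemble = hypnogram(average_hypnodensity([model_a, model_b]), stages)
```

Here the single model and the averaged ensemble disagree at both epochs, which illustrates how combining hypnodensities before taking the argmax can change the predicted hypnogram.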