BiT-MamSleep: Bidirectional Temporal Mamba for EEG Sleep Staging

Xinliang Zhou, Yuzhe Han, Zhisheng Chen, Chenyu Liu, Yi Ding, Ziyu Jia, Yang Liu

TL;DR

BiT-MamSleep combines a Triple-Resolution CNN for efficient multi-scale feature extraction with a Bidirectional Mamba mechanism that processes EEG data in both directions to model short- and long-term temporal dependencies. It significantly outperforms state-of-the-art methods on four public datasets.

Abstract

In this paper, we address the challenges in automatic sleep stage classification, particularly the high computational cost, inadequate modeling of bidirectional temporal dependencies, and class imbalance issues faced by Transformer-based models. To overcome these limitations, we propose BiT-MamSleep, a novel architecture that integrates the Triple-Resolution CNN (TRCNN) for efficient multi-scale feature extraction with the Bidirectional Mamba (BiMamba) mechanism, which models both short- and long-term temporal dependencies through bidirectional processing of EEG data. Additionally, BiT-MamSleep incorporates an Adaptive Feature Recalibration (AFR) module and a temporal enhancement block to dynamically refine feature importance, improving classification accuracy without increasing computational complexity. To further improve robustness, we apply Focal Loss and SMOTE to mitigate class imbalance. Extensive experiments on four public datasets demonstrate that BiT-MamSleep significantly outperforms state-of-the-art methods, particularly in handling long EEG sequences and addressing class imbalance, leading to more accurate and scalable sleep stage classification.
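
To make the role of Focal Loss in mitigating class imbalance concrete, here is a minimal sketch of the standard formulation, FL(p_t) = -α_t (1 - p_t)^γ log(p_t). The function name `focal_loss`, the default γ = 2, the example probabilities, and the five-class stage ordering are illustrative assumptions, not settings taken from the paper.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for a single sample (illustrative sketch).

    probs  : list of predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter; gamma = 0 recovers plain cross-entropy
    alpha  : optional per-class weights (hypothetical, not from the paper)
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

# Assumed five-class setting in the order W, N1, N2, N3, REM.
easy = [0.02, 0.02, 0.92, 0.02, 0.02]  # confident, correct N2 prediction
hard = [0.30, 0.25, 0.25, 0.10, 0.10]  # uncertain prediction for a true N1 epoch

loss_easy = focal_loss(easy, target=2)              # strongly down-weighted
loss_hard = focal_loss(hard, target=1)              # keeps most of its weight
ce_easy = focal_loss(easy, target=2, gamma=0.0)     # ordinary cross-entropy
```

The factor (1 - p_t)^γ shrinks the contribution of epochs the model already classifies confidently, so training gradients are dominated by harder examples, which in sleep staging often belong to under-represented stages.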

Paper Structure

This paper contains 24 sections, 8 equations, 4 figures, 2 tables, 1 algorithm.

Figures (4)

  • Figure 1: An overview of the BiT-MamSleep architecture, which consists of two main functional components: the feature extraction module and the Mamba mechanism. The feature extraction module extracts relevant features from raw EEG signals. The Mamba module then captures and focuses on the most salient information, after which the classification module, enhanced by custom imbalanced-data handling techniques and dynamic learning rate adjustment strategies, outputs the predicted sleep stage.
  • Figure 2: The structure of the feature extraction modules. Each CNN unit is represented as (Out Channels, Kernel Size, Padding). The shaded areas represent layers that do not participate in parameter updates (e.g., Dropout layers); they do not adjust model weights during training.
  • Figure 3: The Mamba Mechanism distinguishes itself from conventional CNNs and RNNs by its ability to efficiently capture long-range temporal dependencies while minimizing computational overhead, a common limitation in traditional models.
  • Figure 4: Ablation study conducted on the Sleep-EDF-20 dataset.
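
The idea of bidirectional processing behind the Mamba mechanism can be illustrated with a toy example. The sketch below is not the paper's BiMamba block: it replaces Mamba's input-dependent (selective) state space parameters with fixed scalars `a` and `b`, and it combines the two directions by simple addition, both of which are assumptions made purely for illustration.

```python
def linear_scan(xs, a=0.9, b=0.1):
    """Run the recurrence h_t = a * h_{t-1} + b * x_t over a sequence."""
    h, out = 0.0, []
    for x in xs:
        h = a * h + b * x
        out.append(h)
    return out

def bidirectional_scan(xs, a=0.9, b=0.1):
    """Combine a forward scan with a scan over the time-reversed sequence."""
    fwd = linear_scan(xs, a, b)
    bwd = linear_scan(xs[::-1], a, b)[::-1]  # re-align to original time order
    return [f + g for f, g in zip(fwd, bwd)]

signal = [0.0, 0.0, 1.0, 0.0, 0.0]  # a single impulse at t = 2
fwd_only = linear_scan(signal)       # positions before t = 2 stay at zero
both = bidirectional_scan(signal)    # every position sees the impulse
```

A forward-only scan carries no information about the impulse to earlier time steps, whereas the bidirectional output is non-zero on both sides of it. This is the property that lets a bidirectional model use both past and future context when labeling an EEG epoch.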