PSG-MAE: Robust Multitask Sleep Event Monitoring using Multichannel PSG Reconstruction and Inter-channel Contrastive Learning

Yifei Wang; Qi Liu; Fuli Min; Honghao Wang

TL;DR

PSG-MAE addresses data scarcity and cross-dataset generalization in automated sleep monitoring by pre-training a robust PSG encoder via self-supervised learning. It introduces complementary masking across channels, a channel-level reconstruction loss combining cosine similarity with MSE, and inter-channel contrastive learning to capture temporal and inter-channel relationships. When fine-tuned with downstream feature decomposers, the pre-trained encoder achieves strong sleep staging and obstructive sleep apnea detection performance, demonstrating robustness across datasets. This framework enables multitask sleep assessment from multichannel PSG data using unlabeled records, offering a scalable approach for comprehensive sleep analysis.

Abstract

Polysomnography (PSG) signals are essential for studying sleep processes and diagnosing sleep disorders. Analyzing PSG data through deep neural networks (DNNs) for automated sleep monitoring has become increasingly feasible. However, the limited availability of datasets for certain sleep events often leads to DNNs that focus on a single task and are trained on a single-source dataset. As a result, these models struggle to transfer to new sleep events and lack robustness when applied to new datasets. To address these challenges, we propose PSG-MAE, a masked autoencoder (MAE)-based pre-training framework. By performing self-supervised learning on a large volume of unlabeled PSG data, PSG-MAE develops a robust feature extraction network that can be broadly applied to various sleep event monitoring tasks. Unlike conventional MAEs, PSG-MAE generates complementary masks across PSG channels, integrates a multichannel signal reconstruction method, and employs a self-supervised inter-channel contrastive learning (ICCL) strategy. This approach enables the encoder to capture temporal features from each channel while simultaneously learning latent relationships between channels, thereby enhancing the utilization of multichannel information. Experimental results show that PSG-MAE effectively captures both temporal details and inter-channel information from PSG signals. When the encoder pre-trained through PSG-MAE is fine-tuned with downstream feature decomposition networks, it achieves an accuracy of 83.7% for sleep staging and 90.45% for detecting obstructive sleep apnea, which highlights the framework's robustness and broad applicability.
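To make two of the ideas above concrete, the following is a minimal NumPy sketch of complementary masking across channels and of a channel-level reconstruction loss that combines MSE with cosine similarity. It is an illustration under stated assumptions, not the authors' implementation: the alternating even/odd channel scheme, the 50% mask ratio, the weight `alpha`, and the function names `complementary_masks` and `channel_reconstruction_loss` are all hypothetical, since the abstract does not specify them.

```python
import numpy as np


def complementary_masks(n_channels, n_patches, rng):
    """Boolean mask of shape (n_channels, n_patches); True marks a hidden patch.

    Assumption: even-indexed channels hide one random subset of patches and
    odd-indexed channels hide the complement, so every time patch remains
    visible in at least one channel.
    """
    base = rng.random(n_patches) < 0.5
    mask = np.empty((n_channels, n_patches), dtype=bool)
    mask[0::2] = base    # even channels hide the `base` patches
    mask[1::2] = ~base   # odd channels hide the remaining patches
    return mask


def channel_reconstruction_loss(target, recon, alpha=0.5, eps=1e-8):
    """Per-channel MSE plus (1 - cosine similarity), averaged over channels.

    target, recon: arrays of shape (n_channels, n_samples).
    alpha is a hypothetical weight balancing the two terms.
    """
    mse = np.mean((target - recon) ** 2, axis=1)
    dot = np.sum(target * recon, axis=1)
    norms = np.linalg.norm(target, axis=1) * np.linalg.norm(recon, axis=1)
    cos = dot / (norms + eps)
    return float(np.mean(alpha * mse + (1.0 - alpha) * (1.0 - cos)))


rng = np.random.default_rng(0)
mask = complementary_masks(n_channels=4, n_patches=10, rng=rng)

x = rng.standard_normal((4, 300))  # stand-in for a 4-channel PSG segment
loss_same = channel_reconstruction_loss(x, x)
loss_noisy = channel_reconstruction_loss(x, x + rng.standard_normal(x.shape))
```

In this sketch the MSE term penalizes amplitude errors while the cosine term penalizes mismatched waveform shape, so a perfect reconstruction drives both terms to approximately zero and a noisy one increases both.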
