PedSleepMAE: Generative Model for Multimodal Pediatric Sleep Signals
Saurav R. Pandey, Aaqib Saeed, Harlin Lee
TL;DR
The paper addresses the lack of pediatric-specific, multimodal sleep models by introducing PedSleepMAE, a transformer-based masked autoencoder trained with self-supervised learning on pediatric polysomnography (PSG) data. It shows that the learned 7680-dimensional embeddings encode diagnostic information for sleep staging and for detecting apnea-related events, and that the decoder can generate realistic multichannel sleep signals and impute missing channels. The study further shows that the embeddings can separate rare disorders such as Prader-Willi syndrome and support retrieval of representative signals, suggesting practical use in retrieval, outlier detection, and biomarker discovery. Because it is trained on open pediatric datasets, PedSleepMAE offers a scalable, versatile foundation for pediatric sleep informatics in both clinical and research settings.
Abstract
Pediatric sleep is an important but often overlooked area in health informatics. We present PedSleepMAE, a generative model that fully leverages multimodal pediatric sleep signals including multichannel EEGs, respiratory signals, EOGs and EMG. This masked autoencoder-based model performs comparably to supervised learning models in sleep scoring and in the detection of apnea, hypopnea, EEG arousal and oxygen desaturation. Its embeddings are also shown to capture subtle differences in sleep signals coming from a rare genetic disorder. Furthermore, PedSleepMAE generates realistic signals that can be used for sleep segment retrieval, outlier detection, and missing channel imputation. This is the first general-purpose generative model trained on multiple types of pediatric sleep signals.
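To make the retrieval and outlier-detection use cases concrete, the following is a minimal sketch of how fixed-length segment embeddings could be queried. It is not the authors' implementation: the cosine-similarity metric, the centroid-based outlier score, the helper names (`retrieve_top_k`, `outlier_scores`), and the 16-dimensional toy vectors are all assumptions made for illustration. The embeddings described in the paper are 7680-dimensional and come from the trained encoder.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as dissimilar to everything
    return dot / (norm_a * norm_b)


def retrieve_top_k(query, embeddings, k=3):
    """Return indices of the k embeddings most similar to the query (hypothetical helper)."""
    scored = [(cosine_similarity(query, e), i) for i, e in enumerate(embeddings)]
    scored.sort(key=lambda s: (-s[0], s[1]))  # highest similarity first, ties by index
    return [i for _, i in scored[:k]]


def outlier_scores(embeddings):
    """Score each embedding as 1 - cosine similarity to the mean embedding (assumed metric)."""
    n = len(embeddings)
    dim = len(embeddings[0])
    centroid = [sum(e[j] for e in embeddings) / n for j in range(dim)]
    return [1.0 - cosine_similarity(e, centroid) for e in embeddings]


# Toy stand-ins for encoder outputs; real embeddings would be 7680-dimensional.
rng = random.Random(0)
EMBED_DIM = 16
base = [rng.gauss(0, 1) for _ in range(EMBED_DIM)]
embeddings = [[x + rng.gauss(0, 0.1) for x in base] for _ in range(5)]
embeddings.append([-x for x in base])  # planted outlier pointing the opposite way

top = retrieve_top_k(embeddings[0], embeddings, k=3)
scores = outlier_scores(embeddings)
most_unusual = max(range(len(scores)), key=scores.__getitem__)
print("nearest segments:", top)
print("most unusual segment:", most_unusual)
```

In this toy setup the query retrieves itself first, and the planted segment receives the highest outlier score. With real embeddings, other distance measures or a nearest-neighbor index may be more appropriate.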
