PedSleepMAE: Generative Model for Multimodal Pediatric Sleep Signals

Saurav R. Pandey, Aaqib Saeed, Harlin Lee

TL;DR

The paper addresses the lack of pediatric-specific, multimodal sleep modeling by introducing PedSleepMAE, a transformer-based masked autoencoder trained with self-supervised learning on pediatric PSG data. It demonstrates that the learned $7680$-dimensional embeddings encode diagnostic information for sleep staging and apnea-related events, and that the decoder can generate realistic, multi-channel sleep signals as well as impute missing channels. The study further shows that embeddings can separate rare disorders like Prader-Willi syndrome and enable retrieval of representative signals, suggesting practical utilities in retrieval, outlier detection, and biomarker discovery. By leveraging open pediatric datasets, PedSleepMAE provides a scalable, versatile foundation for pediatric sleep informatics with potential clinical and research impact.

Abstract

Pediatric sleep is an important but often overlooked area in health informatics. We present PedSleepMAE, a generative model that fully leverages multimodal pediatric sleep signals including multichannel EEGs, respiratory signals, EOGs and EMG. This masked autoencoder-based model performs comparably to supervised learning models in sleep scoring and in the detection of apnea, hypopnea, EEG arousal and oxygen desaturation. Its embeddings are also shown to capture subtle differences in sleep signals coming from a rare genetic disorder. Furthermore, PedSleepMAE generates realistic signals that can be used for sleep segment retrieval, outlier detection, and missing channel imputation. This is the first general-purpose generative model trained on multiple types of pediatric sleep signals.

Paper Structure

This paper contains 16 sections, 14 figures, 3 tables.

Figures (14)

  • Figure 1: An overview of the PedSleepMAE framework.
  • Figure 2: UMAP visualizations suggest that PedSleepMAE embeddings are clustered by sleep events. Each point represents 30 seconds of sleep, and each plot corresponds to one PSG. PSGs are selected as follows: (a) one PSG chosen randomly from all patients, (b) one PSG from the top 5 with the highest apnea occurrences, and (c) one PSG from those with 5 to 30 cases of hypopnea. This selection avoids cherry-picking while still ensuring that there are enough sleep events and showing some breadth. See the appendix for more visualizations.
  • Figure 3: Sleep scoring is accurate in Wake, N2 and N3, while N1 and REM are often misclassified as N2. Each row adds to 100% in this normalized confusion matrix.
  • Figure 4: Comparison of silhouette scores with 95% confidence intervals for PWS/non-PWS clusters and random clusters. Higher values indicate better clustering, with +1 being perfect.
  • Figure 5: Pairwise Euclidean distance of the embeddings is highly correlated ($\rho=0.93$) with pairwise distance of the generated signals. Each point corresponds to a pair of 30s sleep segments. Data from one random patient.
  • ...and 9 more figures
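
The comparison described in the Figure 5 caption (pairwise Euclidean distances between embeddings versus pairwise distances between the corresponding generated signals) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the toy data, and the random linear "decoder" are hypothetical, and the Pearson coefficient is an assumption, since the caption does not say which correlation $\rho$ denotes.

```python
import numpy as np


def pairwise_euclidean(x):
    """Euclidean distance for every unordered pair of rows in x (shape: n x d)."""
    diff = x[:, None, :] - x[None, :, :]          # (n, n, d) differences
    dist = np.sqrt((diff ** 2).sum(axis=-1))      # (n, n) distance matrix
    upper = np.triu_indices(len(x), k=1)          # each pair counted once
    return dist[upper]


def distance_correlation(embeddings, signals):
    """Pearson correlation (an assumption) between the two sets of pairwise distances."""
    d_emb = pairwise_euclidean(embeddings)
    # Flatten each signal (e.g. channels x time) into a single vector per segment.
    d_sig = pairwise_euclidean(signals.reshape(len(signals), -1))
    return np.corrcoef(d_emb, d_sig)[0, 1]


# Toy stand-ins: 20 "segments" with 8-dimensional embeddings, and signals
# produced by a hypothetical linear decoder plus a little noise.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(20, 8))
decoder = rng.normal(size=(8, 50))
signals = embeddings @ decoder + 0.01 * rng.normal(size=(20, 50))

rho = distance_correlation(embeddings, signals)
print(f"correlation of pairwise distances: {rho:.2f}")
```

Each point in such a scatter plot is one pair of segments, so n segments yield n(n-1)/2 points. The broadcasting approach above builds an n x n x d array, which is fine for a single patient's recording but would need a more memory-efficient routine for large collections.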