Multi-task deep-learning for sleep event detection and stage classification
Adriana Anido-Alonso, Diego Alvarez-Estevez
TL;DR
The paper tackles automated sleep analysis by jointly detecting sleep stages, EEG arousals, and respiratory events in a single pass over multivariate PSG sequences. It introduces a multi-task deep-learning framework that reframes multi-event detection as a one-shot time-series object-detection problem, using a CNN-LSTM backbone and 1D bounding windows matched by temporal intersection-over-union (IoU). Evaluations on SHHS (local) and HMC-ISA (external) show that a three-component loss improves training efficiency and performance, with strong sleep-stage detection but mixed cross-dataset generalization, particularly for arousal and respiratory events. The approach offers flexible input montages and a scalable output design that could streamline clinical PSG analysis and support broader adoption across diverse datasets.
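To make the notion of temporal IoU between 1D bounding windows concrete, the following is a minimal sketch, not the authors' implementation. It assumes each window is a `(start, end)` pair in seconds; the function name `temporal_iou` and this representation are hypothetical choices made for illustration.

```python
def temporal_iou(window_a, window_b):
    """Intersection-over-union of two 1D windows given as (start, end).

    Assumption: start <= end and both windows share the same time unit
    (e.g. seconds from the start of the recording).
    """
    a_start, a_end = window_a
    b_start, b_end = window_b

    # Length of the overlap; zero when the windows do not intersect.
    intersection = max(0.0, min(a_end, b_end) - max(a_start, b_start))

    # Union = sum of both lengths minus the overlap counted twice.
    union = (a_end - a_start) + (b_end - b_start) - intersection

    # Guard against two zero-length windows.
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    predicted = (0.0, 10.0)   # hypothetical predicted event window
    annotated = (5.0, 15.0)   # hypothetical annotated event window
    print(temporal_iou(predicted, annotated))
```

In an object-detection-style pipeline, a score like this is typically compared against a threshold to decide whether a predicted window matches an annotated event; the threshold value itself is not stated in this summary.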
Abstract
Polysomnographic sleep analysis is the standard clinical method to accurately diagnose and treat sleep disorders. It is an intricate process that involves the manual identification, classification, and location of multiple sleep event patterns. The task is complex because identifying different types of events requires focusing on different subsets of signals, resulting in an iterative, time-consuming process that entails several visual analysis passes. In this paper, we propose a multi-task deep-learning approach for the simultaneous detection of sleep events and hypnogram construction in a single pass. Taking as reference state-of-the-art methodology for object detection in the field of Computer Vision, we reformulate the problem for the analysis of multivariate time sequences, and more specifically for pattern detection in the sleep analysis scenario. We investigate the performance of the resulting method in identifying different assembly combinations of EEG arousals, respiratory events (apneas and hypopneas), and sleep stages, also considering different input signal montage configurations. Furthermore, we evaluate our approach using two independent datasets, assessing true generalization effects in local and external validation scenarios. Based on our results, we analyze and discuss our method's capabilities and its potential wide-ranging applicability across different settings and datasets.
