PSDNorm: Test-Time Temporal Normalization for Deep Learning in Sleep Staging
Théo Gnassounou, Antoine Collas, Rémi Flamary, Alexandre Gramfort
TL;DR
PSDNorm addresses distribution shifts in sleep staging caused by subject- and device-level variability. It is a normalization layer, applied at test time, that uses the power spectral density (PSD) of feature maps to account for temporal correlations. The layer combines three steps: PSD estimation with Welch's method, a running Riemannian barycenter of the PSDs, and an $F$-Monge mapping that aligns intermediate representations to a geodesic barycenter, effectively whitening and recoloring feature maps in the frequency domain. The method generalizes InstanceNorm (recovered at $F=1$ with identity recoloring) and works as a drop-in layer that performs test-time domain adaptation without retraining. Large-scale experiments across 10 sleep datasets with about $10^4$ subjects show state-of-the-art performance and improved data efficiency: PSDNorm ranks first in most settings and is robust across architectures such as U-Sleep and CNNTransformer. PSDNorm thus offers a practical, data-efficient way to handle domain shift in physiological signals and holds promise for biomedical applications beyond sleep staging.
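The three steps named above (Welch PSD estimation, a running barycenter, and a whitening-and-recoloring filter) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `welch_psd`, `update_barycenter`, and `psd_norm` are hypothetical, the code works on a single `(channels, time)` array rather than on batched feature maps inside a network, and the barycenter update (interpolating square roots of PSDs) is an assumption about the form the geodesic update takes for stationary signals.

```python
import numpy as np

def welch_psd(x, f):
    """Welch-style PSD of x with shape (channels, time): Hann window, 50% overlap.

    Returns a two-sided PSD of shape (channels, f). Assumes f == 1 or f >= 3.
    """
    step = max(f // 2, 1)
    window = np.hanning(f)  # np.hanning(1) is [1.], so f = 1 is allowed
    starts = range(0, x.shape[-1] - f + 1, step)
    segments = np.stack([x[:, s:s + f] * window for s in starts], axis=1)
    periodograms = np.abs(np.fft.fft(segments, axis=-1)) ** 2
    return periodograms.mean(axis=1) / np.sum(window ** 2)

def update_barycenter(running_psd, batch_psd, momentum=0.1):
    """Move the running barycenter toward batch_psd.

    Assumption: the update interpolates the square roots of the two PSDs.
    """
    if running_psd is None:
        return batch_psd
    root = (1 - momentum) * np.sqrt(running_psd) + momentum * np.sqrt(batch_psd)
    return root ** 2

def psd_norm(x, target_psd, f, eps=1e-8):
    """Filter each channel of x so that its PSD moves toward target_psd."""
    x = x - x.mean(axis=-1, keepdims=True)
    # Frequency gain: divide by the source PSD (whiten), multiply by the target (recolor).
    gain = np.sqrt((target_psd + eps) / (welch_psd(x, f) + eps))
    # Length-f filter taps, shifted so that the zero lag sits at index f // 2.
    taps = np.fft.fftshift(np.real(np.fft.ifft(gain, axis=-1)), axes=-1)
    n = x.shape[-1]
    return np.stack([
        np.convolve(xc, hc, mode="full")[f // 2 : f // 2 + n]
        for xc, hc in zip(x, taps)
    ])

# With f = 1 and a unit target, the sketch reduces to per-channel standardization,
# which mirrors the InstanceNorm special case mentioned above.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 2048)) * 3.0 + 1.0
x_out = psd_norm(x, target_psd=np.ones(1), f=1)
```

In this sketch the filter size `f` plays the role of $F$: larger values let the filter correct finer spectral detail, while `f = 1` only rescales the variance.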
Abstract
Distribution shift poses a significant challenge in machine learning, particularly in biomedical applications that use data collected across different subjects, institutions, and recording devices, such as sleep data. Existing normalization layers (BatchNorm, LayerNorm, and InstanceNorm) help mitigate distribution shifts, but when applied over the time dimension they ignore the dependencies and auto-correlation inherent to the vector coefficients they normalize. In this paper, we propose PSDNorm, which leverages Monge mapping and temporal context to normalize feature maps in deep learning models for signals. Notably, the proposed method operates as a test-time domain adaptation technique, addressing distribution shifts without additional training. Evaluations with architectures based on U-Net or transformer backbones, trained on 10K subjects across 10 datasets, show that PSDNorm achieves state-of-the-art performance on unseen left-out datasets while being 4 times more data-efficient than BatchNorm.
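The point about auto-correlation can be checked numerically. The sketch below, which assumes a simple synthetic signal and hypothetical helper names (`instance_norm`, `lag1_autocorr`), standardizes a correlated signal over time in the manner of InstanceNorm and shows that the mean and variance are normalized while the lag-1 auto-correlation is left unchanged.

```python
import numpy as np

def instance_norm(x, eps=1e-8):
    """Standardize a 1D signal over time (zero mean, unit variance)."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def lag1_autocorr(x):
    """Sample auto-correlation coefficient at lag 1."""
    xc = x - x.mean()
    return float(np.dot(xc[:-1], xc[1:]) / np.dot(xc, xc))

rng = np.random.default_rng(0)
w = rng.standard_normal(5000)
# Correlated signal with an offset and a scale: a moving average of white noise.
x = 3.0 + 2.0 * np.convolve(w, [1.0, 0.8], mode="valid")
z = instance_norm(x)

print("mean and std after normalization:", z.mean(), z.std())
print("lag-1 auto-correlation before and after:", lag1_autocorr(x), lag1_autocorr(z))
```

Because standardization only shifts and rescales the signal, any temporal structure that differs between subjects or devices passes through it untouched.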
