A Novel Transformer-Based Self-Supervised Learning Method to Enhance Photoplethysmogram Signal Artifact Detection
Thanh-Dung Le, Clara Macabiau, Kévin Albert, Philippe Jouvet, Rita Noumeir
TL;DR
This work tackles the challenge of detecting motion artifacts in pediatric PPG signals under limited labeled data. It introduces a two-stage framework that pre-trains a Transformer encoder on unlabeled PPG data using self-supervised tasks (Masking, Contrastive learning, and DINO) and then fine-tunes on a small labeled set, aiming to maximize representation quality with few annotations. A key contribution is the Smooth InfoNCE contrastive loss, designed to stabilize training and improve convergence in data-scarce settings. Empirical results show SSL-augmented Transformers substantially outperform fully supervised baselines, with contrastive and DINO SSL approaches delivering the strongest gains, demonstrating practical potential for robust artifact detection in PICU environments with limited annotations. The findings support broader adoption of SSL in clinical signal processing, enabling more reliable downstream decisions when annotated data are scarce.
Abstract
Recent research at CHU Sainte-Justine's Pediatric Critical Care Unit (PICU) has revealed that traditional machine learning methods, such as semi-supervised label propagation and K-nearest neighbors, outperform Transformer-based models in artifact detection from PPG signals, particularly when labeled data are limited. This study addresses the underutilization of abundant unlabeled data by employing self-supervised learning (SSL) to extract latent features from these data, followed by fine-tuning on labeled data. Our experiments demonstrate that SSL significantly enhances the Transformer model's ability to learn representations, improving its robustness in artifact classification tasks. Among the SSL techniques evaluated, namely masking, contrastive learning, and DINO (self-distillation with no labels), contrastive learning exhibited the most stable and superior performance on small PPG datasets. Further, we examine the optimization of contrastive loss functions, which are crucial for contrastive SSL. Inspired by InfoNCE, we introduce a novel contrastive loss function that facilitates smoother training and better convergence, thereby enhancing performance in artifact classification. In summary, this study establishes the efficacy of SSL in leveraging unlabeled data, particularly in enhancing the capabilities of the Transformer model. This approach holds promise for broader applications in PICU environments, where annotated data are often limited.
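
The proposed contrastive loss is described as inspired by InfoNCE. As a point of reference only, the following is a minimal NumPy sketch of the standard InfoNCE loss, not the smoother variant introduced in this work, whose exact form is not given in the abstract. The function name `info_nce`, the `temperature` default, and the assumption that row i of each batch holds the embedding of one augmented view of the same PPG segment are illustrative choices, not details taken from the paper.

```python
import numpy as np


def info_nce(z_a, z_b, temperature=0.1):
    """Standard InfoNCE loss between two batches of embeddings (a sketch).

    Assumption: z_a[i] and z_b[i] are embeddings of two views of the same
    segment (a positive pair); every other row of z_b serves as a negative.
    Both arrays have shape (N, D).
    """
    # L2-normalise each row so the dot product becomes cosine similarity.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarities scaled by the temperature, shape (N, N).
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-probability.
    return float(-np.mean(np.diag(log_prob)))


# Hypothetical usage with random embeddings standing in for encoder outputs.
rng = np.random.default_rng(0)
view_1 = rng.normal(size=(8, 16))
view_2 = view_1 + 0.05 * rng.normal(size=(8, 16))  # slightly perturbed second view
loss = info_nce(view_1, view_2)
```

The loss is small when each embedding is most similar to its own positive and grows as positives become indistinguishable from negatives; when all embeddings are identical it equals log N for a batch of size N.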
