SYNCS: Synthetic Data and Contrastive Self-Supervised Training for Central Sulcus Segmentation
Vladyslav Zalevskyi, Kristoffer Hougaard Madsen
TL;DR
Central sulcus (CS) segmentation in adolescent cohorts remains challenging due to high morphological variability and limited labeled data. The authors propose a data-efficient pipeline that combines synthetic data generation (SynthSeg) with self-supervised learning (SSL) based on SimCLR, plus a multi-task SSL variant, to learn representations of cortical morphology and adapt to new cohorts with minimal preprocessing. Results show that synthetic data can improve boundary accuracy, measured by Hausdorff distance (HD), on a different cohort, while SSL pre-training on a larger, diverse dataset improves Dice scores after fine-tuning; the multi-task approach offers no clear gains. Together, these strategies enable robust CS segmentation and morphometric analysis across cohorts, offering a practical path toward scalable sulcal analysis with little preprocessing.
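The two evaluation metrics named above can be illustrated with a minimal sketch, assuming each segmentation is represented as a set of integer voxel coordinates. The function names `dice` and `hausdorff` and the toy masks are hypothetical and are not taken from the paper; the authors' actual evaluation settings (for example voxel spacing or a percentile variant of HD) are not stated in this summary.

```python
import math


def dice(pred, ref):
    """Dice overlap between two sets of voxel coordinates (1.0 = identical)."""
    if not pred and not ref:
        # Convention assumed here: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * len(pred & ref) / (len(pred) + len(ref))


def hausdorff(pred, ref):
    """Symmetric Hausdorff distance between two non-empty voxel sets."""
    if not pred or not ref:
        raise ValueError("Hausdorff distance is undefined for an empty mask")

    def directed(a, b):
        # Largest distance from any point in a to its nearest point in b.
        return max(min(math.dist(p, q) for q in b) for p in a)

    return max(directed(pred, ref), directed(ref, pred))


# Toy example: the prediction is the reference shifted by one voxel along y.
ref_mask = {(0, 0, 0), (0, 1, 0), (0, 2, 0)}
pred_mask = {(0, 1, 0), (0, 2, 0), (0, 3, 0)}

dice_score = dice(pred_mask, ref_mask)    # overlap-based agreement
hd_value = hausdorff(pred_mask, ref_mask)  # worst-case boundary error, in voxels
```

Dice rewards volumetric overlap, while HD is driven by the single worst boundary error, which is why a method can improve one metric without improving the other.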
Abstract
Bipolar disorder (BD) and schizophrenia (SZ) are severe mental disorders with profound societal impact. Identifying risk markers early is crucial for understanding disease progression and enabling preventive measures. The Danish High Risk and Resilience Study (VIA) focuses on understanding early disease processes, particularly in children with familial high risk (FHR). Understanding the structural brain changes associated with these diseases at early stages is essential for effective interventions. The central sulcus (CS) is a prominent brain landmark adjacent to regions involved in motor and sensory processing, and analyzing its morphology can provide valuable insights into neurodevelopmental abnormalities in the FHR group. However, segmenting the CS is challenging because of its morphological variability, especially in adolescents. This study introduces two novel approaches to improve CS segmentation: synthetic data generation to model CS variability, and self-supervised pre-training with multi-task learning to adapt models to new cohorts. These methods aim to enhance segmentation performance across diverse populations while eliminating the need for extensive preprocessing.
