CRCC: Contrast-Based Robust Cross-Subject and Cross-Site Representation Learning for EEG
Xiaobin Wong, Zhonghua Zhao, Haoran Guo, Zhengyi Liu, Yu Wu, Feng Yan, Zhiren Wang, Sen Song
TL;DR
CRCC addresses cross-site EEG generalization by decomposing domain shift into three bias factors and training in two stages. Pretraining combines multi-dataset masked reconstruction with domain-discriminator signals; fine-tuning then applies cross-subject/cross-site contrastive learning and site-adversarial optimization to learn domain-invariant neural representations. On a self-constructed seven-site MDD/HC dataset, CRCC consistently outperforms state-of-the-art baselines and generalizes to unseen sites, with a 10.7 percentage-point gain in balanced accuracy under strict zero-shot site transfer. The work contributes a principled bias-aware framework and a multi-site benchmark that enhance the clinical reliability of EEG biomarkers for depression screening.
Abstract
EEG-based neural decoding models often fail to generalize across acquisition sites due to structured, site-dependent biases implicitly exploited during training. We reformulate cross-site clinical EEG learning as a bias-factorized generalization problem, in which domain shifts arise from multiple interacting sources. We identify three fundamental bias factors and propose a general training framework that mitigates their influence through data standardization and representation-level constraints. We construct a standardized multi-site EEG benchmark for Major Depressive Disorder and introduce CRCC, a two-stage training paradigm combining encoder-decoder pretraining with joint fine-tuning via cross-subject/site contrastive learning and site-adversarial optimization. CRCC consistently outperforms state-of-the-art baselines and achieves a 10.7 percentage-point improvement in balanced accuracy under strict zero-shot site transfer, demonstrating robust generalization to unseen environments.
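
As an illustration of the evaluation setting named above, the following minimal sketch shows one common way to set up zero-shot site transfer (hold out every recording from one site, train on the rest) and to compute balanced accuracy as the mean of per-class recall. The function names `leave_one_site_out` and `balanced_accuracy`, the site labels, and the toy predictions are assumptions made for this example. They are not taken from the CRCC code or data, and the paper's exact protocol may differ.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; insensitive to class imbalance."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def leave_one_site_out(sites):
    """Yield (held_out_site, train_idx, test_idx) for each distinct site.

    The held-out site contributes no samples to training, which is the
    sense of "zero-shot site transfer" assumed in this sketch.
    """
    for held_out in sorted(set(sites)):
        train_idx = [i for i, s in enumerate(sites) if s != held_out]
        test_idx = [i for i, s in enumerate(sites) if s == held_out]
        yield held_out, train_idx, test_idx


# Toy data (hypothetical): one site label per recording, 1 = MDD, 0 = HC.
sites = ["A", "A", "B", "B", "C", "C"]
labels = [1, 0, 1, 0, 1, 1]
preds = [1, 0, 0, 0, 1, 1]  # placeholder predictions, not model output

for site, train_idx, test_idx in leave_one_site_out(sites):
    y_true = [labels[i] for i in test_idx]
    y_pred = [preds[i] for i in test_idx]
    print(site, len(train_idx), balanced_accuracy(y_true, y_pred))
```

Balanced accuracy is used here instead of plain accuracy because the class ratio between patients and controls can differ from site to site, and plain accuracy would reward a model that simply predicts the majority class of the held-out site.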
