Self-Supervised Learning Using Nonlinear Dependence
M. Hadi Sepanj, Benyamin Ghojogh, Paul Fieguth
TL;DR
CDSSL is a unified self-supervised learning (SSL) framework that combines linear correlation with nonlinear dependence, measured by the Hilbert-Schmidt Independence Criterion (HSIC) in a Reproducing Kernel Hilbert Space (RKHS), to learn robust image representations. By decomposing the loss into eight terms that cover sample-wise and feature-wise, auto- and cross-interactions, CDSSL reduces redundancy while aligning augmented views, which improves linear and nonlinear downstream performance. Experiments on MNIST, CIFAR, STL-10, ImageNet-100, and domain-adaptation tasks show CDSSL consistently outperforming VICReg, Barlow Twins, SimCLR, and SSL-HSIC, and approaching or surpassing state-of-the-art methods such as DINO on select benchmarks. The work demonstrates the practical impact of incorporating nonlinear dependence via HSIC into SSL, offering a general framework that subsumes several existing SSL strategies and yields more discriminative, transferable representations.
Abstract
Self-supervised learning (SSL) has gained significant attention in contemporary applications, particularly due to the scarcity of labeled data. While existing SSL methods primarily address feature variance and linear correlations, they often neglect the relations between samples and the nonlinear dependencies inherent in complex data, which are especially prevalent in high-dimensional visual data. In this paper, we introduce Correlation-Dependence Self-Supervised Learning (CDSSL), a novel framework that unifies and extends existing SSL paradigms by integrating both linear correlations and nonlinear dependencies, covering sample-wise and feature-wise interactions. Our approach uses the Hilbert-Schmidt Independence Criterion (HSIC) to capture nonlinear dependencies within a Reproducing Kernel Hilbert Space (RKHS), enriching representation learning. Experimental evaluations on diverse benchmarks demonstrate the efficacy of CDSSL in improving representation quality.
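To make the distinction between linear correlation and nonlinear dependence concrete, the following is a minimal sketch of the standard biased empirical HSIC estimator, tr(KHLH)/(n-1)^2, where K and L are Gram matrices of the two variables and H is the centering matrix. It is an illustration only and not the CDSSL loss: the Gaussian kernel, the bandwidth `sigma=1.0`, the function names `gaussian_gram` and `hsic_biased`, and the toy data are all assumptions made here. The toy example uses y = x^2 on a symmetric grid, a case where Pearson correlation vanishes even though y is fully determined by x.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gaussian (RBF) Gram matrix for the rows of x, shape (n, d)."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic_biased(x, y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2.

    The kernel choice and bandwidth are assumptions of this sketch.
    """
    n = x.shape[0]
    K = gaussian_gram(x, sigma)
    L = gaussian_gram(y, sigma)
    H = np.eye(n) - np.full((n, n), 1.0 / n)  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


if __name__ == "__main__":
    n = 200
    x = np.linspace(-1.0, 1.0, n).reshape(-1, 1)  # symmetric grid
    y = x ** 2                                    # nonlinear function of x

    # Shuffling y breaks the pairing and serves as a rough independence baseline.
    rng = np.random.default_rng(0)
    y_shuffled = y[rng.permutation(n)]

    pearson = np.corrcoef(x[:, 0], y[:, 0])[0, 1]
    print("Pearson correlation (x, x^2):", pearson)
    print("HSIC (x, x^2):               ", hsic_biased(x, y))
    print("HSIC (x, shuffled x^2):      ", hsic_biased(x, y_shuffled))
```

In this sketch the paired HSIC value should exceed the shuffled baseline while the Pearson correlation stays at numerical zero. The estimator builds full n-by-n Gram matrices, so memory grows quadratically with the number of samples, and the result depends on the chosen bandwidth.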
