Self-Supervised Learning Using Nonlinear Dependence

M. Hadi Sepanj, Benyamin Ghojogh, Paul Fieguth

TL;DR

CDSSL is a unified self-supervised framework that combines linear correlation with nonlinear dependence, measured by HSIC in an RKHS, to learn robust image representations. Its loss decomposes into eight terms covering sample-wise and feature-wise, auto- and cross- interactions, so that it reduces redundancy while aligning augmented views, which improves both linear and nonlinear downstream performance. Across MNIST, CIFAR, STL-10, ImageNet-100, and domain-adaptation tasks, CDSSL consistently outperforms VICReg, Barlow Twins, SimCLR, and SSL-HSIC, and approaches or surpasses state-of-the-art methods such as DINO on select benchmarks. The work shows the practical value of incorporating nonlinear dependence via HSIC into SSL, offering a general framework that subsumes several existing SSL strategies and yields more discriminative, transferable representations.

Abstract

Self-supervised learning has gained significant attention in contemporary applications, particularly due to the scarcity of labeled data. While existing SSL methodologies primarily address feature variance and linear correlations, they often neglect the intricate relations between samples and the nonlinear dependencies inherent in complex data, which are especially prevalent in high-dimensional visual data. In this paper, we introduce Correlation-Dependence Self-Supervised Learning (CDSSL), a novel framework that unifies and extends existing SSL paradigms by integrating both linear correlations and nonlinear dependencies, encapsulating sample-wise and feature-wise interactions. Our approach incorporates the Hilbert-Schmidt Independence Criterion (HSIC) to robustly capture nonlinear dependencies within a Reproducing Kernel Hilbert Space, enriching representation learning. Experimental evaluations on diverse benchmarks demonstrate the efficacy of CDSSL in improving representation quality.
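To make the distinction between linear correlation and nonlinear dependence concrete, here is a minimal sketch of the standard biased empirical HSIC estimator, tr(KHLH)/(n-1)^2, in NumPy. It is an illustration only, not the CDSSL loss: the Gaussian kernel, the bandwidth `sigma`, the function names `rbf_kernel` and `hsic`, and the toy data are all assumptions made for this example, and the abstract does not specify which kernel or estimator the paper uses.

```python
import numpy as np


def rbf_kernel(X, sigma=1.0):
    """Gaussian (RBF) Gram matrix for the rows of X (kernel choice is an assumption)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(X, Y, sigma=1.0):
    """Biased empirical HSIC: trace(K H L H) / (n - 1)^2."""
    n = X.shape[0]
    K = rbf_kernel(X, sigma)
    L = rbf_kernel(Y, sigma)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.uniform(-1.0, 1.0, size=(400, 1))
    y_dep = x ** 2 + 0.05 * rng.normal(size=(400, 1))  # nonlinear in x
    y_ind = rng.uniform(-1.0, 1.0, size=(400, 1))      # independent of x

    print("Pearson corr(x, x^2):", np.corrcoef(x[:, 0], y_dep[:, 0])[0, 1])
    print("HSIC(x, x^2):        ", hsic(x, y_dep))
    print("HSIC(x, independent):", hsic(x, y_ind))
```

In this toy setup `y_dep` is a function of `x` yet nearly uncorrelated with it, so a purely correlation-based objective would see almost nothing, while the kernel-based statistic is expected to be clearly larger than for the independent pair.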

Paper Structure

This paper contains 28 sections, 16 equations, 3 figures, 6 tables.

Figures (3)

  • Figure 1: Overview of the CDSSL framework. This figure highlights the novel categorization of dependencies into linear correlation and nonlinear dependence, further divided into sample-wise and feature-wise interactions. By addressing auto- and cross-dependence at both levels, CDSSL unifies existing SSL methods while introducing robust measures to enhance the quality of learned representations. This comprehensive framework bridges gaps in current methods, ensuring diversity and disentanglement in feature learning.
  • Figure 2: Optimal relative values of the regularization hyperparameters for the CDSSL loss function, obtained through grid search, on (a) MNIST with 10 classes (left figure) and (b) CIFAR-100 with 100 classes (right figure). The results highlight the varying importance of different loss terms based on the dataset's class complexity. Cross-correlation and cross-dependence of samples are more prominent in datasets with a larger number of classes.
  • Figure 3: Four UMAP visualizations of the learned embeddings for the MNIST dataset. CDSSL in (c) shows better class separation and more isotropic distributions compared to VICReg and Barlow Twins, highlighting its ability to capture discriminative representations.