Variance Covariance Regularization Enforces Pairwise Independence in Self-Supervised Representations
Grégoire Mialon, Randall Balestriero, Yann LeCun
TL;DR
This paper demonstrates that Variance-Covariance Regularization (VCReg) applied to the SSL projector enforces pairwise independence among encoder features, by connecting decorrelation of the projector's output to kernel-based independence criteria on the projector's input. It provides theoretical arguments and empirical evidence that wider, potentially random, MLP projectors yield stronger pairwise independence and that this property correlates with better out-of-domain generalization. The authors extend the VCReg viewpoint to related methods such as Barlow Twins and W-MSE, and show that VCReg with random projectors can solve linear ICA but not nonlinear ICA. This offers a new lens on the role of the projector in SSL and points to potential applications beyond SSL. Overall, the work supplies a theoretical foundation for the use of MLP projectors in SSL, proposes HSIC as a candidate model-selection metric, and suggests broader impacts in representation learning and ICA.
Abstract
Self-Supervised Learning (SSL) methods such as VICReg, Barlow Twins or W-MSE avoid collapse of their joint embedding architectures by constraining or regularizing the covariance matrix of their projector's output. This study highlights important properties of such a strategy, which we call Variance-Covariance Regularization (VCReg). More precisely, we show that VCReg combined with an MLP projector enforces pairwise independence between the features of the learned representation. This result emerges by bridging VCReg applied to the projector's output with kernel independence criteria applied to the projector's input. We validate our findings empirically: (i) we identify which projector characteristics favor pairwise independence, (ii) we show that pairwise independence is beneficial for out-of-domain generalization, and (iii) we show that the scope of VCReg goes beyond SSL by using it to solve Independent Component Analysis. This provides the first theoretical motivation for, and explanation of, MLP projectors in SSL.
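To make the regularizer concrete, here is a minimal NumPy sketch of a variance-covariance penalty on a batch of projector outputs. It assumes the VICReg-style form of the two terms (a hinge on each feature's standard deviation and a squared penalty on the off-diagonal covariance entries). The function name `vcreg_loss` and the parameters `gamma` and `eps` are illustrative choices, not taken from the paper, and the exact weighting used by the authors may differ.

```python
import numpy as np

def vcreg_loss(z, gamma=1.0, eps=1e-4):
    """Sketch of variance and covariance penalties on outputs z of shape (n, d).

    Assumption: VICReg-style terms; gamma and eps are illustrative defaults.
    """
    n, d = z.shape
    z = z - z.mean(axis=0, keepdims=True)  # center each feature
    # Variance term: hinge that is positive when a feature's std falls below gamma.
    std = np.sqrt(z.var(axis=0, ddof=1) + eps)
    var_loss = np.mean(np.maximum(0.0, gamma - std))
    # Covariance term: sum of squared off-diagonal sample covariances, scaled by d.
    cov = (z.T @ z) / (n - 1)
    off_diag = cov - np.diag(np.diag(cov))
    cov_loss = np.sum(off_diag ** 2) / d
    return var_loss, cov_loss

rng = np.random.default_rng(0)
x = rng.standard_normal((512, 8))        # roughly decorrelated features
mixed = x @ rng.standard_normal((8, 8))  # linearly mixed, hence correlated

_, cov_plain = vcreg_loss(x)
_, cov_mixed = vcreg_loss(mixed)
var_collapsed, cov_collapsed = vcreg_loss(np.zeros((16, 4)))  # collapsed output
```

In this toy setup the covariance term is much larger for the mixed features than for the unmixed ones, while a fully collapsed output is penalized by the variance term alone. Note that this penalty only measures linear decorrelation of whatever it is applied to; the paper's claim concerns what applying it after an MLP projector implies for the projector's input.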
