GUESS: Generative Uncertainty Ensemble for Self Supervision

Salman Mohamadi, Gianfranco Doretto, Donald A. Adjeroh

TL;DR

Self-supervised learning often enforces invariance to augmentations, but blind invariance can harm downstream performance. GUESS introduces data-derived uncertainty into both the loss and the architecture to enable data-dependent invariance, combining a generative uncertainty block with an ensemble framework and a pseudo-whitening objective. The method constructs cross-correlations from encoders ($C_1$) and autoencoders ($C_2$), builds a pseudo-whitening matrix $C = I + \beta C_2$, and optimizes $L_t = L_w + \alpha L_r$ with $L_w = \sum_i (1 - C_{ii})^2 + \beta \sum_{i \neq j} (C_{ij} - C_{1,ij})^2$ and $L_r$ as reconstruction losses, while offering a computationally efficient GUESS-m-E variant using auto-correlation. Across six datasets, GUESS-1 and GUESS-1-E set new baselines for linear and transfer evaluation, with larger ensembles providing further gains, and ablations confirming the value of uncertainty-driven invariance and pretraining of autoencoders for robust representation learning.
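The objective summarized above can be sketched in a few lines of NumPy. This is only a literal reading of the formulas as written in this summary, not the authors' implementation: the function names, the per-feature standardization used to form a cross-correlation matrix, and the choice to sum the reconstruction losses into a single $L_r$ are all assumptions made for illustration.

```python
import numpy as np


def cross_correlation(za, zb, eps=1e-8):
    """Cross-correlation of two (batch, dim) embedding batches.

    Assumption: each feature is standardized over the batch first,
    as is common in redundancy-reduction methods.
    """
    za = (za - za.mean(axis=0)) / (za.std(axis=0) + eps)
    zb = (zb - zb.mean(axis=0)) / (zb.std(axis=0) + eps)
    return za.T @ zb / za.shape[0]


def pseudo_whitening_loss(c, c1, beta):
    """L_w = sum_i (1 - C_ii)^2 + beta * sum_{i != j} (C_ij - C1_ij)^2."""
    on_diag = np.sum((1.0 - np.diag(c)) ** 2)
    off_mask = ~np.eye(c.shape[0], dtype=bool)  # True everywhere off the diagonal
    off_diag = np.sum((c - c1)[off_mask] ** 2)
    return on_diag + beta * off_diag


def total_loss(c1, c2, recon_losses, alpha, beta):
    """L_t = L_w + alpha * L_r, with C = I + beta * C2.

    Assumption: L_r is the plain sum of the given reconstruction losses.
    """
    c = np.eye(c2.shape[0]) + beta * c2
    return pseudo_whitening_loss(c, c1, beta) + alpha * sum(recon_losses)
```

Here `c1` would come from `cross_correlation` applied to the two encoder outputs and `c2` from the same function applied to the autoencoder outputs; `alpha` and `beta` are the weights named in the summary.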

Abstract

Self-supervised learning (SSL) frameworks consist of a pretext task and a loss function that aim to learn useful general features from unlabeled data. The basic idea of most SSL baselines revolves around enforcing invariance to a variety of data augmentations via the loss function. However, one main issue is that inattentive or deterministic enforcement of invariance to any kind of data augmentation is generally not only inefficient but also potentially detrimental to performance on downstream tasks. In this work, we investigate the issue from the viewpoint of uncertainty in invariance representation. Uncertainty representation is fairly under-explored in the design of SSL architectures as well as loss functions. We incorporate uncertainty representation in both the loss function and the architecture design, aiming for more data-dependent invariance enforcement. The former takes the form of data-derived uncertainty in the SSL loss function, resulting in a generative-discriminative loss function. The latter is achieved by feeding slightly different distorted versions of samples to the ensemble, with the aim of learning better and more robust representations. Specifically, building upon recent methods that use hard and soft whitening (a.k.a. redundancy reduction), we introduce a new approach, GUESS, a pseudo-whitening framework composed of controlled uncertainty injection, a new architecture, and a new loss function. We include detailed results and ablation analysis establishing GUESS as a new baseline.

Paper Structure

This paper contains 22 sections, 15 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: Schematic depiction of the proposed GUESS framework, with random allocation of augmented views to the blocks of ensemble-$M$. Each block gets its own set of augmented instances.
  • Figure 2: A more efficient GUESS framework. For efficiency, the building block of the ensemble is simplified as shown. Even with this design and a single autoencoder (GUESS-1-E), our method outperforms former baselines with no extra computational overhead (its computational complexity is similar to that of BarlowTwins).
  • Figure 3: Sensitivity to $\beta$ on CIFAR10. As shown, the top-1 linear accuracy ($\%$) is not very sensitive to $\beta$. Best: $\beta=0.01$.