Revisiting Theory of Contrastive Learning for Domain Generalization

Ali Alvandi; Mina Rezaei

TL;DR

This work extends the theoretical analysis of contrastive learning to cover both distribution shift within a shared label space and the introduction of novel label spaces (domain generalization). It formalizes a mean-classifier framework under a shifted downstream distribution and derives a bias term B(f) that captures representational misalignment, along with generalization bounds that include a finite-sample term Gen_M. The theory is complemented by empirical validation on CIFAR-10-C and PACS, showing that the mean-shift quantity predicted by B(f) correlates with downstream transfer performance. Overall, the paper provides a principled, bias-aware perspective on how contrastive representations transfer across domains and offers guidance for designing more robust self-supervised objectives.

Abstract

Contrastive learning is among the most popular and powerful approaches for self-supervised representation learning, where the goal is to map semantically similar samples close together while separating dissimilar ones in the latent space. Existing theoretical analyses assume that downstream task classes are drawn from the same latent class distribution used during the pretraining phase. However, in real-world settings, downstream tasks may not only exhibit distributional shifts within the same label space but also introduce new or broader label spaces, leading to domain generalization challenges. In this work, we introduce novel generalization bounds that explicitly account for both types of mismatch: domain shift and domain generalization. Specifically, we analyze scenarios where downstream tasks either (i) draw classes from the same latent class space but with shifted distributions, or (ii) involve new label spaces beyond those seen during pretraining. Our analysis reveals how the performance of contrastively learned representations depends on the statistical discrepancy between pretraining and downstream distributions. This extended perspective allows us to derive provable guarantees on the performance of learned representations on average classification tasks involving class distributions outside the pretraining latent class set.
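
To make the mean-classifier framework mentioned in the TL;DR more concrete, the following is a minimal sketch in plain Python. It assumes the usual construction in which each class is represented by the average of its embeddings and a point is assigned to the class whose mean has the largest inner product with it. The function names, the toy embeddings, and the use of a Euclidean gap between pretraining and downstream class means are all illustrative assumptions; in particular, the gap is only a rough proxy for a mean-shift quantity and is not the paper's definition of B(f), which is not reproduced here.

```python
from collections import defaultdict


def class_means(embeddings, labels):
    """Average the embedding vectors of each class."""
    sums, counts = {}, defaultdict(int)
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def mean_classifier_predict(means, vec):
    """Predict the class whose mean has the largest inner product with vec."""
    return max(means, key=lambda c: sum(m * v for m, v in zip(means[c], vec)))


def mean_shift(source_means, target_means):
    """Euclidean distance between source and target means, per shared class.

    Hypothetical proxy for representational misalignment, not the paper's B(f).
    """
    shared = source_means.keys() & target_means.keys()
    return {
        c: sum((s - t) ** 2 for s, t in zip(source_means[c], target_means[c])) ** 0.5
        for c in shared
    }


# Toy 2-D embeddings standing in for f(x); all values are made up.
src_emb = [[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.2, 0.8]]
src_lab = ["a", "a", "b", "b"]
tgt_emb = [[0.6, 0.4], [0.4, 0.6]]
tgt_lab = ["a", "b"]

src_means = class_means(src_emb, src_lab)
tgt_means = class_means(tgt_emb, tgt_lab)
shift = mean_shift(src_means, tgt_means)
prediction = mean_classifier_predict(src_means, [0.9, 0.1])
```

In this toy setup, a larger per-class gap means the downstream embeddings sit farther from the pretraining class means, which is the kind of misalignment a bias-aware bound would be expected to penalize.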
