Contrastive Learning with Nasty Noise

Ziruo Zhao

TL;DR

The paper analyzes the theoretical limits of contrastive learning under nasty noise using the classical PAC framework and VC-dimension theory. It derives tight lower bounds on sample complexity for both arbitrary distance functions and $\ell_p$ distances, and shows that adversarial sample modification at rate $\eta$ fundamentally limits achievable accuracy, including a baseline impossibility result for accuracy $\epsilon<2\eta$. It also provides matching upper bounds in the classical PAC setting, with refined distance-specific bounds for even and odd $p$ and for constant dimensionality. These results are extended to nasty-noise scenarios through $\Delta$-scaled bounds $n(\epsilon,\delta,\Delta)$ and data-dependent analyses based on $L_{con}$ and $\hat{L}_{con}$, including a simplification for the binary case. Overall, the results quantify robustness limits and offer data-dependent tools for bounding generalization under adversarial perturbations in contrastive representations.

Abstract

Contrastive learning has emerged as a powerful paradigm for self-supervised representation learning. This work analyzes the theoretical limits of contrastive learning under nasty noise, where an adversary modifies or replaces training samples. Using PAC learning and VC-dimension analysis, lower and upper bounds on sample complexity in adversarial settings are established. Additionally, data-dependent sample complexity bounds based on the $\ell_2$ distance function are derived.
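To make the noise model concrete, the following is a minimal sketch and not the paper's construction. It assumes that training samples are labeled triplets (anchor, candidate, candidate), that a label records which candidate is closer to the anchor under the $\ell_2$ distance, and that the number of corrupted samples is drawn as Binomial$(n, \eta)$, as in the classical nasty-noise model. The function names `label_triplet` and `nasty_corrupt` are hypothetical, and the adversary here is a simple stand-in that flips the labels of the first few samples, whereas a real nasty adversary may choose which samples to corrupt and replace them arbitrarily.

```python
import random


def l2_sq(u, v):
    """Squared l2 distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def label_triplet(x, y, z):
    """Return +1 if x is closer to y than to z under l2, else -1 (assumed labeling)."""
    return 1 if l2_sq(x, y) < l2_sq(x, z) else -1


def nasty_corrupt(samples, eta, rng):
    """Corrupt a Binomial(n, eta) number of labeled triplets.

    Stand-in adversary: flips the labels of the first `num_corrupt` samples.
    A true nasty adversary could pick any samples and any replacements.
    """
    n = len(samples)
    # Draw the corruption count as a sum of n Bernoulli(eta) trials.
    num_corrupt = sum(rng.random() < eta for _ in range(n))
    corrupted = list(samples)
    for i in range(num_corrupt):
        triplet, label = corrupted[i]
        corrupted[i] = (triplet, -label)
    return corrupted, num_corrupt


if __name__ == "__main__":
    rng = random.Random(0)
    points = [(rng.random(), rng.random()) for _ in range(30)]
    triplets = [tuple(rng.sample(points, 3)) for _ in range(100)]
    clean = [(t, label_triplet(*t)) for t in triplets]
    noisy, k = nasty_corrupt(clean, eta=0.1, rng=rng)
    flipped = sum(c[1] != d[1] for c, d in zip(clean, noisy))
    print(f"corrupted {k} of {len(clean)} samples; {flipped} labels differ")
```

On average a fraction $\eta$ of the sample is under the adversary's control, which is the quantity the accuracy limits summarized above are stated in terms of.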
