Supervised Contrastive Representation Learning: Landscape Analysis with Unconstrained Features

Tina Behnia; Christos Thrampoulidis

TL;DR

This work analyzes supervised contrastive (SC) representation learning under the unconstrained features model (UFM), showing that neural-collapse-like geometry emerges at local optima and that the optimization landscape is benign when the embedding dimension satisfies $d > k$. By reformulating SC as a convex relaxation on the Gram matrix $G=H^\top H$, the authors prove that all local minima are global and that any two global minimizers share the same implicit geometry up to rotation, with a unique global solution in the convex program. They further characterize global solutions under label imbalance: for STEP-imbalanced data, the global optimum has a structured block form, while in the balanced case the optimal geometry reduces to a simplex equiangular tight frame (ETF). These results provide a theoretical foundation for SC-based representation learning in over-parameterized networks and offer insight into how class imbalance shapes embedding geometry, motivating future work on optimization dynamics and broader data regimes.
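To make the balanced-case geometry concrete, the following is a minimal NumPy sketch of what a collapsed simplex-ETF solution looks like at the level of the Gram matrix $G=H^\top H$. The function names (`simplex_etf`, `collapsed_gram`), the random-subspace construction, and the toy sizes are illustrative assumptions, not code from the paper.

```python
import numpy as np

def simplex_etf(k, d, seed=0):
    """Return a d x k matrix whose unit-norm columns form a simplex ETF.

    Assumption of this sketch: d >= k, so a k-dimensional subspace fits in R^d.
    """
    if d < k:
        raise ValueError("this sketch assumes d >= k")
    rng = np.random.default_rng(seed)
    # Orthonormal basis (d x k) of a random k-dimensional subspace of R^d.
    basis, _ = np.linalg.qr(rng.standard_normal((d, k)))
    # Centering projector I - (1/k) 1 1^T, rescaled so every column has norm 1.
    centering = np.eye(k) - np.ones((k, k)) / k
    return np.sqrt(k / (k - 1)) * basis @ centering

def collapsed_gram(labels, k, d):
    """Gram matrix of features that have collapsed onto their class ETF vector."""
    class_means = simplex_etf(k, d)
    H = class_means[:, labels]      # d x n, one column per training sample
    return H.T @ H                  # n x n Gram matrix

# Toy balanced dataset: k = 3 classes, 2 samples each, embedding dimension d = 5.
labels = np.array([0, 0, 1, 1, 2, 2])
G = collapsed_gram(labels, k=3, d=5)
print(np.round(G, 3))
```

In this sketch every same-class entry of `G` equals 1 and every cross-class entry equals $-1/(k-1)$, which is the equal-norm, equal-angle pattern that the term "simplex ETF" refers to.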

Abstract

Recent findings reveal that over-parameterized deep neural networks, trained beyond zero training error, exhibit a distinctive structural pattern at the final layer, termed neural collapse (NC). These results indicate that the final hidden-layer outputs in such networks display minimal within-class variations over the training set. While existing research extensively investigates this phenomenon under cross-entropy loss, there are fewer studies focusing on its contrastive counterpart, supervised contrastive (SC) loss. Through the lens of NC, this paper employs an analytical approach to study the solutions derived from optimizing the SC loss. We adopt the unconstrained features model (UFM) as a representative proxy for unveiling NC-related phenomena in sufficiently over-parameterized deep networks. We show that, despite the non-convexity of SC loss minimization, all local minima are global minima. Furthermore, the minimizer is unique (up to a rotation). We prove our results by formalizing a tight convex relaxation of the UFM. Finally, through this convex formulation, we delve deeper into characterizing the properties of global solutions under label-imbalanced training data.
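As a rough illustration of why uniqueness can only hold "up to a rotation", the sketch below evaluates an SC-style loss directly on free feature vectors, as the UFM does, and checks that rotating all features leaves the value unchanged. It assumes the commonly used supervised contrastive form with unit-norm features, temperature 1, full-batch evaluation, and at least two samples per class; the paper's exact normalization may differ, and `sc_loss` is a hypothetical helper name.

```python
import numpy as np

def sc_loss(H, labels, temperature=1.0):
    """SC-style loss on free features H (d x n); assumes >= 2 samples per class."""
    H = H / np.linalg.norm(H, axis=0, keepdims=True)   # unit-norm columns
    logits = (H.T @ H) / temperature                    # depends on H only via its Gram matrix
    np.fill_diagonal(logits, -np.inf)                   # an anchor is never its own pair
    # Numerically stable log-softmax over all other samples l != i.
    row_max = logits.max(axis=1, keepdims=True)
    log_norm = row_max + np.log(np.exp(logits - row_max).sum(axis=1, keepdims=True))
    log_prob = logits - log_norm
    # Positives: same label, excluding the anchor itself.
    positives = labels[:, None] == labels[None, :]
    np.fill_diagonal(positives, False)
    per_anchor = -np.where(positives, log_prob, 0.0).sum(axis=1) / positives.sum(axis=1)
    return per_anchor.mean()

rng = np.random.default_rng(0)
labels = np.array([0, 0, 1, 1, 2, 2])
H = rng.standard_normal((5, 6))                     # free features, d = 5, n = 6
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))    # random orthogonal matrix

loss = sc_loss(H, labels)
loss_rotated = sc_loss(Q @ H, labels)
print(loss, loss_rotated)
```

Because the loss in this sketch sees the features only through inner products, `H` and `Q @ H` give the same value for any orthogonal `Q`, which is why a minimizer can at best be unique as a Gram matrix.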

Paper Structure

This paper contains 15 sections, 12 theorems, 43 equations.

Introduction
Related Works
Contributions
Problem Setup
Unconstrained Features Model (UFM)
Practices in Contrastive Learning
Results
NC Holds at Local Optimal Solutions
UFM: Landscape and Optimality Conditions
Proof Sketch
Global Solutions
Discussion
Proof of Lemma \ref{lem:stationary_NC}
Proof of Lemma \ref{lem:unique_main}
Proof of Lemma \ref{lem:scl_symm_global}

Key Result

Proposition 1

Consider the optimization problem $\min_{\mathbf{x}\in\mathcal{C}} f(\mathbf{x})$ over a convex set $\mathcal{C}$. Let $\widehat{\mathbf{x}}\in\mathcal{C}$ be a local minimum. Then,
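The statement is cut off in this excerpt. For orientation only, the textbook first-order necessary condition for a differentiable $f$ over a convex set $\mathcal{C}$ (the standard form found in nonlinear programming texts such as the Bertsekas reference this proposition is attributed to) is shown below. That the paper states the proposition in exactly this form is an assumption.

$$
\nabla f(\widehat{\mathbf{x}})^\top(\mathbf{x}-\widehat{\mathbf{x}}) \;\geq\; 0 \quad \text{for all } \mathbf{x}\in\mathcal{C}.
$$

Read informally: no feasible direction from $\widehat{\mathbf{x}}$ into $\mathcal{C}$ is a first-order descent direction. When $f$ is additionally convex, this condition is also sufficient for global optimality.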

Theorems & Definitions (20)

Remark 1
Remark 2
Definition 1: Neural-collapse (NC) property
Proposition 1: Necessary stationarity condition (Bertsekas, 1999)
Lemma 1
Theorem 1: UFM landscape with SC loss
Corollary 1
Lemma 2
Definition 2: $(R,\rho)$-STEP imbalanced data
Lemma 3
...and 10 more
