Neural collapse with unconstrained features

Dustin G. Mixon; Hans Parshall; Jianzong Pi

TL;DR

The paper asks why neural collapse emerges during training and proposes an unconstrained-features framework in which the features of the training samples are treated as explicit columns of a matrix H ∈ R^{p×CN}. A gradient-flow analysis identifies an invariant subspace S along which the dynamics move toward a strong neural-collapse state, characterized by WW^T = √N(I_C − (1/C)11^T), H = (1/√N)(W⊗1_N)^T, and b = (1/C)1_C, which satisfies NC1–NC4. The empirical risk decomposes on S, and the flow converges to the strong-collapse configuration under mild conditions on the initial W, which gives an optimization-geometry explanation for neural collapse. The work clarifies how the optimization landscape induces symmetric class-mean geometry and suggests directions for extending the theory to generalization and other optimization regimes.
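The strong-collapse equations above can be checked numerically. The following is a minimal NumPy sketch, not the authors' code: it assumes p = C, builds one particular W from an eigendecomposition of the target Gram matrix, and assumes the columns of H are grouped by class (which is what the Kronecker product produces). The function names `strong_collapse` and `check_collapse` are hypothetical. The checks use the usual reading of NC1–NC4: zero within-class variability, class means forming a simplex equiangular tight frame, classifier rows proportional to class means, and agreement with the nearest-class-mean rule.

```python
import numpy as np


def strong_collapse(C, N):
    """Build one (W, H, b) satisfying the strong-collapse equations, with p = C (an assumption)."""
    # Target Gram matrix: W W^T = sqrt(N) * (I_C - (1/C) 1 1^T).
    M = np.sqrt(N) * (np.eye(C) - np.ones((C, C)) / C)
    evals, evecs = np.linalg.eigh(M)
    evals = np.clip(evals, 0.0, None)  # remove tiny negative round-off
    W = evecs * np.sqrt(evals)  # scales eigenvector columns, so W @ W.T == M
    # H = (1/sqrt(N)) (W kron 1_N)^T has shape p x CN, columns grouped by class.
    H = np.kron(W, np.ones((N, 1))).T / np.sqrt(N)
    b = np.ones(C) / C
    return W, H, b


def check_collapse(W, H, b, C, N, tol=1e-10):
    """Return a dict of booleans, one per neural-collapse property."""
    labels = np.repeat(np.arange(C), N)
    feats = H.T.reshape(C, N, -1)  # (class, sample, feature)
    means = feats.mean(axis=1)  # class means, C x p
    centered = means - means.mean(axis=0)  # subtract the global mean

    # NC1: every feature equals its class mean.
    nc1 = np.abs(feats - means[:, None, :]).max() < tol

    # NC2: centered class means have the Gram matrix of a simplex ETF.
    etf = (np.eye(C) - np.ones((C, C)) / C) / np.sqrt(N)
    nc2 = np.allclose(centered @ centered.T, etf)

    # NC3: classifier rows are a positive multiple of the centered class means.
    nc3 = np.allclose(W, np.sqrt(N) * centered)

    # NC4: the linear classifier agrees with the nearest-class-mean rule.
    linear_pred = np.argmax(W @ H + b[:, None], axis=0)
    dists = ((H.T[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    nearest_pred = np.argmin(dists, axis=1)
    nc4 = bool(np.all(linear_pred == nearest_pred) and np.all(linear_pred == labels))

    return {"NC1": bool(nc1), "NC2": bool(nc2), "NC3": bool(nc3), "NC4": nc4}


if __name__ == "__main__":
    C, N = 4, 5
    W, H, b = strong_collapse(C, N)
    print(check_collapse(W, H, b, C, N))
```

Any W with the same Gram matrix would serve equally well; the eigendecomposition is just a convenient way to produce one. In this configuration the outputs WH + b1^T also coincide with the one-hot labels, since each class-i column of WH equals e_i − (1/C)1_C.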

Abstract

Neural collapse is an emergent phenomenon in deep learning that was recently discovered by Papyan, Han and Donoho. We propose a simple "unconstrained features model" in which neural collapse also emerges empirically. By studying this model, we provide some explanation for the emergence of neural collapse in terms of the landscape of empirical risk.
