The Implicit Bias of Logit Regularization
Alon Beck, Yohai Bar Sinai, Noam Levi
TL;DR
This work analyzes convex logit regularization, including label smoothing, in linear classifiers and shows that such penalties induce logit clustering around finite targets $z^*$. For Gaussian data, or when per-sample losses are quadratic, the optimal weight direction aligns with Fisher's Linear Discriminant, $\boldsymbol{S}\propto \Sigma^{-1}\boldsymbol{\mu}$, and generalization performance becomes largely insensitive to the exact form of the regularizer. In noiseless-feature regimes the interpolation threshold shifts to $\lambda_c=1$, weak regularization produces grokking dynamics, and optimal generalization is proven to be invariant to the orthogonal noise scale $\sigma_n$. Experiments on Gaussian data and on neural-network penultimate embeddings support the theory and link soft-target regularization to classical discriminant geometry, highlighting the efficacy of logit-regularization methods beyond label smoothing.
Abstract
Logit regularization, the addition of a convex penalty directly in logit space, is widely used in modern classifiers, with label smoothing as a prominent example. While such methods often improve calibration and generalization, their mechanism remains under-explored. In this work, we analyze a general class of such logit regularizers in the context of linear classification, and demonstrate that they induce an implicit bias of logit clustering around finite per-sample targets. For Gaussian data, or whenever logits are sufficiently clustered, we prove that logit clustering drives the weight vector to align exactly with Fisher's Linear Discriminant. To demonstrate the consequences, we study a simple signal-plus-noise model in which this transition has dramatic effects: logit regularization halves the critical sample complexity and induces grokking in the small-noise limit, while making generalization robust to noise. Our results extend the theoretical understanding of label smoothing and highlight the efficacy of a broader class of logit-regularization methods.
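The two central claims, a finite logit target and alignment with Fisher's Linear Discriminant, can be illustrated with a small numerical sketch. It is not the authors' code: it assumes binary labels $y\in\{-1,+1\}$, label smoothing with a hypothetical smoothing level `alpha` (the true class receives target probability $1-\alpha$), and it stands in for the quadratic-loss regime with a plain least-squares fit of the logits to $\pm z^*$. All function and variable names are illustrative. Under these assumptions the per-sample smoothed loss is minimized at the finite margin $z^*=\log\frac{1-\alpha}{\alpha}$, and the least-squares weights point along the empirical Fisher direction.

```python
import numpy as np

def smoothed_bce(margin, alpha):
    """Label-smoothed binary cross-entropy as a function of the margin y*z.

    Assumption: the true class gets target probability 1 - alpha,
    the other class gets alpha.
    """
    # logaddexp(0, -m) = log(1 + exp(-m)), evaluated stably.
    return (1 - alpha) * np.logaddexp(0.0, -margin) + alpha * np.logaddexp(0.0, margin)

def finite_target(alpha):
    """Closed-form minimizer of smoothed_bce over the margin."""
    return np.log((1 - alpha) / alpha)

def fisher_direction(X, y):
    """Empirical Fisher direction: within-class scatter^{-1} (m_pos - m_neg)."""
    m_pos = X[y == 1].mean(axis=0)
    m_neg = X[y == -1].mean(axis=0)
    Xc = np.where((y == 1)[:, None], X - m_pos, X - m_neg)
    S_w = Xc.T @ Xc
    return np.linalg.solve(S_w, m_pos - m_neg)

def least_squares_direction(X, y, z_star):
    """Weights of a linear model (with intercept) fit to logit targets y * z_star."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y * z_star, rcond=None)
    return coef[:-1]  # drop the intercept

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Synthetic two-class Gaussian data with a shared, non-isotropic covariance.
rng = np.random.default_rng(0)
n, d, alpha = 2000, 5, 0.1
mu = rng.normal(size=d)
L = rng.normal(size=(d, d))  # mixing matrix; noise covariance is L @ L.T
y = rng.choice([-1, 1], size=n)
X = y[:, None] * mu + rng.normal(size=(n, d)) @ L.T

# 1) Finite target: compare the closed form with a grid search over margins.
z_star = finite_target(alpha)
grid = np.linspace(-10.0, 10.0, 20001)
z_grid = grid[np.argmin(smoothed_bce(grid, alpha))]

# 2) Quadratic surrogate: least-squares weights vs. the Fisher direction.
align = cosine(least_squares_direction(X, y, z_star), fisher_direction(X, y))

print("closed-form z*:", z_star, "grid minimizer:", z_grid)
print("cosine(least squares, Fisher):", align)
```

The least-squares step is the classical equivalence between regression on class targets and Fisher's discriminant, so the alignment here holds for the empirical statistics of any dataset. The Gaussian case in the abstract concerns the regularized classification loss itself, which this sketch does not train.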
