Latent Point Collapse on a Low Dimensional Embedding in Deep Neural Network Classifiers
Luigi Sbailò, Luca Ghiringhelli
TL;DR
This work targets robust, discriminative latent representations in deep classifiers by introducing latent point collapse (LPC): a regularization that adds a strong $L_2$ penalty, weighted by a coefficient $\gamma$, on the penultimate-layer latent vector $\mathbf{z}$ to the usual cross-entropy objective, creating a push-pull tension with the classification loss. As $\gamma$ grows, latent representations from the same class converge to a single point on a fixed-radius shell, which enforces Lipschitz continuity, improves feature separability, and substantially increases robustness to input perturbations. The approach is demonstrated with low-dimensional linear penultimate layers, where it produces binary-like latent encodings and convergence toward a neural-collapse-like geometry, while remaining compatible with margin-based losses and information bottleneck (IB) interpretations. Overall, LPC is a simple regularizer that requires minimal architectural changes and can be combined with existing regularizers for further gains.
Abstract
The configuration of latent representations plays a critical role in determining the performance of deep neural network classifiers. In particular, the emergence of well-separated class embeddings in the latent space has been shown to improve both generalization and robustness. In this paper, we propose a method to induce the collapse of latent representations belonging to the same class into a single point, which enhances class separability in the latent space while enforcing Lipschitz continuity in the network. We demonstrate that this phenomenon, which we call \textit{latent point collapse}, is achieved by adding a strong $L_2$ penalty on the penultimate-layer representations and results from a push-pull tension between this penalty and the cross-entropy loss function. In addition, we show the practical utility of applying this compressing loss term to the latent representations of a low-dimensional linear penultimate layer. The proposed approach is straightforward to implement and yields substantial improvements in discriminative feature embeddings, along with marked gains in robustness to input perturbations.
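To make the push-pull tension concrete, the following is a minimal sketch of a loss of this kind, not the authors' implementation. It assumes the penalty is the squared $L_2$ norm of the penultimate-layer latent vector averaged over the batch and scaled by a weight `gamma`; the function names, the toy classifier weights, and the value of `gamma` are all hypothetical and chosen only for illustration.

```python
import math

def cross_entropy(logits, label):
    """Cross-entropy of one sample from raw logits (log-sum-exp for stability)."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_norm - logits[label]

def linear_head(z, weights, biases):
    """Final linear classifier: logits[k] = <weights[k], z> + biases[k]."""
    return [sum(w_i * z_i for w_i, z_i in zip(w, z)) + b
            for w, b in zip(weights, biases)]

def lpc_loss(latents, labels, weights, biases, gamma):
    """Return (total, ce, penalty) for a batch of latent vectors.

    ASSUMPTION: penalty = mean squared L2 norm of the latents, and
    total = ce + gamma * penalty. The exact form is not given in the abstract.
    """
    n = len(latents)
    ce = sum(cross_entropy(linear_head(z, weights, biases), y)
             for z, y in zip(latents, labels)) / n
    penalty = sum(sum(v * v for v in z) for z in latents) / n
    return ce + gamma * penalty, ce, penalty

# Toy setup (hypothetical): two classes, 2-D latents, a fixed linear head.
weights = [[1.0, 0.0], [-1.0, 0.0]]
biases = [0.0, 0.0]
labels = [0, 1]
gamma = 0.5

far = [[2.0, 0.0], [-2.0, 0.0]]    # latents far from the origin
near = [[1.0, 0.0], [-1.0, 0.0]]   # the same latents pulled toward the origin

total_far, ce_far, pen_far = lpc_loss(far, labels, weights, biases, gamma)
total_near, ce_near, pen_near = lpc_loss(near, labels, weights, biases, gamma)
```

In this toy case, shrinking the latents lowers the penalty term but raises the cross-entropy, because the logits become less confident. The penalty pulls representations toward the origin while the classification loss pushes them apart, which is the tension the abstract describes.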
