Geometric Properties of Neural Multivariate Regression
George Andriopoulos, Zixuan Dong, Bimarsha Adhikari, Keith Ross
TL;DR
The paper asks why collapse of last-layer features, which benefits generalization in classification (neural collapse), degrades it in regression (Neural Regression Collapse, NRC). It proposes intrinsic dimension, specifically $ID_H$ for last-layer features and $ID_Y$ for targets, as a finer geometric lens, and employs the 2-NN intrinsic-dimension estimator alongside the NRC1 metric. Collapsed models have $ID_H < ID_Y$, which leads to over-compression and poor generalization, while non-collapsed models typically satisfy $ID_H > ID_Y$ and generalize in a way that depends on data quantity and noise level. The authors give a geometric argument based on Sard's theorem and identify two practical regimes (over-compressed and under-compressed) that determine when expanding or reducing feature dimensionality improves performance. These insights yield actionable guidelines for improving generalization in applied neural regression and establish intrinsic dimension as a principled diagnostic for regression representations.
Abstract
Neural multivariate regression underpins a wide range of domains such as control, robotics, and finance, yet the geometry of its learned representations remains poorly characterized. While neural collapse has been shown to benefit generalization in classification, we find that analogous collapse in regression consistently degrades performance. To explain this contrast, we analyze models through the lens of intrinsic dimension. Across control tasks and synthetic datasets, we estimate the intrinsic dimension of last-layer features ($ID_H$) and compare it with that of the regression targets ($ID_Y$). Collapsed models exhibit $ID_H < ID_Y$, leading to over-compression and poor generalization, whereas non-collapsed models typically maintain $ID_H > ID_Y$. For the non-collapsed models, performance with respect to $ID_H$ depends on the data quantity and noise levels. From these observations, we identify two regimes (over-compressed and under-compressed) that determine when expanding or reducing feature dimensionality improves performance. Our results provide new geometric insights into neural regression and suggest practical strategies for enhancing generalization.
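The comparison between the intrinsic dimension of features and that of targets can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a maximum-likelihood variant of a two-nearest-neighbor (TwoNN-style) estimator, where the ratio of second to first neighbor distance is treated as Pareto-distributed with the intrinsic dimension as its exponent. The function names `two_nn_id` and `is_over_compressed`, and the synthetic arrays in the demo, are hypothetical and chosen only for illustration.

```python
import numpy as np


def two_nn_id(x: np.ndarray) -> float:
    """Estimate intrinsic dimension from nearest-neighbor distance ratios.

    Assumption: maximum-likelihood form d = N / sum(log(r2 / r1)), where r1 and
    r2 are each point's distances to its first and second nearest neighbors.
    """
    x = np.asarray(x, dtype=np.float64)
    n = x.shape[0]
    log_mu = []
    for i in range(n):
        # Euclidean distances from point i to every point, excluding itself.
        dist = np.linalg.norm(x - x[i], axis=1)
        dist[i] = np.inf
        # After partitioning at index 1, the two smallest values sit in
        # positions 0 and 1, in increasing order.
        r1, r2 = np.partition(dist, 1)[:2]
        if r1 > 0.0:  # skip exact duplicates, whose ratio is undefined
            log_mu.append(np.log(r2 / r1))
    return len(log_mu) / float(np.sum(log_mu))


def is_over_compressed(features: np.ndarray, targets: np.ndarray) -> bool:
    """Hypothetical check for the condition ID_H < ID_Y."""
    return two_nn_id(features) < two_nn_id(targets)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    latent = rng.uniform(size=(1000, 5))

    # Synthetic targets that vary along 3 latent directions, embedded in 6-D.
    y = latent[:, :3] @ rng.normal(size=(3, 6))
    # Synthetic "collapsed-like" features that keep only 1 latent direction.
    h_narrow = latent[:, :1] @ rng.normal(size=(1, 32))
    # Synthetic features that keep all 5 latent directions.
    h_wide = latent @ rng.normal(size=(5, 32))

    print("ID_Y          :", two_nn_id(y))
    print("ID_H (narrow) :", two_nn_id(h_narrow))
    print("ID_H (wide)   :", two_nn_id(h_wide))
    print("narrow over-compressed:", is_over_compressed(h_narrow, y))
    print("wide over-compressed  :", is_over_compressed(h_wide, y))
```

The per-point loop keeps memory use low at the cost of quadratic time, which is acceptable for a few thousand samples. For larger feature sets, a k-d tree or approximate nearest-neighbor search would be a natural substitute. Estimates from this kind of estimator are noisy, so a comparison of $ID_H$ and $ID_Y$ is best read as a trend across runs rather than from a single value.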
