The Interpolating Information Criterion for Overparameterized Models
Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney
TL;DR
This work addresses model selection in overparameterized settings where interpolating estimators exist and traditional information criteria fail. It develops the Interpolating Information Criterion (IIC) by establishing a Bayesian duality between over- and underparameterized representations and applying a Laplace-type analysis on the interpolating manifold via the coarea formula. The IIC combines a regularization term that reflects prior misspecification, a sharpness term tied to the Jacobian, a curvature term comparing ambient and manifold curvature, and a data-size correction; it reduces to closed forms in linear regression. Empirical results across linear, gamma, polynomial, and diagonal neural-network models show that the IIC correlates with predictive losses and helps explain double-descent phenomena, providing a principled, prior-aware, non-asymptotic model selection tool for interpolating regimes.
Abstract
The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit, penalizing model size. However, these criteria are not appropriate in modern settings where overparameterized models tend to perform well. For any overparameterized model, we show that there exists a dual underparameterized model that possesses the same marginal likelihood, thus establishing a form of Bayesian duality. This enables more classical methods to be used in the overparameterized setting, revealing the Interpolating Information Criterion, a measure of model quality that naturally incorporates the choice of prior into the model selection. Our new information criterion accounts for prior misspecification and for geometric and spectral properties of the model, and it is numerically consistent with known empirical and theoretical behavior in this regime.
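
As a rough illustration of the duality idea, and not the paper's general construction or its IIC formula, consider the simplest interpolating case, assumed here for concreteness: a noiseless linear model y = X theta with n < d, a full-row-rank design X, and an isotropic Gaussian prior theta ~ N(0, tau^2 I). The marginal density of y is then an n-dimensional Gaussian with covariance tau^2 X X^T. Its logarithm can be rewritten using the minimum-norm interpolator theta_hat = X^T (X X^T)^{-1} y as -(n/2) log(2 pi tau^2) - (1/2) log det(X X^T) - ||theta_hat||^2 / (2 tau^2), which shows a norm term (prior fit) and a log-determinant term (a sharpness-like quantity) side by side. The pure-Python sketch below checks this identity numerically. All function names and the toy data are hypothetical.

```python
import math

def gram(X):
    # Gram matrix X X^T (n x n) of the rows of X.
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in X] for ri in X]

def cholesky(S):
    # Lower-triangular L with L L^T = S, for symmetric positive definite S.
    n = len(S)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = S[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def logdet_spd(S):
    # log det(S) from the Cholesky factor.
    L = cholesky(S)
    return 2.0 * sum(math.log(L[i][i]) for i in range(len(S)))

def solve_spd(S, b):
    # Solve S x = b by forward and back substitution on the Cholesky factor.
    L = cholesky(S)
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def log_marginal_direct(X, y, tau2):
    # log N(y; 0, tau2 * X X^T), evaluated as an n-dimensional Gaussian density.
    n = len(y)
    S = [[tau2 * g for g in row] for row in gram(X)]
    alpha = solve_spd(S, y)
    quad = sum(yi * ai for yi, ai in zip(y, alpha))
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet_spd(S) + quad)

def min_norm_interpolator(X, y):
    # theta_hat = X^T (X X^T)^{-1} y, the minimum-norm solution of X theta = y.
    alpha = solve_spd(gram(X), y)
    d = len(X[0])
    return [sum(X[i][j] * alpha[i] for i in range(len(X))) for j in range(d)]

def log_marginal_via_interpolator(X, y, tau2):
    # Same quantity, written with the interpolator norm and log det(X X^T).
    n = len(y)
    theta_hat = min_norm_interpolator(X, y)
    norm2 = sum(t * t for t in theta_hat)
    return (-0.5 * n * math.log(2.0 * math.pi * tau2)
            - 0.5 * logdet_spd(gram(X))
            - norm2 / (2.0 * tau2))

# Toy data (hypothetical): n = 2 observations, d = 4 parameters.
X = [[1.0, 2.0, 0.0, -1.0],
     [0.5, -1.0, 3.0, 2.0]]
y = [1.0, -2.0]
print(log_marginal_direct(X, y, 0.7), log_marginal_via_interpolator(X, y, 0.7))
```

The two printed values should agree up to floating-point error. In this special case the d-parameter model is evaluated entirely through n-dimensional quantities, which is the flavor of the duality described in the abstract.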
