Loss-Complexity Landscape and Model Structure Functions
Alexander Kolpakov
TL;DR
The paper tackles the uncomputability of the Kolmogorov structure function by introducing computable complexity proxies and casting the loss–complexity trade-off as a free-energy optimization problem. It develops a Legendre–Fenchel dual framework, connects it to a statistical-mechanics partition function, and uses Metropolis–Hastings sampling with simulated annealing to approximate the structure function and its dual. An information–scattering analogy and a susceptibility-based resonance analysis predict phase-transition-like elbows in model selection, which mark critical loss–complexity trade-offs. Numerical experiments across linear, tree-based, and deep neural network models support the theory and demonstrate practical model-selection pathways using Bayesian optimizers. The work offers a rigorous, computable lens on generalization and overfitting that can inform principled hyperparameter tuning and architecture choices, with accessible code and reproducible experiments.
Abstract
We develop a framework for dualizing the Kolmogorov structure function $h_x(\alpha)$, which then allows the use of computable complexity proxies. We establish a mathematical analogy between information-theoretic constructs and statistical mechanics, introducing a suitable partition function and free energy functional. We explicitly prove the Legendre-Fenchel duality between the structure function and the free energy, show detailed balance of the Metropolis kernel, and interpret acceptance probabilities as information-theoretic scattering amplitudes. A susceptibility-like variance of model complexity is shown to peak precisely at loss-complexity trade-offs, which we interpret as phase transitions. Practical experiments with linear and tree-based regression models verify these theoretical predictions, explicitly demonstrating the interplay among model complexity, generalization, and the overfitting threshold.
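As a rough illustration of the sampling idea in the abstract, the following Python sketch runs a Metropolis chain over feature subsets of a toy linear regression. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic data, the active-feature count used as the complexity proxy C, the energy E = L + lam * C, the single-bit-flip proposal, and the fixed inverse temperature beta (no annealing schedule). The variance of C along the chain stands in for the susceptibility-like quantity.

```python
import numpy as np


def make_data(n=200, d=12, k=4, noise=0.5, seed=0):
    """Synthetic regression data (hypothetical): only the first k features matter."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w = np.zeros(d)
    w[:k] = rng.uniform(1.0, 2.0, size=k)
    y = X @ w + noise * rng.normal(size=n)
    return X, y


def loss(mask, X, y):
    """Mean squared error of the least-squares fit on the active features."""
    if not mask.any():
        return float(np.mean(y ** 2))
    coef, *_ = np.linalg.lstsq(X[:, mask], y, rcond=None)
    resid = y - X[:, mask] @ coef
    return float(np.mean(resid ** 2))


def energy(mask, X, y, lam):
    """Assumed energy: loss plus lam times the complexity proxy (feature count)."""
    return loss(mask, X, y) + lam * int(mask.sum())


def metropolis(X, y, lam, beta, steps=2000, seed=0):
    """Metropolis chain targeting weights proportional to exp(-beta * energy).

    Returns the mean and variance of the complexity proxy over the second
    half of the chain, plus the overall acceptance rate.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    mask = rng.random(d) < 0.5          # random initial feature subset
    e = energy(mask, X, y, lam)
    complexities = []
    accepted = 0
    for _ in range(steps):
        j = rng.integers(d)             # symmetric proposal: flip one feature
        proposal = mask.copy()
        proposal[j] = not proposal[j]
        e_new = energy(proposal, X, y, lam)
        # Metropolis rule: accept with probability min(1, exp(-beta * dE)).
        if e_new <= e or rng.random() < np.exp(-beta * (e_new - e)):
            mask, e = proposal, e_new
            accepted += 1
        complexities.append(int(mask.sum()))
    c = np.array(complexities[steps // 2:])  # discard first half as burn-in
    return float(c.mean()), float(c.var()), accepted / steps


if __name__ == "__main__":
    X, y = make_data()
    for lam in [0.01, 0.1, 0.5, 2.0, 5.0]:
        mean_c, var_c, acc = metropolis(X, y, lam=lam, beta=50.0)
        print(f"lam={lam:5.2f}  mean C={mean_c:6.2f}  var C={var_c:6.3f}  accept={acc:.2f}")
```

Sweeping `lam` (and, in a fuller version, `beta`) and looking for where the variance of C is largest is one way to probe for the elbow-like transitions the abstract describes. Because the proposal is symmetric, this acceptance rule satisfies detailed balance with respect to the assumed Gibbs weights.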
