Control of Overfitting with Physics
Sergei V. Kozyrev, Ilya A. Lopatin, Alexander N. Pechen
TL;DR
This work connects overfitting control in machine learning to physics and biology by linking stochastic gradient Langevin dynamics to the Eyring formula and by mapping GAN dynamics to a predator–prey system. It argues that learning preferentially occupies wide, low free-energy minima, with temperature tuning guiding exploration versus exploitation, and that coupling discriminator–generator dynamics further biases toward broad likelihood maxima. The paper introduces a branching random-process extension to model populations of discriminators and generators, and it validates the ideas through simulations on multi-well objectives and a Wine dataset, illustrating reduced overfitting and improved generalization. Together, these analogies provide a theoretical lens for understanding generalization and suggest practical mechanisms for improving stability in SGLD and GAN training.
Abstract
While there are many works on applications of machine learning, few of them attempt to provide a theoretical justification for the efficiency of these methods. In this work, overfitting control (or the generalization property) in machine learning is explained using analogies from physics and biology. For stochastic gradient Langevin dynamics, we show that the Eyring formula of kinetic theory allows one to control overfitting within the algorithmic stability approach: wide minima of the risk function with low free energy correspond to low overfitting. For the generative adversarial network (GAN) model, we establish an analogy between GANs and the predator-prey model in biology. Applying this analogy allows us to explain the selection of wide likelihood maxima and the reduction of overfitting for GANs.
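The claim that Langevin dynamics favors wide minima can be illustrated with a small numerical sketch. The example below is not taken from the paper: the one-dimensional two-well potential, the function name `wide_well_fraction`, and all parameter values are assumptions chosen for illustration, and the full gradient is used in place of a stochastic minibatch gradient. Two parabolic wells of equal depth, one narrow (at x = -1) and one wide (at x = +1), are joined at the barrier; many independent chains start in the narrow well and the fraction that ends in the wide well is reported.

```python
import numpy as np


def wide_well_fraction(k_narrow=50.0, k_wide=2.0, temperature=1.0,
                       step=0.005, n_steps=10_000, n_chains=1_000, seed=0):
    """Fraction of Langevin chains found in the wide well after n_steps.

    Hypothetical toy potential (an assumption, not the paper's objective):
        U(x) = 0.5 * k_narrow * (x + 1)^2   for x <  x_barrier
        U(x) = 0.5 * k_wide   * (x - 1)^2   for x >= x_barrier
    Both wells have depth 0; they differ only in curvature (width).
    """
    rng = np.random.default_rng(seed)

    # Point between the wells where the two parabolas meet (the barrier top).
    r = np.sqrt(k_narrow / k_wide)
    x_barrier = (1.0 - r) / (1.0 + r)

    # Start every chain at the bottom of the narrow well.
    x = np.full(n_chains, -1.0)
    noise_scale = np.sqrt(2.0 * step * temperature)

    for _ in range(n_steps):
        # Gradient of the piecewise-quadratic potential.
        grad = np.where(x < x_barrier,
                        k_narrow * (x + 1.0),
                        k_wide * (x - 1.0))
        # Euler-Maruyama step of overdamped Langevin dynamics.
        x = x - step * grad + noise_scale * rng.standard_normal(n_chains)

    return float(np.mean(x > x_barrier))


if __name__ == "__main__":
    print("fraction of chains in the wide well:", wide_well_fraction())
```

For this toy potential the Gibbs weight of each well scales as the square root of temperature over curvature, so with the default curvatures the wide well is expected to hold roughly five sixths of the chains even though both minima are equally deep. The finite step size perturbs this value somewhat. Lowering `temperature` slows the escape from the narrow well, which is one way to picture the exploration versus exploitation trade-off.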
