A spin-glass model for the loss surfaces of generative adversarial networks
Nicholas P Baskerville, Jonathan P Keating, Francesco Mezzadri, Joseph Najnudel
TL;DR
The authors propose an interacting spin-glass model for GAN loss landscapes by coupling two spherical spin glasses representing the generator and discriminator. They develop a rigorous analysis of the joint complexity and the limiting spectrum of a corresponding Hessian ensemble using Kac-Rice formulae and Random Matrix Theory with a supersymmetric approach, complemented by a Coulomb gas approximation to obtain the asymptotic complexity. Extensions to Hessian-index constrained complexity reveal a two-dimensional banded structure of critical points, offering a qualitative explanation for gradient-descent dynamics and common training outcomes in GANs. Empirical results, including comparisons with DCGAN experiments on CIFAR-10, support the qualitative predictions and demonstrate the potential of physics-inspired models to inform GAN hyperparameter choices and architectural understanding.
Abstract
We present a novel mathematical model that seeks to capture the key design feature of generative adversarial networks (GANs). Our model consists of two interacting spin glasses, and we conduct an extensive theoretical analysis of the complexity of the model's critical points using techniques from Random Matrix Theory. The analysis yields insights into the loss surfaces of large GANs that build upon prior results for simpler networks and also reveal new structure unique to this setting.
