Geometry and Local Recovery of Global Minima of Two-layer Neural Networks at Overparameterization
Leyang Zhang, Yaoyu Zhang, Tao Luo
TL;DR
We show how global minima with zero generalization error become geometrically separated from other global minima as the sample size grows, and we characterize the local convergence properties and rate of gradient flow dynamics.
Abstract
Under mild assumptions, we investigate the geometry of the loss landscape for two-layer neural networks in the vicinity of global minima. Utilizing novel techniques, we demonstrate: (i) how global minima with zero generalization error become geometrically separated from other global minima as the sample size grows; and (ii) the local convergence properties and rate of gradient flow dynamics. Our results indicate that two-layer neural networks can be locally recovered in the regime of overparameterization.
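
As a rough illustration of what "local recovery" means here, the sketch below sets up a toy teacher-student problem and runs plain gradient descent, used as a small-step discretization of gradient flow, from a point near a zero-error global minimum. It is not the authors' construction: the tanh activation, the single-neuron teacher `w_star`, the student width `m`, the sample size `n`, the perturbation scale, and the step size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: input dimension, student width, sample size.
d, m, n = 2, 4, 50
X = rng.normal(size=(n, d))

# Assumed teacher: a single tanh neuron with output weight 1.
w_star = np.array([1.0, -0.5])
y = np.tanh(X @ w_star)

def forward(a, W):
    # Two-layer network: sum_k a_k * tanh(w_k . x), evaluated on all samples.
    return np.tanh(X @ W.T) @ a

def loss(a, W):
    r = forward(a, W) - y
    return 0.5 * np.mean(r ** 2)

def grads(a, W):
    H = np.tanh(X @ W.T)                 # hidden activations, shape (n, m)
    r = H @ a - y                        # residuals, shape (n,)
    ga = H.T @ r / n                     # gradient w.r.t. output weights
    # Gradient w.r.t. w_k: mean_i r_i * a_k * (1 - H_ik^2) * x_i
    gW = ((r[:, None] * (1.0 - H ** 2)) * a[None, :]).T @ X / n
    return ga, gW

# An overparameterized global minimum: every student neuron copies the
# teacher and the output weights sum to 1, so the network equals the teacher.
a0 = np.full(m, 1.0 / m)
W0 = np.tile(w_star, (m, 1))

# Start from a small random perturbation of that minimum.
a = a0 + 0.05 * rng.normal(size=m)
W = W0 + 0.05 * rng.normal(size=(m, d))
initial_loss = loss(a, W)

# Gradient descent with a small step size (assumed value) as a stand-in
# for gradient flow.
lr = 0.1
for _ in range(2000):
    ga, gW = grads(a, W)
    a -= lr * ga
    W -= lr * gW
final_loss = loss(a, W)

print("loss at exact minimum:", loss(a0, W0))
print("initial loss:", initial_loss)
print("final loss:", final_loss)
```

Because the student has more neurons than the teacher, the zero-loss parameters form a set rather than a single point, so the iterates need not return to `(a0, W0)` itself. The sketch only tracks the empirical loss and says nothing about the convergence rates or separation results established in the paper.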
