Geometry of Critical Sets and Existence of Saddle Branches for Two-layer Neural Networks
Leyang Zhang, Yaoyu Zhang, Tao Luo
TL;DR
The paper develops a geometric framework for two-layer neural networks to analyze the full set of critical points representing a given output function. By introducing the critical embedding and critical reduction operators, it shows that non-global critical points form a finite union of branches with a hierarchical, width-dependent structure and provides precise dimension bounds. It also proves that whenever the output function of a width-$m$ network can be represented by a narrower network (one of minimal width $r$ with $r<m$), the corresponding critical set contains saddle branches, which clarifies where saddles arise in the loss landscape. These results lay a rigorous foundation for understanding optimization landscapes and gradient flows in overparameterized two-layer networks, with implications for training behavior and network design.
Abstract
This paper presents a comprehensive analysis of critical point sets in two-layer neural networks. To study such complex entities, we introduce the critical embedding operator and the critical reduction operator as our tools. Given a critical point, we use these operators to uncover the whole underlying critical set representing the same output function, which exhibits a hierarchical structure. Furthermore, we prove the existence of saddle branches for any critical set whose output function can be represented by a narrower network. Our results provide a solid foundation for the further study of optimization and training behavior of neural networks.
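To make the idea of embedding a narrower network into a wider one concrete, the following is a minimal sketch built on one well-known construction, neuron splitting. It is an illustration under stated assumptions and not necessarily the paper's definition of the critical embedding operator: the scalar input, the tanh activation, the squared loss, and the names `forward`, `gradient`, and `split_neuron` are all hypothetical choices made here. Splitting a neuron $(a, w)$ into $(\delta a, w)$ and $((1-\delta) a, w)$ leaves the output function unchanged, and every gradient component of the wider network is a fixed multiple of a gradient component of the narrower one, so a critical point of the narrow network maps to a critical point of the wide network.

```python
import math

def forward(params, x):
    """Two-layer network f(x) = sum_k a_k * tanh(w_k * x), scalar input (assumed setup)."""
    return sum(a * math.tanh(w * x) for a, w in params)

def gradient(params, data):
    """Gradient of L = 0.5 * sum_i (f(x_i) - y_i)^2, returned as [(dL/da_k, dL/dw_k), ...]."""
    grads = []
    for a, w in params:
        g_a = 0.0
        g_w = 0.0
        for x, y in data:
            err = forward(params, x) - y      # residual on this sample
            t = math.tanh(w * x)
            g_a += err * t                    # dL/da_k
            g_w += err * a * (1.0 - t * t) * x  # dL/dw_k
        grads.append((g_a, g_w))
    return grads

def split_neuron(params, k, delta):
    """Hypothetical embedding: replace neuron k = (a, w) by (delta*a, w) and ((1-delta)*a, w)."""
    a, w = params[k]
    return params[:k] + [(delta * a, w), ((1.0 - delta) * a, w)] + params[k + 1:]

# Arbitrary illustrative data and parameters (not taken from the paper).
data = [(-1.0, 0.3), (0.5, -0.2), (2.0, 0.7)]
narrow = [(0.8, -0.4), (-1.1, 0.9)]   # width 2
delta = 0.25
wide = split_neuron(narrow, 0, delta)  # width 3, same output function

g_narrow = gradient(narrow, data)
g_wide = gradient(wide, data)
```

In this sketch `g_wide[0][1]` equals `delta * g_narrow[0][1]` and `g_wide[1][1]` equals `(1 - delta) * g_narrow[0][1]`, while the output-weight gradients of both copies equal `g_narrow[0][0]`. Because $\delta$ is free, a single critical point of the narrow network yields a whole one-parameter family of critical points of the wide network, which gives some intuition for why critical sets come in branches of positive dimension.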
