Exploring Neural Network Landscapes: Star-Shaped and Geodesic Connectivity
Zhanran Lin, Puheng Li, Lei Wu
TL;DR
This work analyzes the geometry of over-parameterized neural network loss landscapes, focusing on mode connectivity, star-shaped connectivity, and geodesic connectivity. It establishes that, under sufficient width, two-layer ReLU networks and linear networks admit $2$-piece linear connections between typical minima, with broader $k$-piece linear ($k$-PL) guarantees and a star-shaped structure in which a single center connects multiple minima via simple paths. The geodesic distance between minima is shown to approach the Euclidean distance as width grows, so that the normalized geodesic distance (NGD) tends to 1; neuron sparsity induced by SGD helps drive NGD toward 1, indicating a landscape that is close to convex. Experiments on MNIST and CIFAR-10 corroborate the theoretical findings, showing practically barrier-free piecewise linear paths through a central minimum and near-1 NGD for wide networks.
Abstract
One of the most intriguing findings about the structure of neural network loss landscapes is the phenomenon of mode connectivity: for two typical global minima, there exists a path connecting them without a barrier. This concept of mode connectivity has played a crucial role in understanding important phenomena in deep learning. In this paper, we conduct a fine-grained analysis of this connectivity phenomenon. First, we demonstrate that in the overparameterized case, the connecting path can be as simple as a two-piece linear path, and the path length can be nearly equal to the Euclidean distance. This finding suggests that the landscape should be nearly convex in a certain sense. Second, we uncover a surprising star-shaped connectivity: for a finite number of typical minima, there exists a center on the minima manifold that connects all of them simultaneously via linear paths. These results are provably valid for linear networks and two-layer ReLU networks under a teacher-student setup, and are empirically supported by models trained on MNIST and CIFAR-10.
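To make the notions of a barrier along a two-piece linear path and of a path length relative to the Euclidean distance concrete, the following is a minimal sketch on a toy landscape. Everything in it is an illustrative assumption rather than the paper's setup: the loss `toy_loss`, the helper names, and the choice of the origin as the center are hypothetical, and the barrier is taken to be the maximum loss along the path minus the larger endpoint loss.

```python
import numpy as np

def toy_loss(theta):
    # Hypothetical toy landscape (not from the paper): the loss is zero
    # exactly on the two coordinate axes, so the minima form a cross.
    return float((theta[0] * theta[1]) ** 2)

def piecewise_linear_path(points, steps_per_piece=101):
    """Sample parameters along the straight segments joining consecutive points."""
    points = [np.asarray(p, dtype=float) for p in points]
    ts = np.linspace(0.0, 1.0, steps_per_piece)[:, None]
    pieces = [(1.0 - ts) * p + ts * q for p, q in zip(points[:-1], points[1:])]
    return np.concatenate(pieces, axis=0)

def barrier(loss_fn, points, steps_per_piece=101):
    """Max loss along the sampled path minus the larger of the endpoint losses."""
    path = piecewise_linear_path(points, steps_per_piece)
    losses = np.array([loss_fn(theta) for theta in path])
    return float(losses.max() - max(losses[0], losses[-1]))

def normalized_length(points):
    """Length of the piecewise linear path divided by the endpoint distance."""
    points = [np.asarray(p, dtype=float) for p in points]
    length = sum(np.linalg.norm(q - p) for p, q in zip(points[:-1], points[1:]))
    return float(length / np.linalg.norm(points[-1] - points[0]))

# Two minima on different axes, and a candidate center on the minima set.
theta_a = np.array([0.0, 2.0])
theta_b = np.array([3.0, 0.0])
center = np.zeros(2)

direct_barrier = barrier(toy_loss, [theta_a, theta_b])
two_piece_barrier = barrier(toy_loss, [theta_a, center, theta_b])
two_piece_ratio = normalized_length([theta_a, center, theta_b])

print("direct barrier:", direct_barrier)
print("two-piece barrier:", two_piece_barrier)
print("two-piece length / Euclidean distance:", two_piece_ratio)
```

On this toy example the straight segment between the two minima crosses a barrier, while the two-piece path through the center stays at zero loss, at the cost of a path that is about 1.39 times longer than the Euclidean distance. The claim in the abstract is that, for sufficiently wide networks, such barrier-free paths exist with a ratio close to 1.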
