Symmetry & Critical Points
Yossi Arjevani
TL;DR
This work develops a geometric mechanism for symmetry breaking (SB) in invariant nonconvex optimization by analyzing the tangency sets $\mho_{\boldsymbol{c}}(f)$ emanating from symmetric critical points. It combines differential topology, jet transversality, and o-minimal definability to show that, generically, the tangency set consists of one-dimensional arcs along which SB occurs, and that critical points connected along these arcs inherit symmetry constraints. The work uses group representation theory to reduce the analysis of the Hessian and of higher-order derivatives via isotypic decompositions, which makes spectral characterizations and invariant-tensor computations tractable, especially for permutation representations such as $(\boldsymbol{R}^d,S_d)$ and $(M(d,d),S_d)$. It also establishes finite determinacy and $\theta$-sufficiency of jets, which reduces the study to finite models, and discusses implications for neural network loss landscapes, including the SB-driven annihilation of minima as network width grows. Together, these results provide a rigorous framework for understanding and predicting the structure and tractability of invariant nonconvex optimization problems, with relevance to learning theory and tensor decompositions.
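To illustrate the isotypic reduction for the permutation representation $(\boldsymbol{R}^d,S_d)$, here is a minimal sketch, not taken from the paper. It assumes a toy $S_d$-invariant function $f(x)=\tfrac14\sum_i x_i^4-\tfrac12\sum_i x_i^2+\tfrac{\lambda}{2}\big(\sum_i x_i\big)^2$; the names `hessian`, `isotypic_eigenvalues`, and `lam` are illustrative. At a fully symmetric point $c\,\mathbf{1}$ the Hessian commutes with every permutation, so it has the form $aI+b\,\mathbf{1}\mathbf{1}^\top$, and its spectrum can be read off from two entries: $a+db$ on the trivial representation (the span of $\mathbf{1}$) and $a$ with multiplicity $d-1$ on the standard representation (vectors summing to zero).

```python
import numpy as np

def hessian(x, lam):
    """Hessian of the toy invariant function (an assumption, not from the paper):
    f(x) = sum(x**4)/4 - sum(x**2)/2 + lam * sum(x)**2 / 2."""
    d = x.size
    return np.diag(3.0 * x**2 - 1.0) + lam * np.ones((d, d))

def isotypic_eigenvalues(H):
    """Eigenvalues of an S_d-equivariant matrix H = a*I + b*ones((d, d)).

    Returns (trivial, standard): the eigenvalue on span{1} and the eigenvalue
    on the (d-1)-dimensional subspace of vectors summing to zero.
    """
    d = H.shape[0]
    b = H[0, 1]            # common off-diagonal entry
    a = H[0, 0] - b        # diagonal entry minus off-diagonal entry
    return a + d * b, a

d, c, lam = 5, 0.7, 0.3
H = hessian(np.full(d, c), lam)      # Hessian at the symmetric point c * 1
triv, std = isotypic_eigenvalues(H)

# Cross-check against a full eigendecomposition.
eigs = np.sort(np.linalg.eigvalsh(H))
expected = np.sort(np.array([std] * (d - 1) + [triv]))
assert np.allclose(eigs, expected)
```

The two-entry formula replaces a $d\times d$ eigenproblem by a computation whose cost does not depend on $d$; this is the kind of saving that isotypic decompositions provide in general, shown here only in its simplest instance.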
Abstract
Critical points of an invariant function may or may not be symmetric. We prove, however, that if a symmetric critical point exists, those adjacent to it are generically symmetry breaking. This mathematical mechanism is shown to carry important implications for our ability to efficiently minimize invariant nonconvex functions, in particular those associated with neural networks.
