Nonlinear Dynamics In Optimization Landscape of Shallow Neural Networks with Tunable Leaky ReLU

Jingzhou Liu

TL;DR

This paper analyzes the nonlinear dynamics of a shallow two-layer teacher–student neural network with Gaussian inputs and leaky ReLU activation. It develops a framework based on the $G$-equivariant gradient degree to detect symmetry-bearing bifurcations of critical points from the global minimum as the leaky slope $\alpha$ varies, proving width-invariant bifurcation thresholds and a multi-mode degeneracy at $\alpha=0$. The main result shows that branches of nontrivial critical points bifurcate at three critical values in $\Lambda$ with four $S_k$-Specht symmetry types, and that in the engineering regime $\alpha\in(0,1)$ these bifurcations are subcritical, preserving symmetry; a detailed $k=5$ numerical example illustrates the four possible symmetry types. Overall, the work clarifies how intrinsic permutation symmetries constrain the optimization landscape of wide shallow networks and provides a predictive, symmetry-based lens for gradient dynamics in such models.

Abstract

In this work, we study the nonlinear dynamics of a shallow neural network trained with mean-squared loss and leaky ReLU activation. Under Gaussian inputs and equal layer width $k$: (1) Based on the equivariant gradient degree, we establish a theoretical framework, applicable to any number of neurons $k \ge 4$, for detecting bifurcations of critical points, together with their associated symmetries, from the global minimum as the leaky parameter $\alpha$ varies. In particular, our analysis reveals that a multi-mode degeneracy consistently occurs at the critical value $\alpha = 0$, independent of $k$. (2) As a by-product, we further show that such bifurcations are width-independent, arise only for nonnegative $\alpha$, and that the global minimum undergoes no further symmetry-breaking instability throughout the engineering regime $\alpha \in (0,1)$. An explicit example with $k = 5$ is presented to illustrate the framework and exhibit the resulting bifurcations together with their symmetries.
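
To make the setting concrete, the following is a minimal sketch of a teacher–student loss of the kind the abstract describes, estimated by Monte Carlo sampling rather than in closed form. Several choices here are assumptions for illustration and are not taken from the paper: the teacher weights are set to the identity matrix, the second-layer weights are fixed to 1, the input dimension equals $k$, and the helper names `leaky_relu` and `population_loss` are hypothetical. The sketch checks two things: the loss vanishes when the student equals the teacher (the global minimum), and the loss is unchanged when the student's neurons are relabeled, which is one of the permutation symmetries the analysis relies on.

```python
import numpy as np

def leaky_relu(z, alpha):
    # Leaky ReLU with tunable negative-side slope alpha.
    return np.where(z > 0, z, alpha * z)

def population_loss(W, V, alpha, n_samples=20000, seed=0):
    # Monte Carlo estimate of
    #   0.5 * E_x[(sum_i s(w_i . x) - sum_j s(v_j . x))^2],  x ~ N(0, I),
    # where s is the leaky ReLU. A fixed seed makes the estimate deterministic.
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n_samples, W.shape[1]))
    student = leaky_relu(X @ W.T, alpha).sum(axis=1)
    teacher = leaky_relu(X @ V.T, alpha).sum(axis=1)
    return 0.5 * np.mean((student - teacher) ** 2)

k = 5          # width used in the paper's explicit example
alpha = 0.1    # a slope inside the regime (0, 1)
V = np.eye(k)  # ASSUMPTION: identity teacher, chosen only for illustration

rng = np.random.default_rng(1)
W = V + 0.3 * rng.standard_normal((k, k))  # a perturbed student
perm = rng.permutation(k)                  # a relabeling of student neurons

loss_at_teacher = population_loss(V, V, alpha)
loss_perturbed = population_loss(W, V, alpha)
loss_row_permuted = population_loss(W[perm], V, alpha)

print("loss at teacher:       ", loss_at_teacher)
print("loss at perturbed W:   ", loss_perturbed)
print("loss after relabeling: ", loss_row_permuted)
```

Because the network output is a sum over neurons, permuting the rows of `W` leaves it unchanged, so the last two printed values agree up to floating-point rounding. Sweeping `alpha` in this sketch and tracking the Hessian at the minimum would be one rough way to look for the degeneracies the abstract mentions, although the paper's own results are obtained analytically through the equivariant gradient degree.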
