A physics-inspired momentum-based gradient method
Jianing Zhang, Rumei Liu
TL;DR
This paper introduces a physics-inspired nonlinear momentum optimization framework by generalizing inertia through an anharmonic kinetic energy $K(v)=\frac{1}{s m^{s-1}} \|v\|_s^s$ and adopting nonlinear damping, resulting in a dissipative Hamiltonian system with energy $E(t)=H(x,p)$ and $\frac{dE}{dt}=-v\cdot D(v)$. The discretized updates $p_k=p_{k-1}-h\nabla V(x_k)-h\nabla \phi(p_{k-1})$, $x_{k+1}=x_k + h \nabla K^*(p_k)$ yield a flexible momentum mechanism that reduces to the Heavy Ball method when $s=2$ and can attenuate high-speed oscillations for $s\approx 1$. The authors demonstrate both theoretical energy decay and practical gains on nonconvex tasks, including a Rosenbrock benchmark and a nanophotonic inverse-design problem, where the nonlinear momentum method achieves faster convergence and greater robustness (e.g., $J(x_*)\approx 0.72$ vs $\sim 0.45$ for a homogeneous sphere). These results underscore the value of physics-inspired dynamical regularization for efficient optimization in high-dimensional, constrained, and nonconvex settings, with potential extensions to mirror-descent hybrids and Hamiltonian Monte Carlo frameworks.
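The discretized updates quoted above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation. The summary does not spell out the velocity map $\nabla K^*$ or the damping potential $\phi$, so the sketch assumes an elementwise velocity $v=\mathrm{sign}(p)\,|p|^{s-1}/m^{s-1}$ (which gives $v=p/m$ at $s=2$ and a bounded velocity as $s\to 1$) and a power-law damping potential $\phi(p)=\frac{\gamma}{q}\|p\|_q^q$. The names `velocity_map`, `nonlinear_momentum`, `gamma`, and `q` are hypothetical.

```python
import numpy as np

def velocity_map(p, s=2.0, m=1.0):
    """Assumed stand-in for grad K^*(p): sign(p) * |p|^(s-1) / m^(s-1), elementwise.

    Intended for s >= 1. At s = 2 this is the usual v = p / m.
    """
    return np.sign(p) * np.abs(p) ** (s - 1.0) / m ** (s - 1.0)

def nonlinear_momentum(grad_V, x0, h=0.1, s=2.0, m=1.0, gamma=1.0, q=2.0, steps=100):
    """Iterate p_k = p_{k-1} - h grad V(x_k) - h grad phi(p_{k-1}),
    x_{k+1} = x_k + h grad K^*(p_k), starting from zero momentum."""
    x = np.asarray(x0, dtype=float).copy()
    p = np.zeros_like(x)
    for _ in range(steps):
        # Gradient of the assumed damping potential phi(p) = gamma/q * ||p||_q^q.
        # For q = 2 this is the linear damping gamma * p.
        damping = gamma * np.sign(p) * np.abs(p) ** (q - 1.0)
        p = p - h * grad_V(x) - h * damping
        x = x + h * velocity_map(p, s, m)
    return x

# Usage on a simple ill-conditioned quadratic V(x) = 0.5 * sum(a_i * x_i^2).
a = np.array([1.0, 10.0])
grad_V = lambda x: a * x
x_start = np.array([1.0, -2.0])
x_quadratic = nonlinear_momentum(grad_V, x_start, s=2.0, steps=500)
x_anharmonic = nonlinear_momentum(grad_V, x_start, s=1.5, steps=500)
```

Under these assumptions, setting $s=2$ and $q=2$ gives $x_{k+1}-x_k=(1-h\gamma)(x_k-x_{k-1})-\frac{h^2}{m}\nabla V(x_k)$, which is the Heavy Ball recursion with momentum coefficient $1-h\gamma$ and step size $h^2/m$.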
Abstract
This work introduces a nonlinear momentum method intended to improve the convergence of momentum-based gradient optimization algorithms. The method is motivated by the dynamics of non-Newtonian mechanical systems: conventional momentum schemes can be interpreted as a dynamical model with quadratic kinetic energy and linear damping. Building on this analogy, a generalized optimization dynamics is constructed by extending the kinetic energy formulation and incorporating a nonlinear damping term. An anharmonic kinetic energy function represents the inertial effect of the gradient information accumulated over the iterations, while the nonlinear damping mechanism allows more flexible control of the momentum contribution along the convergence trajectory. Numerical experiments indicate that the method converges faster and is more robust than classical momentum algorithms. Moreover, its strong performance on nonconvex objectives makes it particularly suitable for inverse photonic design problems. The results suggest that dynamical systems from physics offer a useful perspective for developing efficient optimization methods.
