From Adam to Adam-Like Lagrangians: Second-Order Nonlocal Dynamics
Carlos Heredia
TL;DR
The paper addresses the lack of a principled dynamical understanding of Adam by formulating a second-order, nonlocal continuous-time model whose causal memory kernels capture the influence of past gradients. As $α\to0$, this accelerated integro-differential equation (IDE) reduces to the established first-order nonlocal Adam flow on fixed horizons away from the initial time, with a quantified perturbation governed by $ρ=\max\{α, (α/(1-β_1))^2, α/(1-β_2)\}$. A Lyapunov-based stability framework yields dissipation and convergence results under standard smoothness assumptions; under Polyak-Łojasiewicz (PL) and Kurdyka-Łojasiewicz (KL) structures, it gives exponential or rate-based decay up to $O(ρ)$-dependent neighborhoods. A nonlocal Lagrangian viewpoint additionally provides an ideal, reciprocity-guided variational blueprint for optimizer design. Numerical experiments on Rosenbrock-type landscapes validate the model: the second-order nonlocal dynamics closely track discrete Adam and offer improved accuracy in the small-step regime, while illustrating memory-related behavior such as basin transitions and moment positivity constraints.
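To get a feel for the perturbation scale $ρ$, the following minimal sketch evaluates its three terms for a few step sizes. The function name `perturbation_scale` is hypothetical. The values $β_1=0.9$ and $β_2=0.999$ are the commonly used Adam defaults and are assumed here for illustration, not taken from the paper.

```python
def perturbation_scale(alpha, beta1, beta2):
    """Evaluate rho = max{alpha, (alpha/(1-beta1))^2, alpha/(1-beta2)}.

    Hypothetical helper: returns rho together with the three candidate terms.
    """
    if not (alpha > 0 and 0 <= beta1 < 1 and 0 <= beta2 < 1):
        raise ValueError("need alpha > 0 and beta1, beta2 in [0, 1)")
    terms = {
        "alpha": alpha,
        "(alpha/(1-beta1))^2": (alpha / (1.0 - beta1)) ** 2,
        "alpha/(1-beta2)": alpha / (1.0 - beta2),
    }
    return max(terms.values()), terms


if __name__ == "__main__":
    # Assumed hyperparameters (common Adam defaults, not from the paper).
    for alpha in (1e-2, 1e-3, 1e-4):
        rho, terms = perturbation_scale(alpha, beta1=0.9, beta2=0.999)
        dominant = max(terms, key=terms.get)
        print(f"alpha={alpha:g}  rho={rho:.3g}  dominant term: {dominant}")
```

For these assumed values the term $α/(1-β_2)$ is the largest in every case, so under this choice $ρ$ is small only when $α$ is small relative to $1-β_2$.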
Abstract
In this paper, we derive an accelerated continuous-time formulation of Adam by modeling it as a second-order integro-differential dynamical system. We relate this inertial nonlocal model to an existing first-order nonlocal Adam flow through an $α$-refinement limit, and we provide Lyapunov-based stability and convergence analyses. We also introduce an Adam-inspired nonlocal Lagrangian formulation, offering a variational viewpoint. Numerical simulations on Rosenbrock-type examples show agreement between the proposed dynamics and discrete Adam.
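For reference, the discrete baseline in such comparisons is the standard Adam update. The sketch below runs it on the classical two-dimensional Rosenbrock function. The function form ($a=1$, $b=100$), the starting point, the hyperparameters, and the step count are all assumptions made for illustration, since the exact experimental setup is not given in this summary. It does not implement the second-order nonlocal model itself.

```python
import math


def rosenbrock(x, y, a=1.0, b=100.0):
    """Classical Rosenbrock function (assumed form; minimum at (a, a^2))."""
    return (a - x) ** 2 + b * (y - x * x) ** 2


def rosenbrock_grad(x, y, a=1.0, b=100.0):
    """Analytic gradient of the Rosenbrock function above."""
    dx = -2.0 * (a - x) - 4.0 * b * x * (y - x * x)
    dy = 2.0 * b * (y - x * x)
    return dx, dy


def adam(grad, theta0, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Standard discrete Adam with bias correction (assumed default settings)."""
    theta = list(theta0)
    m = [0.0] * len(theta)  # first-moment (mean) estimate
    v = [0.0] * len(theta)  # second-moment (uncentered variance) estimate
    for k in range(1, steps + 1):
        g = grad(*theta)
        for i in range(len(theta)):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g[i]
            v[i] = beta2 * v[i] + (1.0 - beta2) * g[i] ** 2
            m_hat = m[i] / (1.0 - beta1 ** k)  # bias-corrected first moment
            v_hat = v[i] / (1.0 - beta2 ** k)  # bias-corrected second moment
            theta[i] -= alpha * m_hat / (math.sqrt(v_hat) + eps)
    return theta


if __name__ == "__main__":
    start = (-1.5, 2.0)  # hypothetical initial point
    final = adam(rosenbrock_grad, start)
    print("f(start) =", rosenbrock(*start))
    print("f(final) =", rosenbrock(*final))
```

A continuous-time model would be compared against the iterates produced by `adam` at matching times $t = kα$; that identification is the usual convention for step-size limits and is likewise an assumption here.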
