Point Convergence of Nesterov's Accelerated Gradient Method: An AI-Assisted Proof
Uijeong Jang, Ernest K. Ryu
TL;DR
The paper proves point convergence of Nesterov's accelerated gradient method (NAG) for convex optimization by analyzing the generalized continuous-time model $\ddot{X}(t)+\frac{r}{t}\dot{X}(t)+\nabla f(X(t))=0$ and carrying the resulting insights over to discrete-time algorithms. In continuous time, it establishes convergence for the critical damping case $r=3$, gives partial results for $r\in(1,3)$, and shows divergence for $r\in(0,1]$. In discrete time, it shows that the iterates of both NAG and the optimized gradient method (OGM) converge to a minimizer under standard parameter schedules. The authors also document how ChatGPT assisted in discovering the proof, illustrating how AI can accelerate mathematical exploration while verification remains a separate, human task. The results clarify the point-convergence behavior of accelerated methods and are relevant to the practical use of Nesterov-style algorithms.
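To make "point convergence" concrete, here is a minimal Python sketch of a NAG iteration on a convex quadratic whose minimizers form a whole line, so the question is whether the iterates settle on one particular minimizer. The momentum schedule $k/(k+3)$, the step size $1/L$, the function name `nag`, and the test problem are all illustrative assumptions, not taken from the paper.

```python
def nag(grad, x0, L, num_iters):
    """Sketch of Nesterov's accelerated gradient method.

    Assumed form (a common textbook variant, not necessarily the paper's):
        x_{k+1} = y_k - (1/L) * grad(y_k)
        y_{k+1} = x_{k+1} + k/(k+3) * (x_{k+1} - x_k)
    """
    x_prev = list(x0)
    y = list(x0)
    for k in range(num_iters):
        g = grad(y)
        # Gradient step from the extrapolated point y_k.
        x = [yi - gi / L for yi, gi in zip(y, g)]
        # Momentum (extrapolation) step with the assumed k/(k+3) schedule.
        beta = k / (k + 3)
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
    return x_prev


# Hypothetical test problem: f(x) = 0.5 * (1.0*x[0]**2 + 0.1*x[1]**2 + 0.0*x[2]**2).
# f is convex with L = 1, and every point (0, 0, c) is a minimizer.
EIGS = (1.0, 0.1, 0.0)


def grad_f(x):
    return [e * xi for e, xi in zip(EIGS, x)]


x_final = nag(grad_f, [1.0, 1.0, 1.0], L=1.0, num_iters=2000)
print(x_final)
```

In this toy problem the third coordinate never moves, because the gradient vanishes along it, so the iterates approach the single minimizer $(0, 0, 1)$. The general convex case is not this simple, and that general case is what the paper addresses.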
Abstract
The Nesterov accelerated gradient method, introduced in 1983, has been a cornerstone of optimization theory and practice. Yet the question of its point convergence had remained open. In this work, we resolve this longstanding open problem in the affirmative. The discovery of the proof was heavily assisted by ChatGPT, a proprietary large language model, and we describe the process through which its assistance was
