Nesterov's accelerated gradient for unbounded convex functions finds the minimum-norm point in the dual space
Keiya Sakabe
TL;DR
This paper studies first-order methods for lower-unbounded convex functions, where $\inf f = -\infty$, and shows that the trajectories of gradient methods diverge in a direction governed by $p^\star$, the minimum-norm point of $\overline{\mathrm{dom}\, f^*}$ in the dual space. By linking primal optimization to the dual norm-minimization problem $\min_{p\in\mathrm{dom} f^*} \|p\|^2/2$, the paper reinterprets gradient descent as mirror descent on the dual problem, yielding $\|\nabla f(x_k)-p^\star\|^2 = O(k^{-1})$; with Nesterov's acceleration, the dual sequences $p^{(k)}$ and $q^{(k)}$ converge to $p^\star$ at $O(k^{-2})$. The discrete accelerated method thus solves both the primal minimization and the dual norm-minimization problem at the same $O(k^{-2})$ rate, providing quantitative divergence rates and faster unboundedness certificates. The analysis extends to continuous-time accelerated mirror descent (AMD), where it yields a dual correspondence with the ODE of Nesterov's accelerated gradient (NAG). Numerical results on geometric programming and ellipsoidal projection illustrate the predicted primal-dual dynamics and convergence behavior. Overall, the work offers a unified duality-based framework for detecting and certifying unboundedness while achieving accelerated convergence in the dual space.
Abstract
We study the behavior of first-order methods applied to a lower-unbounded convex function $f$, i.e., $\inf f = -\infty$. Such a setting has received little attention since the trajectories of gradient descent and Nesterov's accelerated gradient method diverge. In this paper, we establish quantitative convergence results describing their speeds and directions of divergence, with implications for unboundedness judgment. A key idea is a relation to a norm-minimization problem in the dual space: minimize $\|p\|^2/2$ over $p \in \mathrm{dom}f^\ast$, which can be naturally solved via mirror descent by taking the Legendre--Fenchel conjugate $f^\ast$ as the distance-generating function. It then turns out that gradient descent for $f$ coincides with mirror descent for this norm-minimization problem, and thus it simultaneously solves both problems at $\mathcal{O}(k^{-1})$. This result admits acceleration; Nesterov's accelerated gradient method, without any modifications, simultaneously solves the original minimization and the dual norm-minimization problems at $\mathcal{O}(k^{-2})$, providing a quantitative characterization of divergence in unbounded convex optimization.
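The claim that gradient descent on an unbounded $f$ drives $\nabla f(x_k)$ toward the minimum-norm point of $\overline{\mathrm{dom}f^\ast}$ can be checked on a small example. The sketch below is not taken from the paper: the function $f(x) = \log(e^{\langle a_1, x\rangle} + e^{\langle a_2, x\rangle})$, the vectors `a1 = (2, 0)` and `a2 = (0, 1)`, the step size, and all helper names are assumptions chosen for illustration. For this $f$, the gradients lie on the segment between $a_1$ and $a_2$, which does not contain the origin, so $\inf f = -\infty$ and the minimum-norm point has a closed form.

```python
import numpy as np

# Hypothetical toy instance (not from the paper):
# f(x) = log(exp(<a1, x>) + exp(<a2, x>)), with rows of A equal to a1, a2.
A = np.array([[2.0, 0.0],
              [0.0, 1.0]])

def softmax(z):
    # Numerically stable softmax.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def f(x):
    # Numerically stable log-sum-exp of A @ x.
    z = A @ x
    m = z.max()
    return m + np.log(np.exp(z - m).sum())

def grad_f(x):
    # The gradient is a convex combination of a1 and a2.
    return A.T @ softmax(A @ x)

def min_norm_point_on_segment(a1, a2):
    # Closed-form minimizer of ||a2 + t (a1 - a2)||^2 over t in [0, 1].
    d = a1 - a2
    t = np.clip(-(a2 @ d) / (d @ d), 0.0, 1.0)
    return a2 + t * d

def gradient_descent(x0, eta, steps):
    # Plain gradient descent with a constant step size.
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        x = x - eta * grad_f(x)
    return x

eta, steps = 0.25, 2000          # assumed step size and iteration count
p_star = min_norm_point_on_segment(A[0], A[1])
x_final = gradient_descent([0.0, 0.0], eta, steps)

print("min-norm point p*      :", p_star)
print("gradient at last iterate:", grad_f(x_final))
print("x_k / (eta * k)         :", x_final / (eta * steps))
print("f(x_k)                  :", f(x_final))
```

In this instance the iterates diverge while the gradient settles at $p^\star$, and the normalized iterate $x_k/(\eta k)$ approaches $-p^\star$, which matches the divergence direction described above. The example only exercises plain gradient descent; it says nothing about the accelerated rates.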
