A Proof of the Exact Convergence Rate of Gradient Descent
Jungbin Kim
TL;DR
This work determines the exact worst-case convergence rate of gradient descent with a fixed stepsize $\gamma\in(0,2/L)$ on $\mu$-strongly convex, $L$-smooth functions. The author develops a duality-based framework built on the performance estimation problem (PEP) and reveals a mirror relation between the primal and dual analyses via the anti-transpose, which enables a semidefinite reformulation that certifies the rate. The main result is an explicit bound of the form $f(x_N)-f_* \le \tau\|x_0-x_*\|^2$ with $\tau = \frac{L}{2}\max\left\{ \frac{1}{1+\gamma L E_N(1-\gamma\mu)}, \frac{1}{1+\gamma L E_N(1-\gamma L)}\right\}$ (equivalently expressed via $E_N$ and $\eta,\rho$). The paper proves this upper bound together with a matching lower bound, thereby establishing the exact rate. This resolves the conjectures of Drori–Teboulle (for $\mu=0$) and Taylor–Hendrickx–Glineur (for $\mu>0$). The work also provides constructive multiplier selections and computational tools (e.g., MATLAB code) to verify the results, with implications for understanding first-order methods and guiding rate-optimal algorithm design.
Abstract
We prove the exact worst-case convergence rate of gradient descent for smooth strongly convex optimization on $\mathbb{R}^d$. Concretely, assuming that the objective function $f$ is $\mu$-strongly convex and $L$-smooth, we identify the smallest possible value of $\tau$ for which the inequality $f(x_{N})-f_{*}\leq\tau\|x_{0}-x_{*}\|^{2}$ always holds. The result was previously conjectured by Drori and Teboulle for the case $\mu=0$, and by Taylor, Hendrickx, and Glineur for the case $\mu>0$.
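To make the quantity $\tau$ concrete, the following minimal Python sketch runs fixed-stepsize gradient descent on a separable quadratic and reports the ratio $(f(x_N)-f_*)/\|x_0-x_*\|^2$. It is an illustration only and is not taken from the paper: the function name `gradient_descent_gap_ratio`, the test function, and the parameter values are assumptions chosen for the example. Any ratio measured this way is merely a lower bound on the smallest valid $\tau$, since a quadratic need not be the worst-case function.

```python
def gradient_descent_gap_ratio(curvatures, x0, gamma, num_steps):
    """Run x_{k+1} = x_k - gamma * grad f(x_k) on the separable quadratic
    f(x) = sum_i (c_i / 2) * x_i**2, whose minimizer is x_* = 0 with f_* = 0,
    and return (f(x_N) - f_*) / ||x_0 - x_*||^2."""
    x = list(x0)
    for _ in range(num_steps):
        # The gradient of f at x has components c_i * x_i.
        x = [xi - gamma * c * xi for c, xi in zip(curvatures, x)]
    gap = sum(0.5 * c * xi * xi for c, xi in zip(curvatures, x))
    dist_sq = sum(xi * xi for xi in x0)
    return gap / dist_sq


# Hypothetical parameters; gamma must lie in (0, 2/L).
mu, L, gamma, N = 0.1, 1.0, 1.5, 5

# With curvatures in [mu, L], f is mu-strongly convex and L-smooth.
# Start along the L-curvature direction, then along the mu-curvature direction.
ratio_L = gradient_descent_gap_ratio([mu, L], [0.0, 1.0], gamma, N)
ratio_mu = gradient_descent_gap_ratio([mu, L], [1.0, 0.0], gamma, N)

# Closed forms for a one-dimensional quadratic with curvature c:
# (c / 2) * (1 - gamma * c) ** (2 * N).
closed_L = 0.5 * L * (1.0 - gamma * L) ** (2 * N)
closed_mu = 0.5 * mu * (1.0 - gamma * mu) ** (2 * N)

print(ratio_L, closed_L)
print(ratio_mu, closed_mu)
```

Each starting point isolates one curvature direction, so the measured ratio should agree with the closed form $\frac{c}{2}(1-\gamma c)^{2N}$ for $c=L$ and $c=\mu$ respectively, up to floating-point error.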
