A Proof of the Exact Convergence Rate of Gradient Descent

Jungbin Kim

TL;DR

This work determines the exact worst-case convergence rate of gradient descent with a fixed stepsize $\gamma\in(0,2/L)$ on $\mu$-strongly convex and $L$-smooth functions. The author develops a duality-based framework built on the performance estimation problem (PEP) and identifies a mirror relation between the primal and dual analyses via the anti-transpose, which yields a semidefinite reformulation that certifies the rate. The main result is an explicit bound of the form $f(x_N)-f_* \le \tau\|x_0-x_*\|^2$ with $\tau = L\max\left\{ \frac{1}{1+\gamma L E_N(1-\gamma\mu)}, \frac{1}{1+\gamma L E_N(1-\gamma L)}\right\}$ (equivalently expressed via $E_N$ and $\eta,\rho$); matching upper and lower bounds are proved, which establishes the exact rate. The result resolves the longstanding conjectures of Drori–Teboulle (for $\mu=0$) and Taylor–Hendrickx–Glineur (for $\mu>0$). The paper also provides constructive multiplier selections and computational tools (e.g., MATLAB code) to verify the results, with implications for understanding first-order methods and guiding rate-optimal algorithm design.

Abstract

We prove the exact worst-case convergence rate of gradient descent for smooth strongly convex optimization on $\mathbb{R}^d$. Concretely, assuming that the objective function $f$ is $\mu$-strongly convex and $L$-smooth, we identify the smallest possible value of $\tau$ for which the inequality $f(x_{N})-f_{*}\leq \tau\|x_{0}-x_{*}\|^{2}$ always holds. The result was previously conjectured by Drori and Teboulle for the case $\mu=0$, and by Taylor, Hendrickx, and Glineur for the case $\mu>0$.
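
To make the quantity in this inequality concrete, here is a minimal Python sketch (not taken from the paper) that runs fixed-stepsize gradient descent on a separable quadratic $f(x)=\tfrac{1}{2}\sum_i c_i x_i^2$ and reports the ratio $(f(x_N)-f_*)/\|x_0-x_*\|^2$. The function names and the sample values of `mu`, `L`, `gamma`, and `N` are illustrative assumptions. The quadratic is only one member of the function class, so the ratio it produces is a lower bound on the worst-case $\tau$, not the exact rate.

```python
def gradient_descent_quadratic(curvatures, x0, gamma, num_steps):
    """Iterate x_{k+1} = x_k - gamma * grad f(x_k) for f(x) = 0.5 * sum(c_i * x_i**2)."""
    x = list(x0)
    for _ in range(num_steps):
        # The gradient of the separable quadratic is c_i * x_i in coordinate i.
        x = [xi - gamma * c * xi for c, xi in zip(curvatures, x)]
    return x


def objective(curvatures, x):
    """Evaluate f(x) = 0.5 * sum(c_i * x_i**2); its minimizer is x_* = 0 with f_* = 0."""
    return 0.5 * sum(c * xi * xi for c, xi in zip(curvatures, x))


def observed_ratio(curvatures, x0, gamma, num_steps):
    """Return (f(x_N) - f_*) / ||x_0 - x_*||^2 for the quadratic above."""
    x_n = gradient_descent_quadratic(curvatures, x0, gamma, num_steps)
    return objective(curvatures, x_n) / sum(xi * xi for xi in x0)


if __name__ == "__main__":
    # Illustrative values only (assumptions, not from the paper).
    mu, L = 0.1, 1.0
    gamma, N = 1.5, 5  # stepsize inside (0, 2/L)
    for c in (mu, L):
        print(f"curvature {c}: ratio {observed_ratio([c], [1.0], gamma, N):.6e}")
```

For a one-dimensional quadratic with curvature $c\in[\mu,L]$, the ratio equals $\tfrac{c}{2}(1-\gamma c)^{2N}$ in closed form, which makes it easy to see how the two extreme curvatures $c=\mu$ and $c=L$ compete as $\gamma$ varies.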

Paper Structure

This paper contains 27 sections, 23 theorems, 105 equations, 1 table.

Key Result

Proposition 1.1

When $f\in{\mathcal{F}}_{\mu,L}$, the interpolation inequality holds for all $p,q\in\mathbb{R}^d$.
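
The inequality itself is not reproduced in this summary. As a hedged illustration, the sketch below assumes the standard interpolation inequality for smooth strongly convex functions from the literature (Taylor, Hendrickx, and Glineur), written out in the code comment, and checks it numerically on a separable quadratic whose curvatures lie in $[\mu,L]$. The helper name `interpolation_gap` and the choice of test function are assumptions made for illustration, and the paper's own statement may be arranged differently.

```python
import random


def interpolation_gap(curvatures, mu, L, p, q):
    """Left side minus right side of the assumed interpolation inequality

        f(p) >= f(q) + <g_q, p - q>
                + ( ||g_p - g_q||^2 / L + mu * ||p - q||^2
                    - 2 * (mu / L) * <g_p - g_q, p - q> ) / (2 * (1 - mu / L)),

    evaluated for f(x) = 0.5 * sum(c_i * x_i**2). Requires mu < L.
    A nonnegative return value means the inequality holds at (p, q).
    """
    def f(x):
        return 0.5 * sum(c * xi * xi for c, xi in zip(curvatures, x))

    def grad(x):
        return [c * xi for c, xi in zip(curvatures, x)]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    g_p, g_q = grad(p), grad(q)
    d = [pi - qi for pi, qi in zip(p, q)]      # p - q
    g = [a - b for a, b in zip(g_p, g_q)]      # grad f(p) - grad f(q)
    quad = (dot(g, g) / L + mu * dot(d, d) - 2 * (mu / L) * dot(g, d)) / (2 * (1 - mu / L))
    return f(p) - f(q) - dot(g_q, d) - quad


if __name__ == "__main__":
    random.seed(0)
    mu, L = 0.1, 1.0
    curvatures = [0.1, 0.5, 1.0]  # all inside [mu, L]
    worst = min(
        interpolation_gap(
            curvatures, mu, L,
            [random.uniform(-1, 1) for _ in curvatures],
            [random.uniform(-1, 1) for _ in curvatures],
        )
        for _ in range(1000)
    )
    print("smallest gap over random pairs:", worst)
```

For this quadratic the gap reduces to $\tfrac{1}{2}\sum_i \frac{(c_i-\mu)(L-c_i)}{L-\mu}(p_i-q_i)^2$, so it is nonnegative exactly when every curvature lies in $[\mu,L]$, and it turns negative once a curvature leaves that interval.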

Theorems & Definitions (40)

  • Proposition 1.1
  • Proposition 1.2
  • Theorem 1.3 (Drori and Teboulle, 2014; Taylor et al., 2017)
  • Proposition 1.4 (Drori and Teboulle, 2014; Taylor et al., 2017)
  • Theorem 1.5 (Rotaru et al., 2024)
  • Proposition 1.6 (Rotaru et al., 2024)
  • Proposition 2.1
  • proof
  • Proposition 2.2
  • proof
  • ...and 30 more