Gradient Methods with Online Scaling

Wenzhi Gao; Ya-Chi Chu; Yinyu Ye; Madeleine Udell

TL;DR

A framework that uses online learning to accelerate the convergence of gradient-based methods. An online learning algorithm learns to scale the gradient at each iteration, and the resulting methods are provably accelerated asymptotically.

Abstract

We introduce a framework to accelerate the convergence of gradient-based methods with online learning. The framework learns to scale the gradient at each iteration through an online learning algorithm and provably accelerates gradient-based methods asymptotically. In contrast with previous literature, where convergence is established based on worst-case analysis, our framework provides a strong convergence guarantee with respect to the optimal scaling matrix for the iteration trajectory. For smooth strongly convex optimization, our results provide an $O(\kappa^\star \log(1/\varepsilon))$ complexity result, where $\kappa^\star$ is the condition number achievable by the optimal preconditioner, improving on the previous $O(\sqrt{n}\kappa^\star \log(1/\varepsilon))$ result. In particular, a variant of our method achieves superlinear convergence on convex quadratics. For smooth convex optimization, we show for the first time that the widely-used hypergradient descent heuristic improves on the convergence of gradient descent.
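To make the idea of learning a gradient scaling online more concrete, the following is a minimal sketch in the spirit of the hypergradient descent heuristic. It is not the algorithm analyzed in the paper. The diagonal scaling `p`, the surrogate loss $h(p) = (f(x - p \circ g) - f(x)) / \|g\|^2$ with $g = \nabla f(x)$, the online gradient descent update on `p`, the step size `eta`, and the function name `online_scaled_gd` are all assumptions made for illustration.

```python
import numpy as np


def online_scaled_gd(grad, x0, eta, num_iters=200, tol=1e-30):
    """Gradient descent with a diagonal scaling learned online (illustrative).

    Assumed update (not taken from the paper):
        x_{k+1} = x_k - p_k * g_k,  with g_k = grad(x_k)
        p_{k+1} = p_k + eta * g_{k+1} * g_k / ||g_k||^2
    The second line is one online gradient descent step on the assumed
    surrogate loss h(p) = (f(x_k - p * g_k) - f(x_k)) / ||g_k||^2.
    """
    x = np.asarray(x0, dtype=float).copy()
    p = np.zeros_like(x)  # diagonal scaling, initialized at zero
    for _ in range(num_iters):
        g = grad(x)
        gnorm2 = float(g @ g)
        if gnorm2 < tol:  # avoid dividing by a vanishing gradient norm
            break
        x_new = x - p * g  # scaled gradient step
        g_new = grad(x_new)
        # d h / d p = -(g_new * g) / ||g||^2, so descent on h adds this term.
        p = p + eta * (g_new * g) / gnorm2
        x = x_new
    return x, p


# Ill-conditioned convex quadratic f(x) = 0.5 * sum(a_i * x_i^2).
a = np.array([1.0, 10.0, 100.0])


def f(x):
    return 0.5 * float(np.sum(a * x * x))


def grad_f(x):
    return a * x


x_final, p_final = online_scaled_gd(grad_f, np.ones(3), eta=0.01)

# Baseline: plain gradient descent with step size 1/L, where L = max(a).
x_gd = np.ones(3)
for _ in range(200):
    x_gd = x_gd - (1.0 / a.max()) * grad_f(x_gd)
```

On this toy problem each entry of `p` moves toward the reciprocal of the corresponding curvature `a_i`, which is the ideal diagonal preconditioner here, so the learned scaling can outpace the fixed step size `1/L`. The step size `eta = 0.01` was chosen by hand to keep the update on `p` stable for this particular quadratic and is not a recommendation from the source.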
