Analysis of Gradient Descent with Varying Step Sizes using Integral Quadratic Constraints

Ram Padmanabhan; Peter Seiler

TL;DR

Gradient descent with a varying step size is modeled as a linear parameter-varying (LPV) system. A parameterized linear matrix inequality (LMI) condition is then constructed to certify algorithm performance, and it is solved using a result for polytopic LPV systems.

Abstract

The framework of Integral Quadratic Constraints (IQCs) is used to analyze gradient descent with varying step sizes. Two performance metrics are considered: convergence rate and noise amplification. We assume that the step size is produced by a line search and varies in a known interval. Modeling the algorithm as a linear parameter-varying (LPV) system, we construct a parameterized linear matrix inequality (LMI) condition that certifies algorithm performance, and we solve it using a result for polytopic LPV systems. Our results provide convergence rate guarantees when the step size lies within a restricted interval. Moreover, we recover existing rate bounds when this interval reduces to a single point, i.e., a constant step size. Finally, we note that the convergence rate depends only on the condition number of the problem. In contrast, the noise amplification performance depends on the individual values of the strong convexity and smoothness parameters, and varies inversely with them for a fixed condition number.
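As an illustration of the kind of vertex-based LMI check described in the abstract, the following is a minimal sketch, not the paper's actual condition. It assumes a scalar reduction of the problem ($A = 1$, $B = -\alpha$, $C = 1$), a single Lyapunov weight normalized to $p = 1$, a single multiplier $\lambda$ shared by both endpoints of the step-size interval, and the standard sector constraint for $f \in \mathcal{S}(m, L)$. Under these assumptions the $2 \times 2$ LMI is convex in $\alpha$, so checking the two endpoints is enough, and a Schur complement gives the smallest feasible $\rho^2$ for each $\lambda$ in closed form. The function names and the grid over $\lambda$ are hypothetical choices made for this example.

```python
import numpy as np

def rate_sq_for_step(alpha, lam, m, L):
    """Smallest rho^2 for which the 2x2 LMI
        [[1 - rho^2 - 2*m*L*lam,  -alpha + (L+m)*lam],
         [-alpha + (L+m)*lam,      alpha^2 - 2*lam   ]]  <= 0
    holds at one step size (Lyapunov weight normalized to p = 1).
    Obtained from a Schur complement; requires 2*lam > alpha^2."""
    d = 2.0 * lam - alpha**2
    b = (L + m) * lam - alpha
    return 1.0 - 2.0 * m * L * lam + b**2 / d

def certified_rate(alpha_min, alpha_max, m, L, n_grid=4001):
    """Rate rho certified for every step size in [alpha_min, alpha_max].
    The same multiplier lam must work at both endpoints (vertices);
    lam is searched on a heuristic logarithmic grid (an assumption
    of this sketch, not a tuned or guaranteed-complete search)."""
    lam_floor = 0.5 * alpha_max**2  # keeps the (2,2) entry negative
    lams = lam_floor + np.logspace(-8, 0, n_grid) / (m * L)
    worst = np.maximum(rate_sq_for_step(alpha_min, lams, m, L),
                       rate_sq_for_step(alpha_max, lams, m, L))
    return float(np.sqrt(max(worst.min(), 0.0)))

if __name__ == "__main__":
    m, L = 1.0, 10.0
    # Interval collapsed to a point: constant step size alpha = 1/L.
    print(certified_rate(0.1, 0.1, m, L))
    # Step size allowed to vary in [0.10, 0.15].
    print(certified_rate(0.10, 0.15, m, L))
```

When the interval collapses to the single point $\alpha = 1/L$ with $m = 1$, $L = 10$, this sketch should return approximately $0.9 = \max(|1 - \alpha m|, |1 - \alpha L|)$, the known constant-step-size rate. A wider interval can only give a rate at least as large, since one multiplier must serve both endpoints.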

Paper Structure (16 sections, 5 theorems, 43 equations, 4 figures)

Introduction
Preliminaries
Notation
Problem Formulation
Integral Quadratic Constraints
Analysis with Varying Step Sizes
Convergence Rate
Noise Amplification
Numerical Implementation
Results
Reduction to a Constant Step Size
Results for a Varying Step Size
Concluding Remarks
Proof of Analytical Results
Proof of Proposition \ref{prop:CR}
...and 1 more section

Key Result

Lemma 1

Let $f \in \mathcal{S}(m, L)$ and $\phi = \nabla f$. Then, $\phi$ satisfies the pointwise IQC defined by:
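The defining inequality is not reproduced on this page. For orientation only, the sector IQC for gradients of $m$-strongly convex, $L$-smooth functions is commonly stated in the IQC literature in the form below. Here $u = \phi(y)$, $u^\star = \phi(y^\star)$ for a reference point $y^\star$, and $I_n$ is the $n \times n$ identity. These symbols are assumptions of this sketch, and the paper's own notation (for example, in terms of $z_k$) may differ.

```latex
\begin{bmatrix} y - y^\star \\ u - u^\star \end{bmatrix}^{\top}
\begin{bmatrix} -2mL\, I_n & (L+m)\, I_n \\ (L+m)\, I_n & -2\, I_n \end{bmatrix}
\begin{bmatrix} y - y^\star \\ u - u^\star \end{bmatrix} \ge 0
\quad\Longleftrightarrow\quad
\bigl( (u - u^\star) - m (y - y^\star) \bigr)^{\top}
\bigl( L (y - y^\star) - (u - u^\star) \bigr) \ge 0 .
```

The second form follows by expanding the product and multiplying by 2. It says that the gradient increment lies in the sector between slopes $m$ and $L$.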

Figures (4)

Figure 1: $\phi$ is the nonlinear component we wish to analyze, and is replaced by the constraints it imposes on the input-output pair $(u, y)$. These are written as constraints on $z_k$.
Figure 2: The approximate number of iterations to convergence as a function of the condition number for gradient descent with a varying step size characterized by $c$.
Figure 3: The upper bound on the noise amplification metric as a function of the condition number for gradient descent with a varying step size characterized by $c$.
Figure 4: Tradeoff between noise amplification $\gamma$ and convergence rate $\rho$, based on \ref{eq:tradeoff}. A 'faster' algorithm has a larger value of the metric $\gamma$, and is thus more sensitive to noise. In this figure, the problem dimension is $n = 1$ and the strong convexity parameter is $m = 1$.

Theorems & Definitions (11)

Definition 1: Convergence Rate
Definition 2: Noise Amplification
Definition 3
Definition 4
Lemma 1: Sector IQC, LL2016
Theorem 1: Problem \ref{problem:CR}, Convergence Rate
proof
Theorem 2: Problem \ref{problem:NA}, Noise Amplification
proof
Proposition 1
...and 1 more
