A novel interpretation of Nesterov's acceleration via variable step-size linear multistep methods

Ryota Nozawa; Shun Sato; Takayasu Matsuo

TL;DR

The paper reframes Nesterov's accelerated gradient for $L$-smooth convex functions as a variable-step linear multistep method (VLM) for the gradient flow, and develops a stability-consistency framework to analyze such methods. It proves that the NAG-c discretization is stable and consistent as a two-step VLM with linearly growing step sizes, and shows that it is optimal within a natural class of VLMs using a Lyapunov-based rate analysis that yields $O\left(1/n^2\right)$. Extending the analysis to absolute stability, the authors derive stability regions and identify constraints that explain acceleration without violating lower bounds. They further propose an improved VLM tailored for ill-conditioned problems by solving a minimax optimization over method coefficients, and demonstrate through numerical experiments that the new method often outperforms NAG-c in ill-conditioned scenarios, with potential for further extensions to higher-step VLMs.

Abstract

Nesterov's acceleration in continuous optimization can be understood in a novel way when Nesterov's accelerated gradient (NAG) method is considered as a linear multistep (LM) method for gradient flow. Although the NAG method for strongly convex functions (NAG-sc) has been fully discussed, the NAG method for $L$-smooth convex functions (NAG-c) has not. To fill this gap, we show that the existing NAG-c method can be interpreted as a variable step size LM (VLM) for the gradient flow. Surprisingly, the VLM allows linearly increasing step sizes, which explains the acceleration in the convex case. Here, we introduce a novel technique for analyzing the absolute stability of VLMs. Subsequently, we prove that NAG-c is optimal in a certain natural class of VLMs. Finally, we construct a new broader class of VLMs by optimizing the parameters in the VLM for ill-conditioned problems. According to numerical experiments, the proposed method outperforms the NAG-c method in ill-conditioned cases. These results imply that the numerical analysis perspective of the NAG is a promising working environment, and considering a broader class of VLMs could further reveal novel methods.
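To make the acceleration claim concrete, the following is a minimal sketch comparing plain gradient descent with a commonly used form of the NAG-c iteration on an ill-conditioned diagonal quadratic. It is not the paper's VLM formulation or its proposed method. The momentum coefficient $(k-1)/(k+2)$, the step size $1/L$, the test problem, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: a textbook-style NAG-c iteration versus gradient
# descent on f(x) = 0.5 * sum(l_i * x_i^2). The problem, step size, and
# momentum schedule are assumptions, not taken from the paper.

def make_eigs(n=50, lo=-4.0, hi=0.0):
    # Log-spaced eigenvalues between 10**lo and 10**hi (ill-conditioned).
    return [10.0 ** (lo + (hi - lo) * i / (n - 1)) for i in range(n)]

def f(x, eigs):
    # Objective value of the diagonal quadratic; its minimum is 0 at x = 0.
    return 0.5 * sum(l * xi * xi for l, xi in zip(eigs, x))

def grad(x, eigs):
    # Gradient of the diagonal quadratic.
    return [l * xi for l, xi in zip(eigs, x)]

def gradient_descent(x0, eigs, s, iters):
    # Explicit Euler discretization of the gradient flow with fixed step s.
    x = list(x0)
    for _ in range(iters):
        g = grad(x, eigs)
        x = [xi - s * gi for xi, gi in zip(x, g)]
    return x

def nag_c(x0, eigs, s, iters):
    # Gradient step at the extrapolated point y, then momentum extrapolation.
    x_prev = list(x0)
    y = list(x0)
    for k in range(1, iters + 1):
        g = grad(y, eigs)
        x = [yi - s * gi for yi, gi in zip(y, g)]
        beta = (k - 1) / (k + 2)  # assumed momentum schedule
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        x_prev = x
    return x_prev

eigs = make_eigs()
L = max(eigs)                 # smoothness constant of the quadratic
x0 = [1.0] * len(eigs)
f_gd = f(gradient_descent(x0, eigs, 1.0 / L, 500), eigs)
f_nag = f(nag_c(x0, eigs, 1.0 / L, 500), eigs)
print(f"GD:    {f_gd:.3e}")
print(f"NAG-c: {f_nag:.3e}")
```

Both methods use the same gradient step size here; the only difference is the momentum term, whose coefficient tends to 1 as $k$ grows. The abstract's reading of this behavior as linearly increasing step sizes in a two-step VLM is the paper's contribution and is not reproduced by this sketch.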

Paper Structure (23 sections, 9 theorems, 79 equations, 13 figures)

Introduction
VLMs
Stability
Consistency
Interpretation of the NAG-c as a VLM
Consistency and zero-stability
Absolute stability
Optimality of NAG-c within a natural class of VLMs
Consistency and stability
Convergence rate by Lyapunov functions
NAG-c optimality
Towards improved methods beyond the NAG-c
Convergence analysis for Dahlquist's test equation
Proposed method
Numerical experiment
...and 8 more sections

Key Result

Lemma 2.1

Let the method def:eq:vlm be consistent of order $p\ge 0$. A companion matrix $A_n^* \in \mathbb{R}^{ (k-1) \times (k-1) }$ is defined as follows: where $\alpha^*_{k-2,n}=1+\alpha_{k-1,n}$, $\alpha_{0,n}^*=-\alpha_{0,n}$, $\alpha^*_{k-j-1,n}-\alpha^*_{k-j,n}=\alpha_{k-j,n}$ $(j=2,\dots,k-1)$. Then, the method def:eq:vlm is zero-stable if and only if the following conditions hold: where $e_{k-1} = (

Figures (13)

Figure 1: Stability region boundary for the VLM \ref{def:eq:vlm-nag} provided by the condition in \ref{thm:vlm stable region}.
Figure 2: NAG-c
Figure 3: Proposed method
Figure 5: $\frac{1}{2}x^{\mathsf{T}} H x$
Figure 6: 1D CH problem
...and 8 more figures

Theorems & Definitions (20)

Definition 2.1
Definition 2.2
Lemma 2.1: cf. HNW1987
Proposition 2.2
Definition 2.3
Theorem 3.1
proof
Theorem 3.2
proof
Definition 3.1
...and 10 more
