A novel interpretation of Nesterov's acceleration via variable step-size linear multistep methods

Ryota Nozawa, Shun Sato, Takayasu Matsuo

TL;DR

The paper reframes Nesterov's accelerated gradient for $L$-smooth convex functions as a variable-step linear multistep method (VLM) for the gradient flow, and develops a stability-consistency framework to analyze such methods. It proves that the NAG-c discretization is stable and consistent as a two-step VLM with linearly growing step sizes, and shows that it is optimal within a natural class of VLMs using a Lyapunov-based rate analysis that yields $O\left(1/n^2\right)$. Extending the analysis to absolute stability, the authors derive stability regions and identify constraints that explain acceleration without violating lower bounds. They further propose an improved VLM tailored for ill-conditioned problems by solving a minimax optimization over method coefficients, and demonstrate through numerical experiments that the new method often outperforms NAG-c in ill-conditioned scenarios, with potential for further extensions to higher-step VLMs.
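To make the "two-step method" reading concrete, the following is a short worked derivation from the textbook form of NAG-c. The fixed step $s \le 1/L$, the momentum weight $\beta_n = (n-1)/(n+2)$, and the symbols $x_n$, $y_n$ are assumptions taken from the standard presentation of the algorithm; the paper's own VLM normalization and its step-size sequence are not reproduced in this summary, so this is only a sketch of why a two-step recurrence with $n$-dependent coefficients appears.

```latex
% Standard NAG-c (assumed form, fixed step s <= 1/L):
\begin{align*}
x_{n+1} &= y_n - s\,\nabla f(y_n), \\
y_{n+1} &= x_{n+1} + \beta_{n+1}\,(x_{n+1} - x_n),
\qquad \beta_n = \frac{n-1}{n+2}.
\end{align*}
% Substituting x_{n+1} = y_n - s \nabla f(y_n) and
% x_n = y_{n-1} - s \nabla f(y_{n-1}) into the second line (n >= 1):
\begin{equation*}
y_{n+1} - (1+\beta_{n+1})\,y_n + \beta_{n+1}\,y_{n-1}
= -s\,\bigl[(1+\beta_{n+1})\,\nabla f(y_n) - \beta_{n+1}\,\nabla f(y_{n-1})\bigr].
\end{equation*}
% Frozen-coefficient characteristic polynomial of the left-hand side:
\begin{equation*}
\zeta^2 - (1+\beta_{n+1})\,\zeta + \beta_{n+1} = (\zeta - 1)(\zeta - \beta_{n+1}).
\end{equation*}
```

The left-hand side involves three consecutive iterates and the right-hand side two gradient evaluations, which is the shape of a two-step linear multistep formula applied to $\dot{x} = -\nabla f(x)$. With the coefficients frozen at index $n$, the roots are $1$ and $\beta_{n+1} \in [0, 1)$.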

Abstract

Nesterov's acceleration in continuous optimization can be understood in a novel way when Nesterov's accelerated gradient (NAG) method is considered as a linear multistep (LM) method for gradient flow. Although the NAG method for strongly convex functions (NAG-sc) has been fully discussed, the NAG method for $L$-smooth convex functions (NAG-c) has not. To fill this gap, we show that the existing NAG-c method can be interpreted as a variable step size LM (VLM) for the gradient flow. Surprisingly, the VLM allows linearly increasing step sizes, which explains the acceleration in the convex case. Here, we introduce a novel technique for analyzing the absolute stability of VLMs. Subsequently, we prove that NAG-c is optimal in a certain natural class of VLMs. Finally, we construct a new broader class of VLMs by optimizing the parameters in the VLM for ill-conditioned problems. According to numerical experiments, the proposed method outperforms the NAG-c method in ill-conditioned cases. These results imply that the numerical analysis perspective of the NAG is a promising working environment, and considering a broader class of VLMs could further reveal novel methods.
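As a minimal illustration of the acceleration discussed above, the sketch below compares plain gradient descent with the textbook NAG-c iteration on an ill-conditioned diagonal quadratic. The test function, the momentum weight $(k-1)/(k+2)$, the step $s = 1/L$, and all function names are assumptions chosen for illustration; this is not the paper's proposed VLM or its experimental setup.

```python
# Minimal sketch (assumed setup): f(x) = 0.5 * sum(h_i * x_i^2), minimum value 0 at x = 0.

def quad_value(h, x):
    """Objective value of the diagonal quadratic."""
    return 0.5 * sum(hi * xi * xi for hi, xi in zip(h, x))

def quad_grad(h, x):
    """Gradient of the diagonal quadratic."""
    return [hi * xi for hi, xi in zip(h, x)]

def gradient_descent(h, x0, step, n_iter):
    """Plain gradient descent with a fixed step."""
    x = list(x0)
    for _ in range(n_iter):
        g = quad_grad(h, x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x

def nag_c(h, x0, step, n_iter):
    """Textbook NAG-c: y_k = x_k + (k-1)/(k+2) (x_k - x_{k-1}), x_{k+1} = y_k - s grad f(y_k)."""
    x_prev = list(x0)
    x = list(x0)
    for k in range(n_iter):
        beta = max(k - 1, 0) / (k + 2)  # momentum weight; zero for k = 0 and k = 1
        y = [xi + beta * (xi - xpi) for xi, xpi in zip(x, x_prev)]
        g = quad_grad(h, y)
        x_prev = x
        x = [yi - step * gi for yi, gi in zip(y, g)]
    return x

# Ill-conditioned example: eigenvalues log-spaced in [1e-4, 1], so L = 1.
dim = 50
h = [10.0 ** (-4.0 + 4.0 * i / (dim - 1)) for i in range(dim)]
x0 = [1.0] * dim
step = 1.0      # s = 1/L
n_iter = 200

gap_gd = quad_value(h, gradient_descent(h, x0, step, n_iter))
gap_nag = quad_value(h, nag_c(h, x0, step, n_iter))
print("gradient descent gap:", gap_gd)
print("NAG-c gap:           ", gap_nag)
```

A diagonal quadratic keeps the example dependency-free and makes the conditioning explicit through the spread of the `h` values; any other $L$-smooth convex function with a known minimum could be substituted.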

Paper Structure

This paper contains 23 sections, 9 theorems, 79 equations, and 13 figures.

Key Result

Lemma 2.1

Let the method def:eq:vlm be consistent of order $p\ge 0$. A companion matrix $A_n^* \in \mathbb{R}^{(k-1) \times (k-1)}$ is defined as follows: where $\alpha^*_{k-2,n}=1+\alpha_{k-1,n}$, $\alpha_{0,n}^*=-\alpha_{0,n}$, $\alpha^*_{k-j-1,n}-\alpha^*_{k-j,n}=\alpha_{k-j,n}$ $(j=2,\dots,k-1)$. Then, the method def:eq:vlm is zero-stable if and only if the following conditions hold: where $e_{k-1} = (

Figures (13)

  • Figure 1: Stability region boundary for the VLM \ref{def:eq:vlm-nag} provided by the condition in \ref{thm:vlm stable region}.
  • Figure 2: NAG-c
  • Figure 3: Proposed method
  • Figure 5: $\frac{1}{2}x^{\mathsf{T}} H x$
  • Figure 6: 1D CH problem
  • ...and 8 more figures

Theorems & Definitions (20)

  • Definition 2.1
  • Definition 2.2
  • Lemma 2.1: cf. HNW1987
  • Proposition 2.2
  • Definition 2.3
  • Theorem 3.1
  • proof
  • Theorem 3.2
  • proof
  • Definition 3.1
  • ...and 10 more