A Family of Controllable Momentum Coefficients for Forward-Backward Accelerated Algorithms

Mingwei Fu; Bin Shi

TL;DR

This work introduces a family of controllable momentum coefficients for forward-backward accelerated methods, centered on an $\alpha$-th power momentum form with adaptive $r$ at the critical step size $s=1/L$. By designing a Lyapunov function that excludes kinetic energy and expresses energies in terms of $x_k$ and $y_k$, the authors prove a controllable $O\left(1/k^{2\alpha}\right)$ convergence for NAG-$\alpha$ when $r>2\alpha$, and extend this rate to monotone variants and proximal algorithms, including FISTA-$\alpha$ and M-FISTA-$\alpha$. At the critical step size, tuning $r$ according to $\alpha$ yields inverse-polynomial convergence of arbitrary degree, offering a tunable acceleration mechanism for a broad class of smooth and composite problems. The results bridge the gap between classical acceleration and proximal forward-backward methods, with implications for both theory and practical optimization, though the analysis relies on strong convexity and leaves open the exploration of weaker conditions.

Abstract

Nesterov's accelerated gradient method (NAG) marks a pivotal advancement in gradient-based optimization, achieving faster convergence than vanilla gradient descent for convex functions. However, its algorithmic complexity when applied to strongly convex functions remains unknown, as noted in the comprehensive review by Chambolle and Pock [2016]. This issue, aside from the critical step size, was addressed by Li et al. [2024b], with the monotonic case further explored by Fu and Shi [2024]. In this paper, we introduce a family of controllable momentum coefficients for forward-backward accelerated methods, focusing on the critical step size $s=1/L$. Unlike traditional linear forms, the proposed momentum coefficients follow an $\alpha$-th power structure, where the parameter $r$ is adaptively tuned to $\alpha$. Using a Lyapunov function specifically designed for $\alpha$, we establish a controllable $O\left(1/k^{2\alpha}\right)$ convergence rate for the NAG-$\alpha$ method, provided that $r > 2\alpha$. At the critical step size, NAG-$\alpha$ achieves an inverse polynomial convergence rate of arbitrary degree by adjusting $r$ according to $\alpha > 0$. We further simplify the Lyapunov function by expressing it in terms of the iterative sequences $x_k$ and $y_k$, eliminating the need for phase-space representations. This simplification enables us to extend the controllable $O\left(1/k^{2\alpha}\right)$ rate to the monotonic variant, M-NAG-$\alpha$, thereby enhancing optimization efficiency. Finally, by leveraging the fundamental inequality for composite functions, we extend the controllable $O\left(1/k^{2\alpha}\right)$ rate to proximal algorithms, including the fast iterative shrinkage-thresholding algorithm (FISTA-$\alpha$) and its monotonic counterpart (M-FISTA-$\alpha$).
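To make the idea of a power-form momentum coefficient concrete, here is a minimal Python sketch of a Nesterov-type iteration. The coefficient `(k / (k + r + 1)) ** alpha` is an assumption chosen for illustration: it is simply the $\alpha$-th power of the classical linear coefficient $k/(k+r+1)$, and the exact coefficient used in the paper is not stated in this summary and may differ. The function names `momentum` and `nag_alpha` and the values of `alpha`, `r`, the starting point, and the iteration count are likewise hypothetical.

```python
import numpy as np


def momentum(k, alpha, r):
    # Hypothetical alpha-th power coefficient (an assumption, not taken
    # from the paper); it reduces to k / (k + r + 1) when alpha = 1.
    return (k / (k + r + 1.0)) ** alpha


def nag_alpha(grad, x0, s, alpha, r, num_iters):
    # Nesterov-type scheme: gradient step from y_k, then extrapolation.
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    iterates = [x.copy()]
    for k in range(num_iters):
        x_next = y - s * grad(y)                       # gradient step
        y = x_next + momentum(k, alpha, r) * (x_next - x)  # momentum step
        x = x_next
        iterates.append(x.copy())
    return iterates


def f(x):
    # Quadratic test function from the Figure 1 caption.
    return 5e-3 * x[0] ** 2 + x[1] ** 2


def grad_f(x):
    return np.array([1e-2 * x[0], 2.0 * x[1]])


s = 0.5                 # critical step size 1/L for this quadratic (L = 2)
alpha, r = 2.0, 5.0     # illustrative values satisfying r > 2 * alpha
iterates = nag_alpha(grad_f, [1.0, 1.0], s, alpha, r, num_iters=300)
values = [f(x) for x in iterates]
print(values[0], values[-1])
```

Plotting `values` against the iteration index on a log-log scale is one way to compare the observed decay with the $O\left(1/k^{2\alpha}\right)$ rate, bearing in mind that the coefficient here is only a stand-in.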

Paper Structure (13 sections, 6 theorems, 37 equations, 1 figure)

Introduction
Motivation: a family of controllable momentum coefficients
Overview of contributions
Related works and organization
Preliminaries
Lyapunov analysis for forward-backward accelerated algorithms
Construction of a novel Lyapunov function
Controllable $O\left( 1/k^{2\alpha} \right)$ convergence of NAG-$\alpha$
Generalization to FISTA-$\alpha$
Monotonically controllable $O\left( 1/k^{2\alpha} \right)$ convergence
Smooth optimization via M-NAG-$\alpha$
Composite optimization via M-FISTA-$\alpha$
Conclusion and future work

Key Result

Lemma 2.3

Let $\Phi = f + g$ be a composite function with $f \in \mathcal{S}_{\mu, L}^1(\mathbb{R}^d)$ and $g \in \mathcal{F}^0(\mathbb{R}^d)$. Then, the following inequality holds for any step size $s \in (0,1/L]$:
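For orientation, one standard inequality of this kind from the proximal-gradient literature is sketched below. The notation $G_s(y)$ for the $s$-proximal subgradient and the constant in front of $\|G_s(y)\|^2$ are assumptions made here; the lemma as stated in the paper may be sharper or arranged differently.

```latex
% Assumed notation (not taken from the paper):
%   G_s(y) = \bigl( y - \mathrm{prox}_{sg}(y - s \nabla f(y)) \bigr) / s
% Holds for 0 < s \le 1/L, f mu-strongly convex and L-smooth, g convex.
\Phi\bigl(y - s\,G_s(y)\bigr) - \Phi(x)
  \;\le\; \bigl\langle G_s(y),\, y - x \bigr\rangle
  \;-\; \frac{\mu}{2}\,\|y - x\|^2
  \;-\; \frac{s}{2}\,\|G_s(y)\|^2,
\qquad \forall\, x, y \in \mathbb{R}^d .
```

Setting $x = y$ shows that a single forward-backward step decreases $\Phi$ by at least $\frac{s}{2}\|G_s(y)\|^2$.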

Figures (1)

Figure 1: Iterative progression of function values for NAG-$\alpha$ and M-NAG-$\alpha$ applied to the quadratic function $f(x_1, x_2) = 5 \times 10^{-3}x_1^2 + x_2^2$. The experiments are performed with $s = 1/L = 0.5$.
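The step size in the caption can be checked directly from the quadratic. The short sketch below computes the smoothness constant $L$ and the strong convexity modulus $\mu$ as the extreme eigenvalues of the constant Hessian; the variable names are illustrative.

```python
import numpy as np

# Hessian of f(x1, x2) = 5e-3 * x1**2 + x2**2 (constant, since f is quadratic).
hessian = np.diag([2 * 5e-3, 2.0])

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eigenvalues = np.linalg.eigvalsh(hessian)
mu, L = eigenvalues[0], eigenvalues[-1]

s = 1.0 / L                 # critical step size
condition_number = L / mu   # ratio of largest to smallest curvature

print(f"mu = {mu}, L = {L}, s = {s}, L/mu = {condition_number}")
```

The smallest eigenvalue is $\mu = 10^{-2}$ and the largest is $L = 2$, so $s = 1/L = 0.5$ matches the caption and the problem has condition number $L/\mu = 200$.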

Theorems & Definitions (11)

Definition 2.1: $s$-Proximal Value
Definition 2.2: $s$-Proximal Subgradient
Lemma 2.3: Lemma 4 in li2024linear2
Theorem 3.1
Proof of Theorem 3.1
Remark 3.2
Corollary 3.3
Theorem 3.4
Theorem 4.1
Proof of Theorem 4.1
...and 1 more
