Relaxed Proximal Point Algorithm: Tight Complexity Bounds and Acceleration without Momentum

Bofan Wang, Shiqian Ma, Junfeng Yang, Danqing Zhou

Abstract

In this paper, we focus on the relaxed proximal point algorithm (RPPA) for solving convex (possibly nonsmooth) optimization problems. We conduct a comprehensive study of three types of relaxation schedules: (i) the constant schedule with relaxation parameter $\alpha_k\equiv \alpha\in (0, \sqrt{2}]$, (ii) the dynamic schedule put forward by Teboulle and Vaisbourd [TV23], and (iii) the silver stepsize schedule proposed by Altschuler and Parrilo [AP23b]. The latter two schedules were initially investigated for the gradient descent (GD) method and are extended to the RPPA in this paper. For type (i), we establish tight non-ergodic $O(1/N)$ convergence rate results measured by function value residual and subgradient norm, where $N$ denotes the iteration counter. For type (ii), we establish a convergence rate that is tight and approximately $\sqrt{2}$ times better than that of the constant schedule of type (i). For type (iii), aside from the original silver stepsize schedule put forward by Altschuler and Parrilo, we propose two new modified silver stepsize schedules, and for all three silver stepsize schedules, $O(1/N^{1.2716})$ accelerated convergence rate results with respect to three different performance metrics are established. Furthermore, our research affirms the conjecture in [LG24, Conjecture 3.2] on the GD method with the original silver stepsize schedule.
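To make the constant schedule of type (i) concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not spell out the iteration, so the code assumes the standard textbook form $x^{k+1} = x^k + \alpha\,(z^k - x^k)$ with $z^k = \mathrm{prox}_{\lambda f}(x^k)$. The test function $f(x) = |x|$, the proximal parameter `lam`, and the names `soft_threshold` and `rppa` are all illustrative assumptions.

```python
import math


def soft_threshold(x, lam):
    """Proximal map of lam * |x| for a scalar x (soft-thresholding)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)


def rppa(x0, prox, alpha, num_iters):
    """Relaxed proximal point iteration with a constant relaxation parameter.

    Assumed update (standard form, not taken from the paper's text):
        z_k     = prox(x_k)
        x_{k+1} = x_k + alpha * (z_k - x_k)
    """
    x = x0
    for _ in range(num_iters):
        z = prox(x)              # proximal step
        x = x + alpha * (z - x)  # relaxation step
    return x


# Hypothetical usage: minimize f(x) = |x| with lam = 1, starting from x0 = 5.
x_plain = rppa(5.0, lambda x: soft_threshold(x, 1.0), 1.0, 10)
x_relaxed = rppa(5.0, lambda x: soft_threshold(x, 1.0), math.sqrt(2), 50)
```

With `alpha = 1.0` the relaxation step is the identity and the sketch reduces to the plain proximal point method. The second call uses `alpha = math.sqrt(2)`, the upper end of the range stated in the abstract.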

Paper Structure

This paper contains 13 sections, 17 theorems, 65 equations, 2 tables, 1 algorithm.

Key Result

Lemma 2.1

Let $x,y,z \in \mathbb{R}^d$, $s>0$, $\gamma \geq -s$, and $v \in \mathbb{R}$. If $\|x\|^2 \geq \|y\|^2 + s(s+\gamma)\|z\|^2 + sv$ and $v \leq 2\langle y,z\rangle -\gamma \|z\|^2$, then $v \leq \|x\|^2/(2s+\gamma)$.
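As a sanity check on the statement of Lemma 2.1, the sketch below draws random instances that satisfy both hypotheses and tests the conclusion numerically. It is an illustration under stated assumptions, not part of the paper: the function name `check_lemma_instance`, the sampling ranges, and the floating-point tolerance are all hypothetical choices. Only $\|x\|^2$ enters the lemma, so the code samples that scalar directly rather than a vector $x$.

```python
import random


def check_lemma_instance(rng, d=3):
    """Sample one instance satisfying the hypotheses of Lemma 2.1
    and return True if the conclusion v <= ||x||^2 / (2s + gamma) holds."""
    y = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    z = [rng.uniform(-1.0, 1.0) for _ in range(d)]
    s = rng.uniform(0.1, 2.0)          # s > 0
    gamma = rng.uniform(-s, 2.0)       # gamma >= -s

    yz = sum(a * b for a, b in zip(y, z))
    yy = sum(a * a for a in y)
    zz = sum(a * a for a in z)

    # Second hypothesis: v <= 2<y,z> - gamma * ||z||^2 (nonnegative slack).
    v = 2.0 * yz - gamma * zz - rng.uniform(0.0, 1.0)

    # First hypothesis: ||x||^2 >= ||y||^2 + s(s+gamma)||z||^2 + s*v.
    # The right-hand side may be negative, so clamp at 0 before adding slack.
    rhs = yy + s * (s + gamma) * zz + s * v
    x_sq = max(rhs, 0.0) + rng.uniform(0.0, 1.0)

    # Conclusion, with a small tolerance for rounding error.
    return v <= x_sq / (2.0 * s + gamma) + 1e-12


# Hypothetical usage: test 1000 independently seeded instances.
all_hold = all(check_lemma_instance(random.Random(seed)) for seed in range(1000))
```

A random check like this cannot prove the lemma. It only guards against a mis-transcribed sign or coefficient in the statement.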

Theorems & Definitions (31)

  • Lemma 2.1: Extension of TV23
  • Lemma 2.2: Collected from TV23
  • Lemma 2.3: Lower bounds for RPPA
  • proof
  • Lemma 3.1
  • proof
  • Lemma 3.2: Monotonicity of $\|z^{k}-x^{k}\|$
  • proof
  • Lemma 3.3: Double sufficient decrease
  • proof
  • ...and 21 more