Complexity-optimal and parameter-free first-order methods for finding stationary points of composite optimization problems

Weiwei Kong

TL;DR

This work addresses finding $\varepsilon$-stationary points for composite nonconvex problems of the form $\phi=f+h$ by introducing parameter-free accelerated proximal methods. The core contribution is PF.APD, complemented by PF.ACG, which operate without knowledge of curvature constants and achieve near-optimal iteration complexity in both convex and nonconvex settings. The algorithms leverage a double-loop proximal framework with adaptive curvature estimates and online checks, enabling parameter-free operation while maintaining strong theoretical guarantees. Extensions to min–max smoothing and penalty frameworks demonstrate practical versatility, and numerical experiments on QSDP, sparse recovery, and LRMC corroborate the method’s effectiveness in real problems.

Abstract

This paper develops and analyzes an accelerated proximal descent method for finding stationary points of nonconvex composite optimization problems. The objective function is of the form $f+h$ where $h$ is a proper closed convex function, $f$ is a differentiable function on the domain of $h$, and $\nabla f$ is Lipschitz continuous on the domain of $h$. The main advantage of this method is that it is "parameter-free" in the sense that it does not require knowledge of the Lipschitz constant of $\nabla f$ or of any global topological properties of $f$. It is shown that the proposed method can obtain an $\varepsilon$-approximate stationary point with iteration complexity bounds that are optimal, up to logarithmic terms over $\varepsilon$, in both the convex and nonconvex settings. Some discussion is also given about how the proposed method can be leveraged in other existing optimization frameworks, such as min-max smoothing and penalty frameworks for constrained programming, to create more specialized parameter-free methods. Finally, numerical experiments are presented to support the practical viability of the method.
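To make the "parameter-free" idea concrete, the following is a minimal sketch of a plain proximal gradient loop that estimates the curvature of $f$ by backtracking instead of taking the Lipschitz constant of $\nabla f$ as input. It is not the paper's PF.APD or PF.ACG method: there is no acceleration and no double-loop structure, so it does not attain the complexity bounds described above. The choice $h=\lambda\|\cdot\|_{1}$, the function names, the doubling/halving rule, and the test problem are all assumptions made for illustration. The returned vector `v` satisfies $v\in\nabla f(x)+\partial h(x)$, so $\|v\|\leq\varepsilon$ is one common way to certify an $\varepsilon$-approximate stationary point.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal map of t * ||.||_1 (illustrative choice of h)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def adaptive_prox_grad(f, grad_f, x0, lam, eps=1e-6, L0=1.0, max_iter=10_000):
    """Backtracking proximal gradient for f + lam * ||.||_1.

    Hypothetical helper, not the paper's algorithm: the curvature
    estimate L is adjusted online and never supplied by the user.
    Returns (x, v, iterations) with v in grad f(x) + subdiff h(x).
    """
    x = np.asarray(x0, dtype=float)
    L = L0
    v = np.full_like(x, np.inf)
    for k in range(max_iter):
        fx, g = f(x), grad_f(x)
        while True:
            x_new = soft_threshold(x - g / L, lam / L)
            d = x_new - x
            # Accept L if the quadratic upper model holds at x_new
            # (small relative slack guards against round-off).
            slack = 1e-12 * max(1.0, abs(fx))
            if f(x_new) <= fx + g @ d + 0.5 * L * (d @ d) + slack:
                break
            L *= 2.0  # estimate too small: increase and retry
        # Prox optimality gives L*(x - x_new) - g in subdiff h(x_new);
        # adding grad f(x_new) yields a stationarity residual.
        v = L * (x - x_new) - g + grad_f(x_new)
        x_prev, g_prev, L_used = x, g, L
        x = x_new
        if np.linalg.norm(v) <= eps:
            return x, v, k + 1, (x_prev, g_prev, L_used)
        L = max(L / 2.0, 1e-12)  # let the estimate shrink again
    return x, v, max_iter, (x_prev, g_prev, L_used)


# Small synthetic least-squares instance (assumed data, for illustration).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
b = rng.standard_normal(20)
lam = 0.1


def f(x):
    return 0.5 * np.sum((A @ x - b) ** 2)


def grad_f(x):
    return A.T @ (A @ x - b)


x_hat, v_hat, iters, last_step = adaptive_prox_grad(f, grad_f, np.zeros(5), lam)
```

The stopping test uses only quantities computed along the way, which mirrors the role of online checks in a parameter-free scheme: no curvature constant appears among the inputs.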

Paper Structure (21 sections, 10 theorems, 70 equations, 3 figures, 4 tables, 5 algorithms)

Key Result

Lemma 1

Given $z_{0}\in X$, let $\{(z_{k+1},u_{k+1})\}_{k\geq0}$ denote a sequence of iterates satisfying (eq:gd_incl)–(eq:gd_ineq1). Moreover, let $\Delta_{0}$ be as in (eq:optimality_residuals), and define Then, for every $k\geq0$,

Figures (3)

  • Figure 1: Plots of the minimum norm of the normalized stationarity residual $\|\bar{v}\|$ over iteration count in the QSDP experiments. The curvature pairs for the plots are $(10^{2},10^{4})$, $(10^{2},10^{5})$, and $(10^{2},10^{6})$ from left to right.
  • Figure 2: Plots of the minimum norm of the normalized stationarity residual $\|\bar{v}\|$ over iteration count in the SVR experiments. The dimensions and upper curvature $(\ell,p)$ for the plots are $(1429,900)$, $(1686,962)$, and $(24938,100)$ from left to right.
  • Figure 3: The first row presents the downscaled (80$\times$120) reference images $X$ taken from the Berkeley Segmentation Dataset, along with their image IDs (in order). The second and third rows present the results of the LRMC experiments for two of the images. Specifically, each of these rows presents (from left to right) the corrupted image $\Omega$ and the images generated by UPF, ANCF, AIPP, and APD, respectively.

Theorems & Definitions (20)

  • Lemma 1
  • Proof 1
  • Lemma 2
  • Proof 2
  • Lemma 1
  • Proposition 2
  • Proof 3
  • Lemma 3
  • Proof 4
  • Proposition 4
  • ...and 10 more