Lower Complexity Bounds of First-order Methods for Affinely Constrained Composite Non-convex Problems

Wei Liu, Qihang Lin, Yangyang Xu

TL;DR

This work establishes fundamental lower bounds on the oracle complexity of first-order methods for affinely constrained composite non-convex non-smooth problems, modeled as $\min_x F_0(x)=f_0(x)+g(x)$ subject to $A x+b=0$, with $f_0$ $L_f$-smooth and possibly non-convex and $g$ convex and possibly non-smooth. By constructing a carefully designed hard instance ${\mathcal P}$ that couples the affine constraints with a non-smooth regularizer, the authors prove that any first-order method in Algorithm Class 1 requires at least $\Omega(\kappa([\bar{A};A]) L_f \Delta_{F_0} \epsilon^{-2})$ calls to ${\rm ORACLE}_1$ to reach an $\epsilon$-stationary point, demonstrating that non-smooth regularization can materially increase problem difficulty under affine constraints. They further show a parallel lower bound for Algorithm Class 2 with ${\rm ORACLE}_2$ and, in the extended arXiv work, provide a near-matching upper bound via an IPG-based method, establishing the tightness of the $\epsilon^{-2}$ rate up to logarithmic factors. The results illuminate the critical role of the interaction between regularizers and affine constraints in non-convex optimization and point to open questions about tightening the bounds for the smoother class and broadening to nonlinear constraints.
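
For reference, the problem class described above can be written out in display form. The stationarity condition in the second line is the standard KKT-based notion for this setting and is an assumption here; the dimension $d$ and the multiplier symbol $\gamma$ are introduced only for illustration, and the paper's exact definition (including its "near" $\epsilon$-stationary variant) may differ in details.

$$
\min_{x \in \mathbb{R}^d} \; F_0(x) := f_0(x) + g(x) \quad \text{s.t.} \quad A x + b = 0,
$$

$$
\exists\, \gamma: \quad \mathrm{dist}\big(0,\; \nabla f_0(x) + \partial g(x) + A^{\top}\gamma\big) \le \epsilon, \qquad \|A x + b\| \le \epsilon .
$$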

Abstract

Many recent studies on first-order methods (FOMs) focus on composite non-convex non-smooth optimization with linear and/or nonlinear function constraints. Upper (or worst-case) complexity bounds have been established for these methods. However, little can be claimed about their optimality as no lower bound is known, except for a few special smooth non-convex cases. In this paper, we make the first attempt to establish lower complexity bounds of FOMs for solving a class of composite non-convex non-smooth optimization with linear constraints. Assuming two different first-order oracles, we establish lower complexity bounds of FOMs to produce a (near) $\epsilon$-stationary point of a problem (and its reformulation) in the considered problem class, for any given tolerance $\epsilon>0$. Our lower bounds indicate that the existence of a non-smooth convex regularizer can evidently increase the difficulty of an affinely constrained regularized problem over its nonregularized counterpart. In addition, we show that our lower bound of FOMs with the second oracle is tight, with a difference of up to a logarithmic factor from an upper complexity bound established in the extended arXiv version of this paper.

Paper Structure

This paper contains 16 sections, 17 theorems, 111 equations, 1 figure.

Key Result

Proposition 1

By the definitions in eq:xblock through eq:g, it holds

Figures (1)

  • Figure 1: Illustration of the zero-respecting sequences. Each subfigure represents one whole vector ${\mathbf{x}}$ in a matrix format, with the first column corresponding to matrix $[{\mathbf{x}}_1, {\mathbf{x}}_2, \ldots, {\mathbf{x}}_{m/3}]$, the second column to $[{\mathbf{x}}_{m/3+1}, \ldots, {\mathbf{x}}_{2m/3}]$, the last column to $[{\mathbf{x}}_{2m/3+1}, \ldots, {\mathbf{x}}_m]$, and the $i$-th row representing the row vector $[{\mathbf{x}}_1]_i, [{\mathbf{x}}_2]_i,\ldots,[{\mathbf{x}}_m]_i$. A cell is plotted white if all its elements are zero and blue otherwise. By Lemmas \ref{lem:nablaf} and \ref{lem:kktvio}, if any row is zero, then ${\mathbf{x}}$ cannot be an $\epsilon$-stationary point of instance $\mathcal{P}$. Starting from ${\mathbf{x}} = \mathbf{0}$, the figure shows how the zero elements become non-zero as oracle information is used. After the first iteration, all elements in the first row can be made non-zero according to Lemma \ref{lem:iterateguess}(1). Next, all elements in the first column of the second row can be made non-zero. As the iteration proceeds, in the second column of the second row, the element $[{\mathbf{x}}_{m/3+1}]_2$ is the first to become non-zero, through the operator ${\mathbf{A}}{\mathbf{A}}^{\top}(\cdot)$. Then the operator $\text{prox}_{\eta g}$ makes the next element $[{\mathbf{x}}_{m/3+2}]_2$ non-zero, followed by the operator ${\mathbf{A}}{\mathbf{A}}^{\top}(\cdot)$, which makes $[{\mathbf{x}}_{m/3+3}]_2$ non-zero, and so on. At least $m/6$ iterations (i.e., oracle calls) are needed to make the entire second column of the second row non-zero. Then, under the action of the operators $\text{prox}_{\eta g}$ and ${\mathbf{A}}{\mathbf{A}}^{\top}(\cdot)$, the element $[{\mathbf{x}}_{2m/3+1}]_2$ can become non-zero; by Lemma \ref{lem:iterateguess}(3), $[{\mathbf{x}}_{2m/3+1}]_3$ becomes non-zero by using $\nabla f_0$; this process continues.
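
The propagation described in the caption of Figure 1 relies on a zero-chain mechanism: one application of a coupling operator can make at most one new coordinate non-zero. The toy sketch below illustrates only that mechanism, using a tridiagonal operator $AA^{\top}$ built from a simple difference matrix. It is not the paper's instance $\mathcal{P}$; the names `difference_matrix`, `last_nonzero`, `support_growth`, and the step size `eta` are hypothetical and chosen purely for illustration.

```python
import numpy as np


def difference_matrix(m):
    """Toy m x (m+1) matrix with rows e_i - e_{i+1} (an assumption, not the paper's A)."""
    A = np.zeros((m, m + 1))
    for i in range(m):
        A[i, i] = 1.0
        A[i, i + 1] = -1.0
    return A


def last_nonzero(v):
    """Index of the last non-zero entry of v, or -1 if v is all zeros."""
    idx = np.flatnonzero(v)
    return int(idx[-1]) if idx.size else -1


def support_growth(m, steps, eta=0.25):
    """Track how far the support of x reaches under x <- x - eta * (A A^T) x."""
    A = difference_matrix(m)
    M = A @ A.T              # m x m tridiagonal: 2 on the diagonal, -1 next to it
    x = np.zeros(m)
    x[0] = 1.0               # only the first coordinate is non-zero at the start
    history = [last_nonzero(x)]
    for _ in range(steps):
        x = x - eta * (M @ x)   # one application of the coupling operator
        history.append(last_nonzero(x))
    return history


if __name__ == "__main__":
    # Each step extends the support by one coordinate until the end is reached.
    print(support_growth(m=8, steps=5))
```

Because the operator is tridiagonal, reaching the last of $m$ coordinates from the first takes $m-1$ applications in this toy setting, which is the kind of counting argument that underlies the iteration lower bound sketched in the caption.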

Theorems & Definitions (22)

  • Claim 1
  • Definition 1
  • Definition 2: instance ${\mathcal{P}}$
  • Proposition 1
  • Lemma 1
  • Lemma 2
  • Proposition 2
  • Lemma 3
  • Lemma 4
  • Lemma 5
  • ...and 12 more