Damped Proximal Augmented Lagrangian Method for weakly-Convex Problems with Convex Constraints

Hari Dahal, Wei Liu, Yangyang Xu

TL;DR

This work develops DPALM, a damped dual-step proximal augmented Lagrangian method for non-convex problems with a weakly convex objective and convex constraints. It proves that a near $\varepsilon$-KKT point is attainable in $\tilde{O}(\varepsilon^{-2})$ outer iterations, with case-specific overall complexities: $\tilde{O}(\varepsilon^{-2.5})$ for smooth objectives using APG, and $\tilde{O}(\varepsilon^{-3})$ for compositional objectives via Moreau-envelope smoothing. For the general weakly-convex case, the $\tilde{O}(\varepsilon^{-2})$ outer-iteration bound holds as a framework in which the required subproblem accuracy depends on the chosen solver. Numerical experiments on LCQP, QCQP, robust nonlinear least squares, and ROC-based fairness demonstrate DPALM’s empirical efficiency: it outperforms several state-of-the-art methods in gradient evaluations and runtime.
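
As a rough illustration of the inner solver mentioned above, the following is a minimal sketch of an accelerated proximal gradient (APG) iteration on a strongly convex composite problem, the kind of subproblem that a proximal term produces. The constant-momentum variant, the separable quadratic, the $\ell_1$ regularizer, and the names `apg` and `soft_threshold` are all assumptions made for this example; the paper's actual solver and stopping rule may differ.

```python
import math

def soft_threshold(v, t):
    """Proximal operator of t * |.| applied to a scalar."""
    return math.copysign(max(abs(v) - t, 0.0), v)

def apg(grad_f, prox_g, x0, L, mu, iters=200):
    """Constant-momentum APG for min f(x) + g(x),
    with f L-smooth and mu-strongly convex (assumed setting)."""
    kappa = L / mu
    momentum = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
    x = list(x0)
    y = list(x0)
    for _ in range(iters):
        g = grad_f(y)
        # Proximal gradient step taken at the extrapolated point y.
        x_new = prox_g([yi - gi / L for yi, gi in zip(y, g)], 1.0 / L)
        # Extrapolation with a fixed momentum coefficient.
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_new, x)]
        x = x_new
    return x

# Hypothetical test problem: f(x) = sum(0.5 * d_i * x_i^2 - b_i * x_i),
# g(x) = lam * ||x||_1.
d = [1.0, 2.0, 4.0]
b = [3.0, -0.5, 2.0]
lam = 1.0

x_apg = apg(
    grad_f=lambda x: [di * xi - bi for di, xi, bi in zip(d, x, b)],
    prox_g=lambda v, t: [soft_threshold(vi, lam * t) for vi in v],
    x0=[0.0, 0.0, 0.0],
    L=max(d),
    mu=min(d),
)

# Closed-form minimizer of this separable problem, used only as a check.
x_closed = [soft_threshold(bi, lam) / di for bi, di in zip(b, d)]
```

The problem is separable, so the closed-form minimizer provides an independent check on the iterates.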

Abstract

We propose a damped proximal augmented Lagrangian method (DPALM) for solving problems with a weakly-convex objective and convex linear/nonlinear constraints. Instead of taking a full stepsize, DPALM adopts a damped dual stepsize to ensure the boundedness of dual iterates. We show that DPALM can produce a (near) $\varepsilon$-KKT point within $O(\varepsilon^{-2})$ outer iterations if each DPALM subproblem is solved to a proper accuracy. In addition, we establish the overall iteration complexity of DPALM when the objective is either a regularized smooth function or in a regularized compositional form. For the former case, DPALM achieves a complexity of $\widetilde{\mathcal{O}}\left(\varepsilon^{-2.5}\right)$ to produce an $\varepsilon$-KKT point by applying an accelerated proximal gradient (APG) method to each DPALM subproblem. For the latter case, the complexity of DPALM is $\widetilde{\mathcal{O}}\left(\varepsilon^{-3}\right)$ to produce a near $\varepsilon$-KKT point by using APG to solve a Moreau-envelope smoothed version of each subproblem. Our outer iteration complexity and overall complexity either generalize the best existing results from unconstrained or linearly constrained problems to convex-constrained ones, or improve over the best-known results for problems with the same structure. Furthermore, numerical experiments on linearly/quadratically constrained non-convex quadratic programs and linearly constrained robust nonlinear least squares demonstrate the empirical efficiency of the proposed DPALM over several state-of-the-art methods.
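
To make the idea of a damped dual stepsize concrete, here is a minimal sketch under an assumed damping rule: the dual step uses $\alpha_k = \min\{\beta_k, v_k / \|r^{k+1}\|\}$, where $r^{k+1}$ is the constraint residual and $\{v_k\}$ is a summable sequence. This rule, the choice $v_k = v_0/(k+1)^2$, the residual values, and the function name `damped_dual_update` are illustrative assumptions and are not taken from the paper. Under this rule each dual step moves by at most $v_k$, so the dual iterates stay within $\|y^0\| + \sum_k v_k$.

```python
import math

def damped_dual_update(y, residual, beta, v):
    """One dual step y + alpha * residual, alpha = min(beta, v / ||residual||).

    Assumed damping rule: it caps the dual movement ||alpha * residual|| at v.
    """
    norm_r = math.sqrt(sum(r * r for r in residual))
    alpha = beta if norm_r == 0.0 else min(beta, v / norm_r)
    y_next = [yi + alpha * ri for yi, ri in zip(y, residual)]
    return y_next, alpha

# Hypothetical constraint residuals from three outer iterations.
residuals = [[3.0, -4.0], [0.3, 0.4], [0.0, 0.01]]
beta = 10.0      # penalty parameter, which is also the undamped ("full") dual step
v0 = 2.0         # damping budget parameter (assumed)

y = [0.0, 0.0]
alphas = []
budget = 0.0
for k, r in enumerate(residuals):
    v_k = v0 / (k + 1) ** 2          # summable sequence, an assumed choice
    y, alpha = damped_dual_update(y, r, beta, v_k)
    alphas.append(alpha)
    budget += v_k

dual_norm = math.sqrt(sum(yi * yi for yi in y))
```

In this toy run the first step is damped because the residual is large, while the last step reverts to the full stepsize `beta` once the residual is small.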

Paper Structure

This paper contains 22 sections, 21 theorems, 137 equations, 5 figures, 3 tables, 2 algorithms.

Key Result

Lemma

Under Assumptions Assump1--Assump5, let $\{{\mathbf{x}}^k, {\mathbf{y}}^k,{\mathbf{z}}^k\}$ be generated by Alg. alg:ialm such that eq:approx-cond holds for each $k\ge0$. If $\sum_{k\ge0} \beta_k \delta_k < \infty$, then where

Figures (5)

  • Figure 1: Left: the number of gradient evaluations by DPALM with different values of $v_0$ on 10 independent random instances of \ref{problem: qp} with $n=10$, $d=100$, and $\rho=1$. Middle and right: values of the adopted dual stepsize $\alpha_k$ and $\beta_k$ on instance #1 and instance #9.
  • Figure 2: Violation of the primal feasibility (PF) and dual feasibility (DF) conditions after each outer iteration vs. number of gradient evaluations for the proposed DPALM, the HiAPeM method in [li2021augmented], LiMEAL in [zeng2022moreau], and the NL-IAPIAL method in [kong2022iteration] on instances of \ref{problem: qp} with weak convexity constants $\rho\in \{0.1, 1, 10\}$.
  • Figure 3: Evolution of the $\beta$-values for DPALM and NL-IAPIAL [kong2022iteration] on LCQP instances after each outer iteration.
  • Figure 4: Violation of the primal feasibility (PF), dual feasibility (DF), and complementary slackness (CS) conditions vs. number of gradient evaluations for the proposed DPALM, the HiAPeM method in [li2021augmented], and the NL-IAPIAL and IPL(A) methods in [kong2022iteration] on instances of \ref{problem: qcqp} with weak convexity constants $\rho\in \{0.1, 1, 10\}$.
  • Figure 5: Violation of the PF and DF conditions after each outer iteration vs. number of gradient evaluations for the proposed DPALM and the Prox-Linear method in [drusvyatskiy2019efficiency] on an instance of \ref{problem:l-rnls}, and on an instance of \ref{exp4: main} with the a9a [kohavi1996scaling] and COMPAS [angwin2022machine] datasets.
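
The PF, DF, and CS quantities named in the captions can be sketched for a smooth problem $\min f(x)$ s.t. $g_i(x)\le 0$ as follows. The formulas below are common textbook definitions and are an assumption here; the paper's exact (near) $\varepsilon$-KKT measures, in particular for nonsmooth regularizers, may be defined or scaled differently. The toy problem and the names `kkt_violations` and `evaluate` are hypothetical.

```python
import math

def kkt_violations(grad_f, g_vals, g_grads, z):
    """Return (PF, DF, CS) for min f(x) s.t. g_i(x) <= 0 at a primal-dual pair.

    grad_f  : gradient of f at x (list of floats)
    g_vals  : constraint values g_i(x)
    g_grads : gradient of each g_i at x (list of lists)
    z       : nonnegative multipliers
    """
    # Primal feasibility: norm of the positive part of g(x).
    pf = math.sqrt(sum(max(gi, 0.0) ** 2 for gi in g_vals))
    # Dual feasibility: norm of grad f(x) + sum_i z_i * grad g_i(x).
    stationarity = list(grad_f)
    for zi, gg in zip(z, g_grads):
        stationarity = [s + zi * c for s, c in zip(stationarity, gg)]
    df = math.sqrt(sum(s * s for s in stationarity))
    # Complementary slackness: sum_i |z_i * g_i(x)|.
    cs = sum(abs(zi * gi) for zi, gi in zip(z, g_vals))
    return pf, df, cs

def evaluate(x, z):
    """Toy non-convex problem: f(x) = -0.5*x1^2 + x2, g(x) = x1^2 + x2^2 - 1."""
    grad_f = [-x[0], 1.0]
    g_vals = [x[0] ** 2 + x[1] ** 2 - 1.0]
    g_grads = [[2.0 * x[0], 2.0 * x[1]]]
    return kkt_violations(grad_f, g_vals, g_grads, z)

at_kkt = evaluate([0.0, -1.0], [0.5])    # a KKT pair of the toy problem
off_kkt = evaluate([1.0, 1.0], [0.5])    # an infeasible, non-stationary point
```

At the pair $x=(0,-1)$, $z=0.5$ all three measures vanish, whereas the second point violates each condition.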

Theorems & Definitions (42)

  • Remark: approximate solution ${\mathbf{x}}^{k+1}$
  • Definition: Weakly convex function and Moreau envelope [drusvyatskiy2019efficiency]
  • Definition: Subdifferential of weakly convex functions
  • Definition: (near) $\varepsilon$-KKT point
  • Lemma
  • Lemma
  • Lemma
  • Proof
  • Lemma
  • Remark
  • ...and 32 more
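
As a small illustration of the Moreau envelope listed among the definitions, the sketch below evaluates the envelope of the absolute value function through its proximal point and cross-checks it against the known closed form (the Huber function). The choice of $|\cdot|$, which is convex and hence weakly convex with constant 0, the parameter `lam`, and all function names are assumptions made for this example and do not come from the paper.

```python
import math

def prox_abs(x, lam):
    """Minimizer of |u| + (u - x)**2 / (2 * lam), i.e. soft-thresholding."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def moreau_envelope_abs(x, lam):
    """Moreau envelope of |.| evaluated through its proximal point."""
    u = prox_abs(x, lam)
    return abs(u) + (u - x) ** 2 / (2.0 * lam)

def moreau_gradient_abs(x, lam):
    """Gradient of the envelope: (x - prox(x)) / lam."""
    return (x - prox_abs(x, lam)) / lam

def huber(x, lam):
    """Closed form of the same envelope, used here only as a cross-check."""
    return x * x / (2.0 * lam) if abs(x) <= lam else abs(x) - lam / 2.0

lam = 0.5
points = [-2.0, -0.3, 0.0, 0.1, 1.5]
envelope = [moreau_envelope_abs(x, lam) for x in points]
reference = [huber(x, lam) for x in points]
```

The envelope is differentiable everywhere, including at the kink of $|x|$, which is what makes gradient-based solvers such as APG applicable after smoothing.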