Solving Convex Smooth Function Constrained Optimization Is Almost As Easy As Unconstrained Optimization

Zhe Zhang; Guanghui Lan

TL;DR

This work tackles smooth function-constrained optimization, where the objective $F(x)=f(x)+u(x)$ is minimized over a convex set subject to inequality constraints $g(x)\le 0$. It introduces Accelerated Constrained Gradient Descent (ACGD), a single-loop method that replaces the standard descent step with a constrained descent step derived from a nested Lagrangian, and extends it to ACGD-S for large-scale problems using a sliding technique. The authors establish matching lower bounds and provide adaptive, parameter-free variants that use verifiable certificates (FP-gap and PD-gap) to tune Lipschitz-related parameters automatically, achieving near-optimal oracle and computation complexities. Together, these results offer a near-complete characterization of the hardness of smooth function-constrained optimization and demonstrate practical viability for high-dimensional problems with many constraints.

Abstract

While Nesterov's Accelerated Gradient Descent (AGD) efficiently solves constrained problems when the constraint set $X \subseteq \mathbb{R}^n$ is simple and easy to project onto, it has remained an open question whether function-constrained problems $\min_{x \in X} \{F(x) : g(x) \leq 0\}$ can be solved as efficiently as unconstrained problems in terms of oracle complexity. We provide an affirmative answer by proposing the Accelerated Constrained Gradient Descent (ACGD) method, a single-loop algorithm that modifies AGD by replacing the descent step with a constrained descent step, which adds only a few linear constraints to the prox mapping. ACGD achieves nearly the same oracle complexity as minimizing the optimal Lagrangian function (with the multiplier fixed at its optimal value). We establish matching lower bounds, demonstrating that these complexity results are unimprovable. For large-scale problems with many constraints, we introduce ACGD-S, which replaces the computationally demanding constrained descent step with basic matrix-vector multiplications while maintaining optimal oracle and computation complexities. Together, these methods provide a nearly complete characterization of the hardness of smooth function-constrained optimization. We also propose parameter-free adaptive versions that achieve optimal oracle complexity (requiring only the strong convexity modulus) and present encouraging numerical results demonstrating their efficiency.
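To make "adding a few linear constraints to the prox mapping" concrete, here is a minimal sketch of a single constrained prox step. It is not the ACGD method itself: it assumes $X = \mathbb{R}^n$, $u = 0$, a Euclidean prox term, and one linearized constraint, in which case the step reduces to a gradient step followed by a projection onto a half-space. The function name `constrained_prox_step` and all of its parameters are hypothetical and chosen only for illustration.

```python
def constrained_prox_step(x_prev, grad_f, eta, x_bar, g_val, grad_g):
    """Illustrative only (assumes X = R^n, one constraint, Euclidean prox).

    Solves   min_x  <grad_f, x> + (eta / 2) * ||x - x_prev||^2
             s.t.   g_val + <grad_g, x - x_bar> <= 0,
    where g_val and grad_g are the value and gradient of g at x_bar.
    """
    # Without the constraint, the minimizer is a gradient step of length 1/eta.
    y = [xp - gf / eta for xp, gf in zip(x_prev, grad_f)]

    # Value of the linearized constraint at y (positive means violated).
    viol = g_val + sum(a * (yi - xb) for a, yi, xb in zip(grad_g, y, x_bar))
    if viol <= 0:
        return y

    norm_sq = sum(a * a for a in grad_g)
    if norm_sq == 0:
        raise ValueError("linearized constraint is infeasible")

    # The objective is a shifted squared distance to y, so the constrained
    # minimizer is the Euclidean projection of y onto the half-space.
    step = viol / norm_sq
    return [yi - step * a for yi, a in zip(y, grad_g)]


# Toy example: constraint g(x) = x_1 - 1 <= 0, linearized at x_bar = (0, 0).
x_next = constrained_prox_step(
    x_prev=[0.0, 0.0], grad_f=[-2.0, 0.0], eta=1.0,
    x_bar=[0.0, 0.0], g_val=-1.0, grad_g=[1.0, 0.0],
)
print(x_next)
```

In the toy example the unconstrained step would move to $(2, 0)$, which violates $x_1 \le 1$, so the projection pulls the iterate back to the boundary. With several constraints or a nontrivial set $X$, the step becomes a small quadratic program rather than a closed-form projection.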

Paper Structure (6 sections, 5 theorems, 42 equations, 2 tables, 1 algorithm)

Introduction
Notations & Assumptions
The Accelerated Constrained Gradient Descent Method
The ACGD method
The Convergence Results
Convergence Analysis of the ACGD Method

Key Result

Lemma

Let $z^t=(x^{t}; \lambda^{t}, \nu^{t}, \pi^{t}) \in Z$ be given, and let $\Lambda_{r}$ denote a certain set of reference $\lambda$'s. If $\max_{\lambda \in \Lambda_{r}, (\nu,\pi) \in [V, \Pi]}Q(z^t; (x^*; \lambda, \nu, \pi)) \leq \epsilon$, then $F(x^{t}) - F(x^*) \leq \epsilon$ and $\left\lVert [g(x^{t})]_+ \right\rVert \leq \epsilon/r$.
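The conclusion of the lemma is a pair of inequalities: an optimality gap of at most $\epsilon$ and a constraint violation of at most $\epsilon/r$. The sketch below checks those two conditions for a candidate point. It assumes the norm is Euclidean (the source leaves the norm unspecified) and that the optimal value $F(x^*)$ is known, which is rarely true in practice; the helper names `positive_part_norm` and `is_eps_solution` are hypothetical.

```python
import math


def positive_part_norm(g_values):
    """Euclidean norm of [g(x)]_+, the componentwise positive part of g(x)."""
    return math.sqrt(sum(max(v, 0.0) ** 2 for v in g_values))


def is_eps_solution(F_x, F_star, g_values, eps, r):
    """Check F(x) - F(x*) <= eps and ||[g(x)]_+|| <= eps / r.

    F_star is assumed to be known here, for illustration only.
    """
    gap_ok = (F_x - F_star) <= eps
    feas_ok = positive_part_norm(g_values) <= eps / r
    return gap_ok and feas_ok


# Satisfied constraints (negative entries) do not contribute to the violation.
violation = positive_part_norm([3.0, -1.0, 4.0])
accepted = is_eps_solution(F_x=1.05, F_star=1.0, g_values=[0.01, -2.0], eps=0.1, r=2.0)
rejected = is_eps_solution(F_x=1.05, F_star=1.0, g_values=[0.2, -2.0], eps=0.1, r=2.0)
```

A larger $r$ tightens the feasibility tolerance $\epsilon/r$ relative to the optimality tolerance $\epsilon$.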

Theorems & Definitions (6)

Lemma
Proof
Proposition
Theorem 1
Corollary
Corollary
