Adaptive directional decomposition methods for nonconvex constrained optimization
Qiankun Shi, Xiao Wang
TL;DR
The paper addresses nonconvex constrained optimization with both equality and inequality constraints, in deterministic and stochastic settings. It introduces an adaptive directional decomposition framework that splits each step into a tangent-space component that reduces the objective and a normal-space component that reduces constraint violation, with adaptively updated merit parameters and stepsizes. The deterministic method attains an $O(\epsilon^{-2})$ iteration complexity for reaching an $\epsilon$-KKT point under strong LICQ. The stochastic variants achieve high-probability oracle complexities for gradient and constraint evaluations of $\tilde O(\epsilon^{-4},\epsilon^{-6})$ (mini-batch, without sample-wise smoothness) and $\tilde O(\epsilon^{-3},\epsilon^{-5})$ (recursive momentum, with sample-wise smoothness), which are comparable to or better than the state of the art. Numerical experiments on CUTEst demonstrate the framework's effectiveness and give insight into the impact of the user-defined mapping $A$ and of variance-reduction techniques. The framework extends to inequality-constrained problems via a subproblem that enforces descent in the linearized constraints. Overall, the approach provides a unified first-order methodology with theoretical guarantees and good practical performance for a broad class of nonconvex constrained optimization problems.
Abstract
In this paper, we study nonconvex constrained optimization problems with both equality and inequality constraints, covering deterministic and stochastic settings. We propose a novel first-order algorithmic framework that employs a decomposition strategy to balance objective reduction and constraint satisfaction, together with adaptive updates of stepsizes and merit parameters. Under certain conditions, the proposed adaptive directional decomposition methods attain an iteration complexity of order \(O(\epsilon^{-2})\) for finding an \(\epsilon\)-KKT point in the deterministic setting. In the stochastic setting, we further develop stochastic variants of these methods and analyze their theoretical properties by leveraging perturbation theory. We establish high-probability oracle complexities for finding an \(\epsilon\)-KKT point of order \(\tilde O(\epsilon^{-4}, \epsilon^{-6})\) (resp. \(\tilde O(\epsilon^{-3}, \epsilon^{-5})\)) for gradient and constraint evaluations, in the absence (resp. presence) of sample-wise smoothness. To the best of our knowledge, the obtained complexity bounds are comparable to, or improve upon, the state-of-the-art results in the literature.
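To make the decomposition idea concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a single equality constraint, uses the orthogonal projection onto the null space of the constraint gradient as one possible choice of tangent-space mapping, and uses fixed stepsizes `alpha` and `beta` in place of the adaptive stepsize and merit-parameter rules. The toy problem (minimize \(x_0 + x_1\) subject to \(x_0^2 + x_1^2 = 2\)) and the names `decompose_direction` and `run_sketch` are illustrative assumptions.

```python
def decompose_direction(grad_f, c_val, grad_c):
    """Split a step into tangent and normal parts for one constraint c(x) = 0.

    Illustrative assumption: orthogonal projection is used as the
    tangent-space mapping; the paper allows a user-defined mapping.
    """
    gc_sq = sum(g * g for g in grad_c)
    if gc_sq == 0.0:
        raise ValueError("constraint gradient must be nonzero (LICQ-type assumption)")
    # Normal part: least-norm step that cancels the linearized violation,
    # i.e. c_val + grad_c . normal = 0.
    normal = [-c_val * g / gc_sq for g in grad_c]
    # Tangent part: negative objective gradient with its component along
    # grad_c removed, so grad_c . tangent = 0.
    coef = sum(a * b for a, b in zip(grad_f, grad_c)) / gc_sq
    tangent = [-(a - coef * b) for a, b in zip(grad_f, grad_c)]
    return tangent, normal


def run_sketch(x0, alpha=0.1, beta=1.0, iters=200):
    """Iterate x <- x + alpha * tangent + beta * normal on a toy problem.

    Toy problem (hypothetical): minimize x0 + x1 subject to
    x0^2 + x1^2 - 2 = 0. Stepsizes are fixed here, not adaptive.
    """
    x = list(x0)
    for _ in range(iters):
        grad_f = [1.0, 1.0]                        # gradient of x0 + x1
        c_val = x[0] ** 2 + x[1] ** 2 - 2.0        # constraint value
        grad_c = [2.0 * x[0], 2.0 * x[1]]          # constraint gradient
        tangent, normal = decompose_direction(grad_f, c_val, grad_c)
        x = [xi + alpha * t + beta * n for xi, t, n in zip(x, tangent, normal)]
    return x
```

Calling `run_sketch([1.0, -1.0])` should move the iterate around the circle toward the constrained minimizer \((-1, -1)\), where both the tangent and normal parts vanish. The tangent part never changes the linearized constraint, which is why the two stepsizes can be chosen separately in this sketch.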
