High Probability Complexity Bounds of Trust-Region Stochastic Sequential Quadratic Programming with Heavy-Tailed Noise
Yuchen Fang, Javad Lavaei, Sen Na
TL;DR
This work tackles constrained stochastic optimization with stochastic objective values and deterministic equality constraints. It introduces a trust-region stochastic sequential quadratic programming (TR-SSQP) method that uses probabilistic zeroth-, first-, and second-order oracles to estimate function values, gradients, and Hessians under irreducible, heavy-tailed noise. The authors prove high-probability iteration bounds: $\mathcal{O}(\varepsilon^{-2})$ iterations to obtain a first-order $\varepsilon$-stationary point and $\mathcal{O}(\varepsilon^{-3})$ iterations for a second-order $\varepsilon$-stationary point, with refinements for heavy-tailed and sub-exponential zeroth-order noise. They also analyze sample complexities and demonstrate finite-time almost-sure convergence when the noise moment parameter $\delta>1$. Numerical experiments on the CUTEst set corroborate the theory and highlight the benefits of Hessian-aware variants in practice.
Abstract
In this paper, we consider nonlinear optimization problems with a stochastic objective and deterministic equality constraints. We propose a Trust-Region Stochastic Sequential Quadratic Programming (TR-SSQP) method and establish its high-probability iteration complexity bounds for identifying first- and second-order $\varepsilon$-stationary points. In our algorithm, we assume that exact objective values, gradients, and Hessians are not directly accessible but can be estimated via zeroth-, first-, and second-order probabilistic oracles. Compared to existing complexity studies of SSQP methods that rely on a zeroth-order oracle with sub-exponential (i.e., light-tailed) noise and focus mostly on first-order stationarity, our analysis accommodates irreducible and heavy-tailed noise in the zeroth-order oracle and significantly extends the analysis to second-order stationarity. We show that under heavy-tailed noise conditions, our SSQP method achieves the same high-probability first-order iteration complexity bounds as in the light-tailed noise setting, while further exhibiting promising second-order iteration complexity bounds. Specifically, the method identifies a first-order $\varepsilon$-stationary point in $\mathcal{O}(\varepsilon^{-2})$ iterations and a second-order $\varepsilon$-stationary point in $\mathcal{O}(\varepsilon^{-3})$ iterations with high probability, provided that $\varepsilon$ is lower bounded by a constant determined by the irreducible noise level in estimation. We validate our theoretical findings and evaluate the practical performance of our method on the CUTEst benchmark test set.
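To make the notion of a zeroth-order probabilistic oracle with irreducible, heavy-tailed noise more concrete, the following minimal Python sketch may help. It is only an illustration under stated assumptions, not the oracle defined in the paper: the names `heavy_tailed_noise`, `zeroth_order_oracle`, and `objective`, the symmetric Pareto-type noise model, and the additive `bias` term standing in for irreducible error are all hypothetical choices made here.

```python
import random


def heavy_tailed_noise(rng, tail_index=1.5, scale=1.0):
    """Symmetric Pareto-type noise (an assumed model, for illustration only).

    For 1 < tail_index <= 2 the noise has a finite mean but infinite variance.
    """
    magnitude = rng.paretovariate(tail_index) - 1.0  # shift support to start at 0
    sign = 1.0 if rng.random() < 0.5 else -1.0       # symmetrize around zero
    return scale * sign * magnitude


def zeroth_order_oracle(f, x, rng, n_samples=1, tail_index=1.5, scale=1.0, bias=0.0):
    """Estimate f(x) by averaging n_samples noisy evaluations.

    `bias` is a hypothetical stand-in for an irreducible error: it is added
    after averaging, so increasing n_samples cannot remove it.
    """
    total = 0.0
    for _ in range(n_samples):
        total += f(x) + heavy_tailed_noise(rng, tail_index, scale)
    return total / n_samples + bias


def objective(x):
    """Toy deterministic objective used only for this sketch."""
    return sum(xi * xi for xi in x)


if __name__ == "__main__":
    rng = random.Random(0)
    x = [1.0, -2.0]  # objective(x) is 5.0
    for n in (1, 100, 10000):
        print(n, zeroth_order_oracle(objective, x, rng, n_samples=n, bias=0.01))
```

In this sketch, averaging more samples reduces the random part of the error only slowly because the noise has infinite variance, while the `bias` term persists regardless of sample size. This mirrors, in a simplified way, why the stated bounds require $\varepsilon$ to be lower bounded by a constant tied to the irreducible noise level.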
