Statistical Inference of Constrained Stochastic Optimization via Sketched Sequential Quadratic Programming
Sen Na, Michael W. Mahoney
TL;DR
This work develops AI-StoSQP, an online, projection-free approach to constrained stochastic nonlinear optimization that combines a sketching-based inexact Newton step with an adaptive stepsize rule. The authors prove global almost-sure convergence of the method and establish local asymptotic normality of the joint primal-dual iterates: after proper rescaling, the error converges to a mean-zero Gaussian whose covariance depends on the sketching distribution. They introduce a plug-in covariance estimator for online uncertainty quantification and demonstrate the method on benchmark nonlinear problems (CUTEst) and constrained regression tasks, highlighting the trade-offs between exact and inexact Newton solves. The results show that online inference with StoSQP is feasible and informative for constrained estimation under streaming data: the asymptotic covariance approaches minimax-optimal limits when the Newton system is solved exactly and is inflated in a controlled way when sketched Newton steps are used. Overall, the paper provides a principled framework for uncertainty quantification in online constrained optimization and offers guidance on the use of sketching in second-order online methods for scalable inference.
Abstract
We consider online statistical inference of constrained stochastic nonlinear optimization problems. We solve these problems with the Stochastic Sequential Quadratic Programming (StoSQP) method, which can be regarded as applying second-order Newton's method to the Karush-Kuhn-Tucker (KKT) conditions. In each iteration, the StoSQP method computes the Newton direction by solving a quadratic program, and then selects a proper adaptive stepsize $\bar{\alpha}_t$ to update the primal-dual iterate. To reduce the dominant computational cost of the method, we inexactly solve the quadratic program in each iteration by employing an iterative sketching solver. Notably, the approximation error of the sketching solver need not vanish as iterations proceed, meaning that the per-iteration computational cost does not blow up. For the above StoSQP method, we show that under mild assumptions, the rescaled primal-dual sequence $1/\sqrt{\bar{\alpha}_t}\cdot (x_t - x^\star, \lambda_t - \lambda^\star)$ converges to a mean-zero Gaussian distribution with a nontrivial covariance matrix depending on the underlying sketching distribution. To perform inference in practice, we also analyze a plug-in covariance matrix estimator. We illustrate the asymptotic normality result of the method both on benchmark nonlinear problems in the CUTEst test set and on linearly/nonlinearly constrained regression problems.
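To make the iteration described above concrete, the following is a minimal sketch on a toy problem, not the authors' implementation. It assumes a hypothetical objective $\min_x \mathbb{E}[\tfrac12\|x-\xi\|^2]$ with $\xi \sim N(\mu, I)$ subject to one linear constraint $a^\top x = b$. A randomized Kaczmarz solver run for a fixed number of steps stands in for the iterative sketching solver, and a deterministic decaying stepsize stands in for the adaptive rule $\bar{\alpha}_t$. The names `kaczmarz` and `sto_sqp`, the exponent 0.75, and all default values are illustrative assumptions.

```python
import numpy as np

def kaczmarz(K, r, steps, rng):
    """Approximately solve K z = r with a fixed number of randomized
    Kaczmarz steps (a row-sampling sketch); the error need not vanish."""
    z = np.zeros(K.shape[1])
    for _ in range(steps):
        i = rng.integers(K.shape[0])           # sample one row uniformly
        row = K[i]
        z += (r[i] - row @ z) / (row @ row) * row  # project onto that row's equation
    return z

def sto_sqp(mu, a, b, iters=5000, steps=20, seed=0):
    """Toy StoSQP loop: min E[0.5*||x - xi||^2] s.t. a^T x = b, xi ~ N(mu, I)."""
    rng = np.random.default_rng(seed)
    n = mu.size
    x, lam = np.zeros(n), 0.0
    # KKT matrix [[H, G^T], [G, 0]] with Hessian H = I and Jacobian G = a^T.
    K = np.block([[np.eye(n), a[:, None]],
                  [a[None, :], np.zeros((1, 1))]])
    for t in range(iters):
        xi = mu + rng.standard_normal(n)       # one streaming sample
        grad_lagr = (x - xi) + a * lam         # stochastic Lagrangian gradient
        c = a @ x - b                          # constraint residual
        rhs = -np.concatenate([grad_lagr, [c]])
        z = kaczmarz(K, rhs, steps, rng)       # inexact Newton direction
        alpha = 1.0 / (t + 1) ** 0.75          # assumed decaying stepsize
        x = x + alpha * z[:n]
        lam = lam + alpha * z[n]
    return x, lam

mu = np.array([1.0, 2.0, 3.0])
a = np.array([1.0, 1.0, 1.0])
x_hat, lam_hat = sto_sqp(mu, a, b=1.0)
# Closed-form solution of the toy problem, for comparison.
lam_star = (a @ mu - 1.0) / (a @ a)
x_star = mu - a * lam_star
print(np.linalg.norm(x_hat - x_star), abs(lam_hat - lam_star))
```

The solver is deliberately stopped after a fixed number of steps in every iteration, so the direction is never exact, yet the primal-dual iterate still settles near the closed-form solution. Rerunning with several seeds and comparing the spread of the scaled errors for small and large `steps` gives a rough feel for how the sketching accuracy affects the variability of the iterates.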
