Online Inference of Constrained Optimization: Primal-Dual Optimality and Sequential Quadratic Programming

Yihang Gao, Michael K. Ng, Michael W. Mahoney, Sen Na

TL;DR

This work addresses online inference for stochastic optimization under equality and inequality constraints, proposing a stochastic sequential quadratic programming (SSQP) method augmented with Polyak-style momentum to debias step directions. By incorporating constraint relaxation and adaptive stepsizes, SSQP achieves global almost-sure convergence and local primal–dual minimax optimality, with a plug-in covariance estimator enabling practical online inference. Theoretical results are complemented by comprehensive experiments on nonlinear benchmarks, constrained generalized linear models, and portfolio allocation problems, demonstrating accurate inference and competitive performance without reliance on projection operators. The approach offers a principled, scalable framework for online constrained learning with valid uncertainty quantification in real-world applications.

Abstract

We study online statistical inference for the solutions of stochastic optimization problems with equality and inequality constraints. Such problems are prevalent in statistics and machine learning, encompassing constrained $M$-estimation, physics-informed models, safe reinforcement learning, and algorithmic fairness. We develop a stochastic sequential quadratic programming (SSQP) method to solve these problems, where the step direction is computed by sequentially performing a quadratic approximation of the objective and a linear approximation of the constraints. Despite having access to unbiased estimates of population gradients, a key challenge in constrained stochastic problems lies in dealing with the bias in the step direction. As such, we apply a momentum-style gradient moving-average technique within SSQP to debias the step. We show that our method achieves global almost-sure convergence and exhibits local asymptotic normality with an optimal primal-dual limiting covariance matrix in the sense of Hájek and Le Cam. In addition, we provide a plug-in covariance matrix estimator for practical inference. To our knowledge, the proposed SSQP method is the first fully online method that attains primal-dual asymptotic minimax optimality without relying on projection operators onto the constraint set, which are generally intractable for nonlinear problems. Through extensive experiments on benchmark nonlinear problems, as well as on constrained generalized linear models and portfolio allocation problems using both synthetic and real data, we demonstrate superior performance of our method, showing that the method and its asymptotic behavior not only solve constrained stochastic problems efficiently but also provide valid and practical online inference in real-world applications.
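As a rough illustration of the mechanics described in the abstract, the sketch below runs an SQP-style iteration with a moving-average gradient on a toy problem: minimize $\mathbb{E}[\tfrac{1}{2}\|x-\xi\|^2]$ with $\xi\sim\mathcal{N}(\mu,\sigma^2 I)$, subject to a single linear equality constraint $a^\top x = b$. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `ssqp_sketch`, the identity Hessian approximation, the averaging weight `beta`, and the stepsize schedule `alpha`. Inequality constraints, constraint relaxation, and the adaptive stepsize rule are omitted. Note that in this equality-only toy the step is linear in the gradient estimate, so the averaging is shown only for its mechanics.

```python
import numpy as np

def ssqp_sketch(mu, a, b, n_iter=5000, sigma=1.0, seed=0):
    """Toy stochastic SQP loop with gradient averaging (illustrative only)."""
    rng = np.random.default_rng(seed)
    d = mu.size
    x = np.zeros(d)
    g_bar = np.zeros(d)          # moving average of stochastic gradients
    B = np.eye(d)                # assumed Hessian approximation (identity)
    J = a.reshape(1, d)          # Jacobian of the constraint c(x) = a^T x - b
    for k in range(n_iter):
        xi = mu + sigma * rng.standard_normal(d)
        g = x - xi               # unbiased gradient of 0.5 * ||x - xi||^2
        beta = (k + 1) ** -0.6   # assumed averaging weight
        g_bar = (1.0 - beta) * g_bar + beta * g
        c = np.array([a @ x - b])
        # KKT system of the QP subproblem:
        #   B dx + J^T lam = -g_bar   (quadratic model of the objective)
        #   J dx           = -c       (linearized constraint)
        K = np.block([[B, J.T], [J, np.zeros((1, 1))]])
        rhs = -np.concatenate([g_bar, c])
        dx = np.linalg.solve(K, rhs)[:d]
        alpha = 0.5 * (k + 1) ** -0.6   # assumed decaying stepsize
        x = x + alpha * dx
    return x

mu = np.array([1.0, 2.0, 3.0])
a = np.ones(3)
b = 1.0
x_hat = ssqp_sketch(mu, a, b)
# Closed-form solution of the toy problem: projection of mu onto {a^T x = b}.
x_star = mu - a * (a @ mu - b) / (a @ a)
print("estimate:", x_hat)
print("target:  ", x_star)
```

Because the constraint is linear, each step shrinks the constraint violation by the factor $(1-\alpha_k)$, so feasibility is reached without any projection onto the constraint set; the remaining error in `x_hat` comes from gradient noise.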

Paper Structure

This paper contains 49 sections, 29 theorems, 181 equations, 5 figures, 5 tables.

Key Result

Theorem 2.2

For any point ${\boldsymbol{\ell}}\leq {\boldsymbol{x}} \leq {\boldsymbol{u}}$, if EGMFCQ holds at ${\boldsymbol{x}}$, there exists a threshold $\bar{\theta}\in(0, 1]$ such that for any $\theta\in[0, \bar{\theta}]$, Conversely, suppose a sequence of points ${\boldsymbol{\ell}}\leq {\boldsymbol{x}}_k\leq {\boldsymbol{u}}$, $\forall k\geq 0$, admits a sequence of "sharp" $\theta_k\in(0, 1]$ in the

Figures (5)

  • Figure 1: Insights of our constraint relaxation.
  • Figure 2: Illustration of bias in step direction estimation.
  • Figure 3: Boxplots of KKT residuals and feasibility errors for CUTEst problems. For each noise setting, three boxplots are shown, corresponding to the SSQP (ours), Biased-SSQP, and ActiveSet-SSQP methods.
  • Figure 4: Differences between the averaged gradients and the true gradients on HS32 and FCCU problems. Solid lines: error of the averaged gradients relative to the exact gradients along the iterations, i.e., $\left\| \bar{\boldsymbol{g}}_k -\nabla f_{k}\right\|$. Dashed lines: expected error without averaging, i.e., $\mathbb{E} \left[\left\| \nabla F({\boldsymbol{x}}_k;\zeta_k) -\nabla f_{k}\right\|\mid {\boldsymbol{x}}_k \right]$.
  • Figure 5: Trajectories of portfolio weights and corresponding stock returns. We show two stocks under two portfolio models. The blue lines depict the predicted weights for the stock, while the shaded blue bands indicate the estimated standard deviations of these weights, computed based on the derived asymptotic normality results. The yellow lines depict the returns of the same stock.

Theorems & Definitions (34)

  • Definition 2.1: EGMFCQ vs. LICQ
  • Theorem 2.2
  • Theorem 2.5: Local primal-dual minimax optimality
  • Remark 3.1: Uniqueness of subproblem dual solution
  • Remark 3.2: Adaptive stepsize
  • Remark 3.3: Bias in step direction
  • Lemma 3.7
  • Theorem 3.8
  • Theorem 3.9
  • Lemma 4.2
  • ...and 24 more