Stochastic interior-point methods for smooth conic optimization with applications

Chuan He, Zhanwang Deng

TL;DR

The paper develops a general stochastic interior-point framework for smooth conic optimization, addressing problems of the form $\min_x f(x)$ s.t. $Ax=b$ and $x\in\mathcal{K}$ in which only stochastic gradients of $f$ are accessible. It introduces four SIPM variants, based on mini-batch estimators, Polyak momentum, extrapolated Polyak momentum, and recursive momentum, and proves iteration complexities of $\tilde{O}(\epsilon^{-2})$, $\tilde{O}(\epsilon^{-4})$, $\tilde{O}(\epsilon^{-7/2})$, and $\tilde{O}(\epsilon^{-3})$, respectively, which match the best-known results for stochastic unconstrained optimization up to polylogarithmic factors. The analysis relies on a logarithmically homogeneous self-concordant barrier, local-norm Lipschitz properties, and barrier-based approximate optimality, yielding convergence to an $\epsilon$-stochastic stationary point under mild assumptions. Numerical experiments on robust linear regression, multi-task relationship learning, and clustering data streams demonstrate the practical effectiveness and efficiency of SIPMs, which achieve competitive or superior performance compared with full-batch IPMs and specialized baselines. The framework enables scalable, theory-grounded conic optimization in ML with large datasets and general conic constraints.
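To make the Polyak-momentum idea concrete, here is a minimal sketch of one stochastic interior-point loop. It is an illustration under stated assumptions, not the paper's algorithm: the cone is taken to be the nonnegative orthant with barrier $B(x)=-\sum_i \log x_i$, the equality constraint is the simplex constraint $\sum_i x_i = 1$, the objective is a least-squares loss, the barrier parameter `mu` is held fixed, and the function name, step-size schedule, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np

def sipm_polyak_simplex(A_data, y, n_iters=2000, batch=8, mu=1e-3,
                        eta0=0.5, gamma=0.2, seed=0):
    """Sketch: minimize 0.5*mean((A_data @ x - y)**2) over {x > 0, sum(x) = 1}.

    Assumed setup (not from the paper): orthant cone, log barrier, fixed mu.
    """
    rng = np.random.default_rng(seed)
    N, n = A_data.shape
    A = np.ones((1, n))          # equality constraint A x = b with b = 1
    x = np.full(n, 1.0 / n)      # strictly feasible starting point
    m = np.zeros(n)              # momentum estimate of grad f(x)
    for k in range(n_iters):
        # Stochastic gradient of f from a sampled mini-batch.
        idx = rng.integers(0, N, size=batch)
        resid = A_data[idx] @ x - y[idx]
        g_stoch = A_data[idx].T @ resid / batch
        # Polyak momentum: exponential average of stochastic gradients.
        m = g_stoch if k == 0 else (1 - gamma) * m + gamma * g_stoch
        # Estimate of grad(f + mu * B), using grad B(x) = -1/x.
        v = m - mu / x
        # Inverse barrier Hessian is diagonal: (diag(1/x^2))^{-1} = diag(x^2).
        Hinv = x ** 2
        # Multiplier chosen so that the direction d satisfies A d = 0.
        lam = np.linalg.solve((A * Hinv) @ A.T, -(A * Hinv) @ v)
        d = -Hinv * (v + A.T @ lam)
        # Local norm of d; it equals the dual local norm of v + A^T lam.
        local_norm = np.sqrt(np.sum((d / x) ** 2))
        # Step of local-norm length below 1 keeps x strictly positive.
        eta = eta0 / np.sqrt(k + 1)
        x = x + eta * d / (local_norm + 1e-12)
    return x

# Synthetic data (hypothetical) whose ground truth lies on the simplex.
rng = np.random.default_rng(1)
x_true = np.array([0.6, 0.2, 0.1, 0.05, 0.05])
A_data = rng.standard_normal((500, 5))
y = A_data @ x_true + 0.01 * rng.standard_normal(500)
x_hat = sipm_polyak_simplex(A_data, y)
print(x_hat)
```

The design choice worth noting is the normalized step: because each update has local-norm length at most `eta < 1`, every coordinate shrinks by at most a factor `1 - eta`, so the iterates stay in the interior of the cone without any projection or line search. The quantity `local_norm` is also the kind of dual-local-norm stationarity measure that appears in Lemma 1.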

Abstract

Conic optimization plays a crucial role in many machine learning (ML) problems. However, practical algorithms for conic constrained ML problems with large datasets are often limited to specific use cases, as stochastic algorithms for general conic optimization remain underdeveloped. To fill this gap, we introduce a stochastic interior-point method (SIPM) framework for general conic optimization, along with four novel SIPM variants leveraging distinct stochastic gradient estimators. Under mild assumptions, we establish the iteration complexity of our proposed SIPMs, which, up to a polylogarithmic factor, match the best-known results in stochastic unconstrained optimization. Finally, our numerical experiments on robust linear regression, multi-task relationship learning, and clustering data streams demonstrate the effectiveness and efficiency of our approach.

Paper Structure

This paper contains 31 sections, 24 theorems, 97 equations, 5 figures, 4 tables, 1 algorithm.

Key Result

Lemma 1

Let $\mu>0$ be given. Suppose that $(x,\lambda)\in\Omega^\circ\times\mathbb{R}^m$ satisfies $\|\nabla\phi_\mu(x) + A^T\lambda\|_x^* \le \mu$, where $\phi_\mu$ and $\Omega^\circ$ are given in def:feas-r. Then, $x$ also satisfies def:1st-sc with $\tilde{\lambda}=\lambda/(1+\mu)$ and any $\epsilon\ge(1

Figures (5)

  • Figure 1: Convergence behavior of the relative objective value and average relative stationary error for each epoch. The first two and last two plots correspond to the 'wine-quality' and 'energy-efficiency' datasets, respectively.
  • Figure 2: Loss per task for the training (top) and validation (bottom).
  • Figure 3: Convergence behavior of the relative objective value and average relative stationary error for each epoch. The first two figures correspond to the problem with five tasks, and the last two correspond to the problem with ten tasks.
  • Figure 4: Visualization of the clustering results obtained by solving \ref{pb:ceds} using our SIPMs at the 1st, 333rd, 666th, and 1000th data observations in the stream (from left to right). The first and second rows display the clustering results for the 'spam-base' and 'cover-type' datasets, respectively.
  • Figure 5: Convergence behavior of the relative objective value and average relative stationary error for each epoch. The first two and last two plots correspond to the 'spam-base' and 'cover-type' datasets, respectively.

Theorems & Definitions (54)

  • Lemma 1
  • Definition 1
  • Definition 2
  • Lemma 2
  • Remark 1
  • Lemma 3
  • Lemma 4
  • Theorem 1
  • Theorem 2
  • Remark 2
  • ...and 44 more