Stochastic Gradient Descent for Constrained Optimization based on Adaptive Relaxed Barrier Functions

Naum Dimitrieski, Jing Cao, Christian Ebenbauer

TL;DR

This work proposes a stochastic gradient method for constrained optimization that uses an adaptive relaxed barrier function to handle inequality constraints with partial gradient information. By sampling both objective components and constraints and updating via $x_{k+1}=x_k-\gamma_k(\nabla f_{i_k}(x_k)+\nabla B(g_{j_k}(x_k),\delta_k))$, the authors prove almost-sure convergence to the central-path limit $x^{*}(\delta_\infty)$ under strong convexity of $f$ and affine constraints, given appropriate step-size and barrier-decay schemes. The relaxed barrier $B(z,\delta)$ enables avoidance of projections and maintains constant memory regardless of the number of constraints, with simulations showing favorable performance for problems with a very large number of constraints. The results provide a first theoretical and numerical demonstration of adaptive relaxed barrier SGD as a scalable approach to constrained optimization in large-scale settings. The approach has potential impact in deep learning and parallel computing contexts where constraint sets are enormous and full constraint information is impractical to process each iteration.

Abstract

This paper presents a novel stochastic gradient descent algorithm for constrained optimization. The proposed algorithm randomly samples constraints and components of the finite sum objective function and relies on a relaxed logarithmic barrier function that is appropriately adapted in each optimization iteration. For a strongly convex objective function and affine inequality constraints, step-size rules and barrier adaptation rules are established that guarantee asymptotic convergence with probability one. The theoretical results in the paper are complemented by numerical studies which highlight potential advantages of the proposed algorithm for optimization problems with a large number of constraints.
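The update rule quoted in the TL;DR can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration, not the authors' implementation: the constraints are written as $a_j^\top x \le b_j$ with slack $s = b_j - a_j^\top x$; the relaxed barrier is taken as $B(s,\delta) = -\ln s$ for $s > \delta$ with a quadratic extension $\tfrac{1}{2}\left[\left(\tfrac{s-2\delta}{\delta}\right)^2 - 1\right] - \ln\delta$ otherwise (a common form from the relaxed-barrier literature, which may differ from the paper's definition); the objective components are $f_i(x) = \tfrac{1}{2}\|x - c_i\|^2$; and the schedules $\gamma_k = 1/(k+1)$, $\varepsilon_k = 5k^{-0.3}$ are illustrative choices that are not checked against the paper's conditions. The names `relaxed_barrier_grad` and `barrier_sgd` are hypothetical.

```python
import math
import random


def relaxed_barrier_grad(s, delta):
    """Derivative dB/ds of an ASSUMED relaxed log barrier in the slack s.

    For s > delta this is the exact log barrier B = -ln(s); otherwise it is
    a quadratic extension that matches value and slope at s = delta.
    """
    if s > delta:
        return -1.0 / s
    return (s - 2.0 * delta) / (delta * delta)


def barrier_sgd(centers, A, b, x0, num_iters=5000, delta_inf=0.1, seed=0):
    """Sketch of x_{k+1} = x_k - gamma_k * (grad f_i(x_k) + grad B(g_j(x_k), delta_k)).

    f_i(x) = 0.5 * ||x - centers[i]||^2 and constraints A[j] . x <= b[j]
    are illustrative assumptions, as are the schedules below.
    """
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, num_iters + 1):
        gamma = 1.0 / (k + 1)                   # illustrative step size
        delta = delta_inf + 5.0 * k ** -0.3     # delta_k = delta_inf + eps_k
        c = centers[rng.randrange(len(centers))]  # sample one objective term
        j = rng.randrange(len(A))                 # sample one constraint
        s = b[j] - sum(a * xi for a, xi in zip(A[j], x))  # slack of constraint j
        db = relaxed_barrier_grad(s, delta)
        # Chain rule: grad_x B(s(x)) = dB/ds * grad_x s = -db * A[j].
        x = [xi - gamma * ((xi - ci) - db * aj)
             for xi, ci, aj in zip(x, c, A[j])]
    return x


# Toy problem: unconstrained minimizer at (2, 2), constraints x1 <= 1, x2 <= 1.
centers = [(1.5, 2.5), (2.5, 1.5), (2.0, 2.0)]
A = [(1.0, 0.0), (0.0, 1.0)]
b = [1.0, 1.0]
x_final = barrier_sgd(centers, A, b, x0=(0.0, 0.0))
print(x_final)
```

Each iteration touches one objective component and one constraint, so the per-step cost and memory do not grow with the number of constraints, and no projection onto the feasible set is computed.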

Paper Structure

This paper contains 9 sections, 2 theorems, 57 equations, 4 figures.

Key Result

Theorem 1

Consider problem (problem:2) and assume that Assumptions assump:1 through assump:3 hold. Let $\{\gamma_k\}_{k \geq 0}$, $\{\varepsilon_k\}_{k \geq 0}$ and $\{\delta_k\}_{k \geq 0}$ be positive sequences satisfying the step-size and barrier adaptation rules established in the paper, with $\delta_k = \delta_{\infty} + \varepsilon_k$ and (small) $\delta_{\infty} > 0$. Then any sequence of random variables $\{X_k\}_{k \geq 0}$ defined through the SGD algorithm converges to $x^{*}(\delta_\infty)$ with probability one.

Figures (4)

  • Figure 1: Relaxed barrier function (equation barrier_fcn) for different values of $\delta > 0$.
  • Figure 2: Simulation results for an optimization problem with $10^{4}$ constraints. Mean value and estimated standard deviation interval about the mean value are plotted, obtained from a sample set of $1000$ sample trajectories.
  • Figure 3: Sample trajectories of the proposed SGD algorithm projected onto the $x_1$-$x_2$ plane. Parameters tuned as in the figure labeled fig:slow_epsilon_k (with $\varepsilon_k = 5k^{-0.3}$ and $\varepsilon_k = 5k^{-1.3}$).
  • Figure 4: Execution time of the proposed SGD algorithm and the deterministic GD algorithm as a function of the number of inequality constraints. 200 sample trajectories were generated for each optimization problem.

Theorems & Definitions (4)

  • Definition 1
  • Theorem 1
  • Lemma 1
  • Proof of Theorem 1