Random minibatch subgradient algorithms for convex problems with functional constraints

Angelia Nedich, Ion Necoara

TL;DR

The paper proposes subgradient algorithms with random minibatch feasibility updates for non-smooth convex problems with functional constraints. The convergence rates, known to be optimal for subgradient methods on this class of problems, depend explicitly on the minibatch size and show when minibatching helps.

Abstract

In this paper we consider non-smooth convex optimization problems with (possibly) infinite intersection of constraints. In contrast to the classical approach, where the constraints are usually represented as intersection of simple sets, which are easy to project onto, in this paper we consider that each constraint set is given as the level set of a convex but not necessarily differentiable function. For these settings we propose subgradient iterative algorithms with random minibatch feasibility updates. At each iteration, our algorithms take a step aimed at only minimizing the objective function and then a subsequent step minimizing the feasibility violation of the observed minibatch of constraints. The feasibility updates are performed based on either parallel or sequential random observations of several constraint components. We analyze the convergence behavior of the proposed algorithms for the case when the objective function is strongly convex and with bounded subgradients, while the functional constraints are endowed with a bounded first-order black-box oracle. For a diminishing stepsize, we prove sublinear convergence rates for the expected distances of the weighted averages of the iterates from the constraint set, as well as for the expected suboptimality of the function values along the weighted averages. Our convergence rates are known to be optimal for subgradient methods on this class of problems. Moreover, the rates depend explicitly on the minibatch size and show when minibatching helps a subgradient scheme with random feasibility updates.
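
The two-step iteration described in the abstract can be illustrated with a minimal sketch. This is not the authors' exact algorithm: the objective $f(x)=\tfrac12\|x-c\|^2$, the linear constraints $g_i(x)=a_i^\top x-b_i\le 0$, the Polyak-type feasibility step, the stepsize $\alpha_k=2/(k+1)$, the averaging weights $k$, and all names (`minibatch_subgradient`, `batch`, `beta`, `sequential`) are assumptions chosen for illustration. The sketch also omits any projection onto a simple bounded set.

```python
import numpy as np

def minibatch_subgradient(c, A, b, n_iters=5000, batch=4, beta=1.0,
                          sequential=False, seed=0):
    """Illustrative sketch: minimize 0.5*||x - c||^2 s.t. A @ x <= b.

    Each iteration takes an objective step, then a feasibility step on a
    random minibatch of constraints (averaged or applied one by one).
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    x_avg = np.zeros(n)
    weight_sum = 0.0
    for k in range(1, n_iters + 1):
        alpha = 2.0 / (k + 1)              # diminishing stepsize (assumed choice)
        v = x - alpha * (x - c)            # step on the objective only
        idx = rng.integers(0, m, size=batch)  # sample constraints with replacement
        if sequential:
            # Sequential variant: each sampled constraint acts on the latest point.
            z = v.copy()
            for i in idx:
                viol = max(A[i] @ z - b[i], 0.0)      # positive part of g_i(z)
                z = z - beta * viol / (A[i] @ A[i]) * A[i]
            x = z
        else:
            # Parallel variant: all steps start from v and are averaged.
            total = np.zeros(n)
            for i in idx:
                viol = max(A[i] @ v - b[i], 0.0)      # positive part of g_i(v)
                total += beta * viol / (A[i] @ A[i]) * A[i]
            x = v - total / batch
        # Weighted running average of the iterates with weights proportional to k.
        x_avg = (weight_sum * x_avg + k * x) / (weight_sum + k)
        weight_sum += k
    return x_avg
```

For example, with the box constraints $-1\le x_j\le 1$ in two dimensions and $c=(3,0)$, the weighted average should approach the projection of $c$ onto the box, $(1,0)$, for both the parallel and the sequential variant.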

Paper Structure

This paper contains 10 sections, 10 theorems, 117 equations, 3 figures.

Key Result

Lemma

Let Assumption asum-base(c) and Assumption asum-regularmod hold. Then, we have:

Figures (3)

  • Figure 1: Convergence of minibatch parallel (left) and sequential (right) algorithms for different minibatch sizes: $N=1$ (solid), $N=50$ (dashed) and $N=100$ (dot-dashed).
  • Figure 2: Convergence behavior of the minibatch parallel (solid) and sequential (dashed) algorithms: objective function (left) and feasibility violation (right).
  • Figure 3: Behavior of parallel algorithm with extrapolated stepsize $\beta=1.9/L_N$ (solid) and fixed stepsize $\beta=1.9$ (dashed): objective function (left), feasibility violation (right).

Theorems & Definitions (22)

  • Lemma
  • Proof
  • Lemma
  • Proof
  • Remark
  • Lemma
  • Lemma
  • Proof
  • Lemma
  • Proof
  • ...and 12 more