Efficient Online Mirror Descent Stochastic Approximation for Multi-Stage Stochastic Programming

Junhui Zhang, Patrick Jaillet

TL;DR

This work proposes hypothetical Mirror Descent Stochastic Approximation (MDSA) for infinite-dimensional policies using stochastic conditional gradients, and shows that the proposed MDSA algorithms admit an efficient online implementation whose overall gradient complexity is linear in $T$, an exponential improvement over all existing algorithms.

Abstract

We study the unconstrained and the minimax saddle point variants of the convex multi-stage stochastic programming problem, where consecutive decisions are coupled through the objective functions rather than through the constraints. We approach the problems from the infinite-dimensional policy perspective, but consider an online setting where only the policies corresponding to the actual realization of the underlying stochastic process are needed. This leads to a tractable formulation, where the dimension of the output is linear in the number of stages $T$. We propose hypothetical Mirror Descent Stochastic Approximation (MDSA) for the infinite-dimensional policies using stochastic conditional gradients. By taking advantage of the decomposability of the updates across stages and realizations of the underlying stochastic process, we show that the proposed MDSA algorithms admit an efficient online implementation that achieves overall gradient complexity linear in $T$, improving exponentially over all existing algorithms.
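
As a point of reference for the mirror-descent building block named above, the following is a minimal sketch of textbook finite-dimensional MDSA with the entropy distance-generating function on the probability simplex. It is not the paper's hypothetical MDSA over policies; the function name `entropic_mdsa`, the toy linear objective, the Gaussian noise model, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def entropic_mdsa(cost, steps=200, step_size=0.1, noise=0.1, seed=0):
    """Minimize E[<cost + noise, x>] over the simplex (toy problem, assumed)."""
    rng = random.Random(seed)
    n = len(cost)
    x = [1.0 / n] * n          # start at the center of the simplex
    avg = [0.0] * n            # running average of the iterates
    for _ in range(steps):
        for i in range(n):
            avg[i] += x[i] / steps
        # Stochastic gradient: the true gradient `cost` plus Gaussian noise.
        g = [c + rng.gauss(0.0, noise) for c in cost]
        # Entropic prox step: multiplicative update, then renormalize.
        w = [xi * math.exp(-step_size * gi) for xi, gi in zip(x, g)]
        total = sum(w)
        x = [wi / total for wi in w]
    return avg


if __name__ == "__main__":
    print(entropic_mdsa([0.9, 0.2, 0.5]))
```

The averaged iterate stays on the simplex and concentrates on the coordinate with the smallest expected cost. The multi-stage setting in the abstract applies updates of this kind to policies rather than to a single finite-dimensional vector.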

Paper Structure

This paper contains 28 sections, 15 theorems, 96 equations, 8 figures, 5 algorithms.

Key Result

Lemma 3.1

For any $x\in \mathcal{Q}^o$ and $g\in \mathbb{R}^{n_0}$, consider Then the following holds for any $x'\in \mathcal{Q}$,
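
For orientation, the cited result (Lemma 2.1 of Nemirovski, Juditsky, Lan and Shapiro, 2009) can be written in the symbols used above as follows. The prox-mapping $P_x$, the prox-function $V$, the distance-generating function $\omega$ with strong-convexity modulus $\alpha$, and the dual norm $\|\cdot\|_*$ follow the cited reference; they are assumptions about this paper's notation, which the extracted statement does not show.

$$
V(x, z) = \omega(z) - \omega(x) - \langle \nabla\omega(x),\, z - x\rangle, \qquad
P_x(g) = \operatorname*{argmin}_{z\in\mathcal{Q}} \big\{ \langle g,\, z - x\rangle + V(x, z) \big\},
$$

$$
V\big(P_x(g),\, x'\big) \;\le\; V(x, x') + \langle g,\, x' - x\rangle + \frac{\|g\|_*^2}{2\alpha}
\qquad \text{for all } x'\in\mathcal{Q}.
$$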

Figures (8)

  • Figure 1: Example probability space with $\Omega = \{1,\ldots,6\}$, $\boldsymbol{\xi}_1(w) = 1$, $\boldsymbol{\xi}_2(w) = \lceil w/2\rceil$, $\boldsymbol{\xi}_3(w) = w$. Left: probability space $\Omega$ with $\mathcal{F}_{1:3}$; each node at layer $t$ represents $\mathbf x_t(w)$; $\mathbf x_t$ is measurable w.r.t. $\mathcal{F}_t$ iff $\mathbf x_t(w) = X_t(\boldsymbol{\xi}_{1:t}(w))$ is constant in each box. Right: policy space $\mathcal{P}$; each node at layer $t$ represents one realization of $\boldsymbol{\xi}_{1:t}=\xi_{1:t}$, and corresponds to $X_t(\xi_{1:t})$. If $w = 2$, only the hatched variables $X_1(1),X_2(1,1),X_3(1,1,2)$ are needed.
  • Figure 2: $S'(t,l) = \{(t',l')\in \mathcal{K},~|t-t'|\leq l-l'\}$. Red region: $S'(4,2)$; blue region: $S'(8,4)$.
  • Figure 3: Policies computed using Algorithm \ref{alg:hypo-ms-u} for a problem where $T=5$. The $3$ planes represent the policies $X^{(l-2)}$, $X^{(l-1)}$, and $X^{(l)}$ (from top to bottom), computed by Algorithm \ref{alg:hypo-ms-u} plane by plane, in the order of $\{X^{(l-2)}(\xi),~\xi\in \Xi\}\to \{X^{(l-1)}(\xi),~\xi\in \Xi\}\to \{X^{(l)}(\xi),~\xi\in \Xi\}$. In each plane, the tree has $5$ layers representing $5$ stages, and each node in (horizontal) layer $t$ represents the policy $X_t$ for one realization of $\boldsymbol{\xi}_{1:t}$. To evaluate $X_t^{(l)}(\xi_{1:t})$ for a particular $\xi_{1:t}$ (the yellow node in the bottom plane), only the yellow nodes connected to it using orange edges in the plane above it are needed. They correspond to $X_{t-1}^{(l-1)}(\xi_{1:(t-1)})$, $X_t^{(l-1)}(\xi_{1:t})$, and $X_{t+1}^{(l-1)}(\xi_{1:t},\widehat{\boldsymbol{\xi}}_{t+1}(\xi_{1:t},\boldsymbol{\zeta}_{t,l}))$, which further need the starred nodes in the top plane connected to them using red, green, and blue edges, respectively.
  • Figure 4: $l'$-th iteration in the evaluation procedure for $X_t^{(l)}(\xi_{1:t})$. Assuming that all the hatched nodes $X_{t-1}^{(0:(l-1))}(\xi_{1:(t-1)})$ and $X_t^{(0:(l'-1))}(\xi_{1:t})$ have already been evaluated, Algorithm \ref{alg:update-procedure} 1) calls the evaluation procedure for $X_{t+1}^{(l'-1)}(\xi_{1:t},\widehat{\boldsymbol{\xi}}_{t+1}(\xi_{1:t},\zeta_{t,l'-1}))$ (this is valid since $X_t^{(0:(l'-2))}(\xi_{1:t})$ has already been computed); 2) uses $X_{t-1}^{(l'-1)}(\xi_{1:(t-1)})$, $X_{t}^{(l'-1)}(\xi_{1:t})$, and $X_{t+1}^{(l'-1)}(\xi_{1:t},\widehat{\boldsymbol{\xi}}_{t+1}(\xi_{1:t},\zeta_{t,l'-1}))$ to evaluate $X_t^{(l')}(\xi_{1:t})$.
  • Figure 5: Example of Algorithm \ref{alg:async} with $L=2$, following the notation in Figure \ref{fig:tree}. Numbers represent the number of updates that have been applied to each node. Only nodes with bold boundaries are visited, and only the starred nodes are the relevant output. At stage $1$, $\xi_1 = 1$ is revealed, then $X_1^{(1)}(1)$, $X_2^{(1)}(1,3)$, and $X_1^{(2)}(1)$ are computed (left 4 figures). At stage $2$, $\xi_2 = 2$ is revealed, then $X_2^{(1)}(1,2)$, $X_3^{(1)}(1,2,3)$, and $X_2^{(2)}(1,2)$ are computed (middle 4 figures). At stage $3$, $\xi_3 = 3$ is revealed, then $X_3^{(1)}(1,2,3)$ and $X_3^{(2)}(1,2,3)$ are computed (right 3 figures).
  • ...and 3 more figures
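
The captions of Figures 1 and 2 can be checked with a small sketch. The first part enumerates the Figure 1 example and lists the policy-tree nodes needed for a given outcome $w$; the second part builds the dependency set $S'(t,l)$ from Figure 2. The function names are hypothetical, and the index set $\mathcal{K} = \{1,\ldots,T\}\times\{0,\ldots,L\}$ together with the values $T=10$, $L=4$ in the usage lines are assumptions, since the captions do not define them.

```python
def history(w, t):
    """Return xi_{1:t}(w) for the Figure 1 example on Omega = {1, ..., 6}."""
    xi = (1, (w + 1) // 2, w)   # xi_1 = 1, xi_2 = ceil(w / 2), xi_3 = w
    return xi[:t]


def needed_nodes(w, T=3):
    """Policy-tree nodes X_t(xi_{1:t}) needed online when outcome w occurs."""
    return [history(w, t) for t in range(1, T + 1)]


def tree_sizes(T=3, omega=range(1, 7)):
    """Number of distinct realizations of xi_{1:t} at each stage t."""
    return [len({history(w, t) for w in omega}) for t in range(1, T + 1)]


def s_prime(t, l, T, L):
    """S'(t, l) = {(t', l') in K : |t - t'| <= l - l'}, with K assumed to be
    {1, ..., T} x {0, ..., L}."""
    return {
        (tp, lp)
        for tp in range(1, T + 1)
        for lp in range(0, L + 1)
        if abs(t - tp) <= l - lp
    }


if __name__ == "__main__":
    print(needed_nodes(2))               # nodes hatched in Figure 1 for w = 2
    print(tree_sizes())                  # nodes per layer of the policy tree
    print(sorted(s_prime(4, 2, 10, 4)))  # cone of indices below (4, 2)
```

For $w = 2$ only three of the ten tree nodes are needed, which is the online setting described in the abstract. The set $S'(t,l)$ widens by one stage in each direction per iteration level, matching the three-neighbor dependence described in the Figure 3 caption.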

Theorems & Definitions (28)

  • Lemma 3.1: Lemma 2.1 of [NemirovskiRobustSA2009]
  • Lemma 3.2
  • Proof of Lemma \ref{lm:MD1}
  • Lemma 3.3
  • Proof of Lemma \ref{lm:MD-saddle}
  • Lemma 3.4
  • Definition 4.1
  • Lemma 4.1
  • Proof of Lemma \ref{lm:prop-oracle}
  • Lemma 4.2
  • ...and 18 more