Dual dynamic programming for stochastic programs over an infinite horizon

Caleb Ju; Guanghui Lan

TL;DR

The paper tackles infinite-horizon, stationary multi-stage stochastic programs with discounted costs by developing CE-Inf-EDDP, a continually-exploring variant of explorative dual dynamic programming that shares the favorable complexity of state-of-the-art methods while improving practical performance. It introduces a basic Inf-EDDP framework with forward/backward phases, a stage-wise SAA, and a saturation-based trial-point strategy that propagates cutting-plane updates across stages, leading to non-asymptotic convergence guarantees. To further reduce dependence on the effective horizon $T$, it proposes Case 1 and Case 2 variants (collectively CE-Inf-EDDP), including Case 1 with a theoretical bound $K=4T(D/\epsilon+1)^n$, and extends the approach to hierarchical stationary problems via 2-stage stochastic approximation (2SSA) inside a CE-Inf-HDDP framework with provable sample and iteration complexities. The authors support their methods with extensive numerical experiments on infinite-horizon newsvendor, risk-averse variants, hydrothermal planning, and a newsvendor with secondary assembly, demonstrating tight dual bounds, favorable runtimes, and effective policy generalization compared with finite-horizon methods and randomized SDDP variants. The work contributes non-asymptotic analysis, horizon-robust exploration, and a scalable hierarchy-capable DDP toolkit, with open-source code to promote reproducibility and application in energy and supply-chain domains.
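To get a feel for the Case 1 iteration bound $K=4T(D/\epsilon+1)^n$ quoted above, the following minimal sketch simply evaluates the formula. The helper name `case1_iteration_bound` and the sample values of $T$, $D$, $\epsilon$, and $n$ are illustrative assumptions, not values from the paper; $T$ is taken as a given input rather than computed.

```python
def case1_iteration_bound(T: float, D: float, eps: float, n: int) -> float:
    """Evaluate K = 4 * T * (D / eps + 1) ** n (hypothetical helper)."""
    if eps <= 0:
        raise ValueError("eps must be positive")
    return 4 * T * (D / eps + 1) ** n


# Illustrative inputs only (assumed, not taken from the paper).
# With D / eps + 1 = 3, the bound triples each time n grows by one.
bounds = {n: case1_iteration_bound(T=10, D=1.0, eps=0.5, n=n) for n in (1, 2, 3)}
```

With these assumed inputs the bound is linear in $T$ and grows geometrically in the exponent $n$, which is why reducing the dependence on $T$ alone does not remove the exponential term.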

Abstract

We consider solving stochastic programs over an infinite horizon. By leveraging the stationarity of the problem, we develop a novel continually-exploring infinite-horizon explorative dual dynamic programming (CE-Inf-EDDP) algorithm that matches state-of-the-art complexity while providing encouraging numerical performance on the newsvendor and hydrothermal planning problems. CE-Inf-EDDP conceptually differs from previous dual dynamic programming approaches by exploring the feasible region longer and updating the cutting-plane model more frequently. In addition, our algorithm can handle costs ranging from simple linear to more complex nonlinear ones. To demonstrate this, we extend our algorithm to handle the so-called hierarchical stationary stochastic program, where the cost function is a parametric multi-stage stochastic program. The hierarchical program can model problems with a hierarchy of decision-making, e.g., how long-term decisions influence day-to-day operations. As a concrete example, we introduce a newsvendor problem that includes a second-stage multi-product assembly serving as a secondary market.
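The cutting-plane model mentioned above can be illustrated generically: a convex function is under-approximated by the pointwise maximum of affine cuts built from function values and subgradients at trial points. The sketch below is a one-dimensional illustration under assumed names (`CutModel`, `add_cut`, `evaluate`) and an assumed test function $f(x)=x^2$; it is not the CE-Inf-EDDP algorithm itself, only the basic building block that dual dynamic programming methods share.

```python
from dataclasses import dataclass, field


@dataclass
class CutModel:
    """Lower model max_i (value_i + slope_i * (x - point_i)); hypothetical helper."""

    cuts: list = field(default_factory=list)  # (point, value, slope) triples

    def add_cut(self, point: float, value: float, slope: float) -> None:
        # Store one affine cut that is tangent to the function at `point`.
        self.cuts.append((point, value, slope))

    def evaluate(self, x: float) -> float:
        # With no cuts the model carries no information, so return -inf.
        if not self.cuts:
            return float("-inf")
        return max(v + g * (x - p) for p, v, g in self.cuts)


def f(x: float) -> float:
    return x * x  # assumed convex test function


def df(x: float) -> float:
    return 2 * x  # its derivative, used as the cut slope


model = CutModel()
for trial in (-1.0, 0.5, 2.0):  # assumed trial points
    model.add_cut(trial, f(trial), df(trial))
```

Each added cut can only raise the model, so the approximation tightens monotonically from below; it is exact at every trial point and never exceeds the true function elsewhere.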

Paper Structure (19 sections, 23 theorems, 33 equations, 3 figures, 3 tables, 5 algorithms)

Introduction
Infinite-horizon explorative dual dynamic programming
The basic Inf-EDDP
Convergence analysis
Reducing the dependence on the effective planning horizon
Other trial point selection strategies
Selection by Largest Gap via Dual Bound
Infinite-horizon stochastic dual dynamic programming
Hierarchical dual dynamic programming
A subproblem reformulation
Convergence rates for inexact subproblem solutions
Infinite-horizon hierarchical dual dynamic programming
Numerical Experiments
Implementation details
Infinite-horizon newsvendor
...and 4 more sections

Key Result

Corollary 2.1

For any fixed $c \in \xi \in \Theta$ (e.g., $\tilde{c}_i$) and any $\bar{\epsilon} \in (0,+\infty)$, the cost function $h(\cdot, c)$ is Lipschitz continuous over $\mathcal{X}(\bar{\epsilon})$, i.e., there exists an $M_h <+\infty$ such that $|h(x, c) - h(x', c)| \le M_h \|x - x'\|$ for all $x, x' \in \mathcal{X}(\bar{\epsilon})$. Also, $h$ is bounded over $\mathcal{X}(\bar{\epsilon})$, and in particular,

Figures (3)

Figure 1: Optimality gap convergence for different DDP methods. Similar to previous experiments, EDDP and SDDP share the same gaps.
Figure 2: Gaps for solving the infinite-horizon hydrothermal problem with $\lambda=0.8$ (left) and $\lambda=0.9906$ (right). CE-Inf-SDDP runs are repeated over 10 seeds; the shaded region shows the 10% and 90% quantiles, and the solid/dashed line shows the median performance.

Theorems & Definitions (42)

Corollary 2.1
proof
Lemma 2.2
proof
Definition 2.1
Definition 2.2
Lemma 2.3
proof
Lemma 2.4
proof
...and 32 more
