Dynamic Programs on Partially Ordered Sets

Thomas J. Sargent; John Stachurski

TL;DR

This paper develops a unifying abstract dynamic programming (ADP) framework that generalizes beyond standard Markov decision processes by modeling lifetime values as fixed points of order-preserving policy operators on a partially ordered value space $V$. By introducing regularity and well-posedness concepts and defining the Bellman operator $T=\bigvee_{\sigma} T_\sigma$, the authors prove fundamental optimality results and convergence guarantees for various DP variants, including distributional, empirical, risk-sensitive, approximate, and structural estimation models. The framework accommodates value spaces beyond real-valued functions, such as distributions and random function spaces, enabling analysis of nonstandard objectives and function-approximation schemes. The paper also demonstrates concrete applications to non-EU discrete choice and firm valuation, deriving conditions under which standard algorithms (VFI, OPI, HPI) converge and providing insight into when optimal policies exist and are $v^*$-greedy.
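For the familiar special case of a finite MDP, the construction $T=\bigvee_{\sigma} T_\sigma$ can be sketched in a few lines. The following is a minimal illustration, not the authors' code: the toy rewards `r`, transition array `P`, discount factor `beta`, and all function names are assumptions made for this example. It takes $V$ to be real-valued functions on a finite state space with the pointwise order, builds each policy operator $T_\sigma v = r_\sigma + \beta P_\sigma v$, forms $T$ as the pointwise maximum over all policies, and runs value function iteration (VFI).

```python
import itertools
import numpy as np

# Hypothetical toy MDP: 3 states, 2 actions (all numbers are illustrative).
beta = 0.9
r = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [0.5, 0.5]])                              # r[x, a]
P = np.array([[[0.9, 0.1, 0.0], [0.1, 0.8, 0.1]],
              [[0.5, 0.5, 0.0], [0.0, 0.2, 0.8]],
              [[0.0, 0.3, 0.7], [0.6, 0.0, 0.4]]])      # P[x, a, x']

n_states, n_actions = r.shape
states = np.arange(n_states)
# A policy sigma assigns one action to each state.
policies = list(itertools.product(range(n_actions), repeat=n_states))


def T_sigma(v, sigma):
    """Policy operator: (T_sigma v)(x) = r(x, sigma(x)) + beta * E[v(x')]."""
    sigma = np.asarray(sigma)
    return r[states, sigma] + beta * P[states, sigma] @ v


def T(v):
    """Bellman operator as the pointwise supremum of the policy operators."""
    return np.max([T_sigma(v, s) for s in policies], axis=0)


def T_actionwise(v):
    """Textbook Bellman operator (max over actions), for comparison."""
    return np.max(r + beta * P @ v, axis=1)


def greedy(v):
    """A v-greedy policy: sigma with T_sigma v = T v."""
    return np.argmax(r + beta * P @ v, axis=1)


def policy_value(sigma):
    """Lifetime value v_sigma, the unique fixed point of T_sigma."""
    sigma = np.asarray(sigma)
    A = np.eye(n_states) - beta * P[states, sigma]
    return np.linalg.solve(A, r[states, sigma])


def vfi(v0, tol=1e-10, max_iter=10_000):
    """Value function iteration: apply T until successive iterates are close."""
    v = v0
    for _ in range(max_iter):
        v_new = T(v)
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v


v_star = vfi(np.zeros(n_states))
sigma_star = greedy(v_star)
```

In this setting the supremum over policies coincides with the usual maximum over actions, and the lifetime value of the greedy policy `sigma_star` agrees with `v_star` up to numerical tolerance. Enumerating all policies is only feasible for tiny examples; it is used here to make the order-theoretic definition of $T$ explicit.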

Abstract

We introduce a framework that represents a dynamic program as a family of operators acting on a partially ordered set. We provide an optimality theory based only on order-theoretic assumptions and show how applications across almost all subfields of dynamic programming fit into this framework. These range from traditional dynamic programs to those involving nonlinear recursive preferences, desire for robustness, function approximation, Monte Carlo sampling and distributional dynamic programs. We apply the framework to establish new optimality and algorithmic results for specific applications.

Paper Structure (21 sections, 17 theorems, 30 equations, 1 figure)

Introduction
Preliminaries
Abstract Dynamic Programs
Example: MDPs
Example: Risk-Sensitive Q-learning
Distributional Dynamic Programming
Empirical Dynamic Programming
Example: Approximate Dynamic Programming
Example: Structural Estimation
Properties of ADPs
Basic Properties
Defining Optimality
Algorithms
Optimality Results
Proofs of Section \ref{ss:sr} Results
...and 6 more sections

Key Result

Theorem 2.1

The set $\mathop{\mathrm{fix}}\nolimits(S)$ is nonempty if either […]. Moreover, in the second case, $v \in V$ and $v \preceq Sv$ implies $\bigvee_n S^n v \in \mathop{\mathrm{fix}}\nolimits(S)$.
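The iteration in the last sentence can be illustrated on a finite poset, where an increasing sequence $v \preceq Sv \preceq S^2 v \preceq \cdots$ must stabilize and its supremum is simply its final element. The sketch below is an assumption-laden toy, not taken from the paper: $V$ is the family of subsets of a small node set ordered by inclusion, and the hypothetical map `S` adds a seed set plus all one-step successors in a made-up graph, which makes it order preserving.

```python
# Hypothetical directed graph and seed set (illustrative values only).
EDGES = {(0, 1), (1, 2), (3, 4)}
SEED = frozenset({0})


def S(A):
    """Order-preserving self-map on subsets: seed plus successors of A."""
    return SEED | frozenset(y for (x, y) in EDGES if x in A)


def iterate_to_fixed_point(S, v, max_iter=1000):
    """Iterate v, Sv, S^2 v, ... from a point with v <= Sv until it stabilizes.

    On a finite poset the increasing chain is eventually constant, so the
    supremum of the iterates equals the returned element.
    """
    assert v <= S(v), "starting point must satisfy v <= Sv"
    for _ in range(max_iter):
        nxt = S(v)
        if nxt == v:
            return v
        v = nxt
    raise RuntimeError("no fixed point reached within max_iter")


fixed_point = iterate_to_fixed_point(S, frozenset())
```

Starting from the empty set, the iterates are $\{0\}$, $\{0,1\}$, $\{0,1,2\}$, after which the sequence is constant, so `fixed_point` is the set of nodes reachable from the seed. In infinite settings the supremum need not be attained in finitely many steps, and whether it is a fixed point depends on the conditions stated in the theorem.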

Figures (1)

Figure 1: Firm value function and exit threshold

Theorems & Definitions (41)

Theorem 2.1
proof
Example 2.1
Lemma 2.2
proof
Example 3.1
Lemma 3.1
proof
Lemma 3.2
proof
...and 31 more
