Discrete-Time Approximations of Controlled Diffusions with Infinite Horizon Discounted and Average Cost

Somnath Pradhan, Serdar Yuksel

TL;DR

This work develops a probabilistic framework for discretizing controlled diffusions with infinite-horizon discounted and ergodic costs, by constructing discrete-time Markov-chain approximations with step $h$ and corresponding interpolated controls. It proves that optimal policies derived from the discretized models are near-optimal for the continuous-time diffusion as $h\to 0$, with convergence of value functions and invariant measures under Lyapunov stability assumptions. The analysis leverages weak convergence, relaxed controls, and Lipschitz policy approximations to obtain robust near-optimality results, providing a practical route to compute high-quality controls beyond PDE-based methods. Overall, the approach complements existing PDE/probabilistic techniques and supports policy design via value iteration, convex analysis, or reinforcement learning for high-dimensional controlled diffusions.

Abstract

We present a discrete-time approximation of optimal control policies for infinite-horizon discounted/ergodic control problems for controlled diffusions in $\mathbb{R}^d$. In particular, our objective is to show near-optimality of optimal policies designed from the approximating discrete-time controlled Markov chain model, for the discounted/ergodic optimal control problems, in the true controlled diffusion model (as the sampling period approaches zero). To this end, we first construct suitable discrete-time controlled Markov chain models for which one can compute optimal policies and optimal values via several methods (such as value iteration, the convex analytic method, reinforcement learning, etc.). Then, using a weak convergence technique, we show that the optimal policy designed for the discrete-time Markov chain model is near-optimal for the controlled diffusion model as the discrete-time model approaches the continuous-time model. This provides a practical approach for finding near-optimal control policies for controlled diffusions. Our conditions complement existing results in the literature, which have been arrived at via either probabilistic or PDE-based methods.
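To make the "discretize, then solve by value iteration" idea concrete, here is a minimal sketch in Python. It is not the construction used in the paper: it assumes a standard locally consistent, finite-difference chain with a constant time step $h$, applied to a hypothetical one-dimensional model with drift $b(x,u) = -x + u$, constant diffusion coefficient `sigma`, running cost $x^2 + 0.1u^2$, discount rate `alpha`, and a truncated grid on $[-2, 2]$ with sticky boundaries. All names and parameter values are illustrative assumptions.

```python
import math


def drift(x, u):
    # Hypothetical drift b(x, u) = -x + u (an assumption, not from the paper).
    return -x + u


def cost(x, u):
    # Hypothetical running cost c(x, u) = x^2 + 0.1 u^2.
    return x * x + 0.1 * u * u


def solve_discounted_mdp(sigma=1.0, alpha=1.0, dx=0.1, n=20,
                         controls=(-1.0, 0.0, 1.0),
                         tol=1e-10, max_iter=100_000):
    """Value iteration on a locally consistent chain for dX = b dt + sigma dW."""
    xs = [(i - n) * dx for i in range(2 * n + 1)]  # symmetric grid around 0
    b_max = max(abs(drift(x, u)) for x in xs for u in controls)
    q_max = sigma ** 2 + dx * b_max   # normalizer: keeps probabilities in [0, 1]
    h = dx * dx / q_max               # constant time step of the chain
    beta = math.exp(-alpha * h)       # one-step discount factor
    last = len(xs) - 1

    V = [0.0] * len(xs)
    policy = [controls[0]] * len(xs)
    for _ in range(max_iter):
        V_new = []
        for i, x in enumerate(xs):
            up, down = min(i + 1, last), max(i - 1, 0)  # stay put at the edges
            best_q, best_u = math.inf, controls[0]
            for u in controls:
                b = drift(x, u)
                # Upwind probabilities: one-step mean is b*h and one-step
                # variance is sigma^2*h + o(h) (local consistency).
                p_up = (0.5 * sigma ** 2 + dx * max(b, 0.0)) / q_max
                p_down = (0.5 * sigma ** 2 + dx * max(-b, 0.0)) / q_max
                p_stay = 1.0 - p_up - p_down
                q = cost(x, u) * h + beta * (
                    p_up * V[up] + p_stay * V[i] + p_down * V[down])
                if q < best_q:
                    best_q, best_u = q, u
            V_new.append(best_q)
            policy[i] = best_u
        gap = max(abs(a - b) for a, b in zip(V_new, V))
        V = V_new
        if gap < tol:  # sup-norm stopping rule for the Bellman iteration
            break
    return xs, V, policy, h
```

Calling `solve_discounted_mdp()` returns the grid, the approximate value function, a stationary policy on the grid, and the time step `h`. Shrinking `dx` (and hence `h`) refines the chain; the paper's results concern how policies obtained from such discrete-time models perform in the original diffusion as the step tends to zero.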

Paper Structure

This paper contains 8 sections, 11 theorems, 114 equations.

Key Result

Theorem 3.1

Suppose Assumptions A1--A2 hold. Let $v^h\in \mathfrak U_{\mathsf{sm}}^h$ and let $\{X_n^h\}_{n\geq 1}$ be the associated chain with initial condition $x_0^h$. If $x_0^h\to x$, then the continuous-time interpolated process $(X^h(\cdot), v^h(\cdot))$ is tight. Let $(X_{\cdot}, U_{\cdot})$ be a limit of a

Theorems & Definitions (19)

  • Remark 2.1
  • Theorem 3.1
  • proof
  • Theorem 3.2
  • Theorem 4.1
  • Remark 4.1
  • Theorem 4.2
  • Theorem 4.3
  • proof
  • Theorem 4.4
  • ...and 9 more