Duality and Policy Evaluation in Distributionally Robust Bayesian Diffusion Control

Jose Blanchet, Jiayi Cheng, Yuewei Ling, Hao Liu, Yang Liu

TL;DR

This work addresses diffusion control under parameter misspecification by introducing distributionally robust Bayesian control (DRBC), which confines robustness to a KL-based perturbation of the Bayesian prior. A strong duality result recasts the inner robust prior evaluation as a low-dimensional optimization, enabling a simulation-based policy evaluation and learning framework with structured policy parameterizations. The authors establish an $O_p(n^{-1/2})$ convergence rate for the randomized multi-level Monte Carlo estimator and demonstrate DRBC's effectiveness on synthetic linear-quadratic and Bayesian Merton examples, complemented by real-data S&P 500 experiments that show improved out-of-sample performance and reduced pessimism. The work provides scalable offline policy evaluation tools for robust Bayesian diffusion control and points to future extensions to more general ambiguity sets and high-dimensional settings.

Abstract

We study diffusion control problems under parameter uncertainty. Controllers based on plug-in estimation can be brittle due to potential distribution shifts. Bayesian control with a prior on the parameters offers a formulation with beliefs about such shifts. However, as with any Bayesian model, the prior may be misspecified. To mitigate misspecification and reduce over-pessimism compared to classical robust control approaches (e.g., Hansen and Sargent, 2008), we propose a distributionally robust Bayesian control (DRBC) formulation in which an adversary perturbs the prior within a divergence neighborhood of a baseline prior. We develop a strong duality result that reduces the distributionally robust prior evaluation to a low-dimensional optimization and yields a practical simulation-based policy evaluation and learning procedure with structured policy parameterizations. We validate the efficiency of the algorithm on a synthetic linear-quadratic control example and real-data portfolio selection.
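To illustrate what "reducing the robust prior evaluation to a low-dimensional optimization" can look like, the sketch below uses the standard KL duality for a worst-case expectation, $\sup_{\nu:\,\mathrm{KL}(\nu\|\mu)\le\delta}\mathbb{E}_\nu[c(B)]=\inf_{\lambda>0}\{\lambda\delta+\lambda\log\mathbb{E}_\mu[e^{c(B)/\lambda}]\}$, on a toy discrete prior. This is not the paper's theorem or algorithm: the cost vector, the three-atom prior, the search bounds, and all function names are assumptions made for illustration, and the paper's dual may differ in sign convention and form.

```python
import math

def dual_objective(lam, costs, prior, delta):
    """lam*delta + lam*log E_mu[exp(cost/lam)], via a stable log-sum-exp."""
    m = max(costs)
    s = sum(p * math.exp((c - m) / lam) for c, p in zip(costs, prior))
    return lam * delta + m + lam * math.log(s)

def golden_min(fn, a, b, iters=200):
    """Golden-section search for the minimizer of a unimodal fn on [a, b]."""
    invphi = (math.sqrt(5) - 1) / 2
    c, d = b - invphi * (b - a), a + invphi * (b - a)
    fc, fd = fn(c), fn(d)
    for _ in range(iters):
        if fc < fd:
            b, d, fd = d, c, fc
            c = b - invphi * (b - a)
            fc = fn(c)
        else:
            a, c, fc = c, d, fd
            d = a + invphi * (b - a)
            fd = fn(d)
    return (a + b) / 2

def worst_case_cost(costs, prior, delta, lam_lo=1e-3, lam_hi=1e3):
    """Solve the one-dimensional dual over lam, searching on a log scale.

    Assumes every prior weight is positive and the minimizer lies in
    [lam_lo, lam_hi] (hypothetical bounds chosen for this toy example).
    """
    t = golden_min(
        lambda t: dual_objective(math.exp(t), costs, prior, delta),
        math.log(lam_lo), math.log(lam_hi),
    )
    lam = math.exp(t)
    return dual_objective(lam, costs, prior, delta), lam

def tilted_prior(costs, prior, lam):
    """Worst-case prior: exponential tilt nu_i proportional to mu_i*exp(c_i/lam)."""
    m = max(costs)
    w = [p * math.exp((c - m) / lam) for c, p in zip(costs, prior)]
    z = sum(w)
    return [x / z for x in w]

def kl_divergence(nu, mu):
    """KL(nu || mu) for discrete distributions on a common support."""
    return sum(n * math.log(n / p) for n, p in zip(nu, mu) if n > 0)

if __name__ == "__main__":
    costs, prior, delta = [1.0, 2.0, 4.0], [0.5, 0.3, 0.2], 0.1
    value, lam = worst_case_cost(costs, prior, delta)
    nu = tilted_prior(costs, prior, lam)
    print("worst-case cost:", value)
    print("tilted prior:", nu, "KL:", kl_divergence(nu, prior))
```

In this toy setting the primal problem ranges over all reweightings of the prior, while the dual is a scalar minimization over $\lambda$. The tilted prior recovered from the optimal $\lambda$ attains the dual value and sits on the boundary of the KL ball.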

Paper Structure

This paper contains 54 sections, 10 theorems, 156 equations, 3 figures, 8 tables, 5 algorithms.

Key Result

Theorem 2.1

$\mathcal{U}_{\mathrm{KL}}(\mu,\delta)$ is well-defined. For any $Q\in\mathcal{U}_{\mathrm{KL}}(\mu,\delta)$, there exists $\nu\ll\mu$ such that $\frac{dQ}{dP}=\frac{d\nu}{d\mu}(B)$ $P$-a.s. and $\mathrm{KL}(Q\|P)=\mathrm{KL}(\nu\|\mu)$.
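
The KL identity in the theorem can be checked in two lines once the density has the stated form. The sketch below assumes that $B$ denotes the uncertain parameter and that $B$ has law $\mu$ under $P$; this excerpt does not spell out that setup, so treat it as an assumption rather than the paper's proof.

```latex
\begin{align*}
Q(B\in A)
  &= \mathbb{E}_P\!\left[\frac{d\nu}{d\mu}(B)\,\mathbf{1}\{B\in A\}\right]
   = \int_A \frac{d\nu}{d\mu}(b)\,\mu(db)
   = \nu(A), \\
\mathrm{KL}(Q\|P)
  &= \mathbb{E}_Q\!\left[\log\frac{dQ}{dP}\right]
   = \mathbb{E}_Q\!\left[\log\frac{d\nu}{d\mu}(B)\right]
   = \int \log\frac{d\nu}{d\mu}(b)\,\nu(db)
   = \mathrm{KL}(\nu\|\mu).
\end{align*}
```

The first line shows that $B$ has law $\nu$ under $Q$, which is what allows the expectation under $Q$ in the second line to be written as an integral against $\nu$.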

Figures (3)

  • Figure 1: Utility gap versus uncertainty radius $\delta$.
  • Figure 2: Training losses for different $b$ and $r$ values.
  • Figure 3: Histogram of Sharpe Ratios under different priors.

Theorems & Definitions (28)

  • Theorem 2.1
  • Theorem 3.1
  • Remark 3.6
  • Theorem 3.7
  • Theorem 4.1
  • Definition 1.1
  • Definition 1.2
  • Definition 1.3
  • Definition 1.4
  • proof
  • ...and 18 more