Data-Driven DRO and Economic Decision Theory: An Analytical Synthesis With Bayesian Nonparametric Advancements

Nicola Bariletto, Khai Nguyen, Nhat Ho

TL;DR

This work builds a rigorous bridge between data-driven DRO and Economic Decision Theory under Ambiguity by showing that standard regularization and DRO techniques are data-driven counterparts of ambiguity-averse decision models. It introduces a Bayesian nonparametric framework, grounded in Dirichlet Process priors and extended to Hierarchical Dirichlet Processes, that models distributional uncertainty across heterogeneous data sources and borrows strength across them while preserving tractability. A smooth ambiguity-averse criterion provides a softer alternative to worst-case DRO, and the paper develops Monte Carlo approximations, performance guarantees, and gradient-based optimization methods for it. Experiments on synthetic and real data show improved predictive accuracy and stability, and an outlier-robust variant (DORO) reduces sensitivity to anomalous observations.

Abstract

We develop an analytical synthesis that bridges data-driven Distributionally Robust Optimization (DRO) and Economic Decision Theory under Ambiguity (DTA). By reinterpreting standard regularization and DRO techniques as data-driven counterparts of ambiguity-averse decision models, we provide a unified framework that clarifies their intrinsic connections. Building on this synthesis, we propose a novel DRO approach that leverages a popular DTA model of smooth ambiguity-averse preferences together with tools from Bayesian nonparametric statistics. Our baseline framework employs Dirichlet Process (DP) posteriors, which naturally extend to heterogeneous data sources via Hierarchical Dirichlet Processes (HDPs), and can be further refined to induce outlier robustness through a procedure that selectively filters poorly-fitting observations during training. Theoretical performance guarantees and convergence results, together with extensive simulations and real-data experiments, illustrate the method's favorable performance in terms of prediction accuracy and stability.
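
As a rough illustration of the baseline idea described above, the sketch below approximates a smooth ambiguity-averse criterion by Monte Carlo over draws from a DP posterior, using the squared loss $h(\theta, \xi) = (y - x^\top \theta)^2$. Several pieces are assumptions made for illustration and are not taken from the paper: the function names, the truncated stick-breaking approximation, the standard DP conjugacy result (concentration $\alpha + n$, center mixing the prior center with the empirical distribution), and the convex choice $\phi(t) = \beta e^{t/\beta} - \beta$.

```python
import numpy as np

def sample_dp_posterior(data, alpha, base_sampler, n_atoms, rng):
    """Draw one truncated stick-breaking realization from a DP posterior.

    Assumption: standard DP conjugacy, i.e. concentration alpha + n and a
    center that puts weight alpha / (alpha + n) on the prior center and the
    rest on the empirical distribution of the data.
    """
    n = len(data)
    # Stick-breaking weights with concentration alpha + n, truncated at n_atoms.
    betas = rng.beta(1.0, alpha + n, size=n_atoms)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining
    weights /= weights.sum()  # renormalize after truncation
    # Each atom is drawn from the prior center or resampled from the data.
    from_prior = rng.random(n_atoms) < alpha / (alpha + n)
    atoms = data[rng.integers(0, n, size=n_atoms)]
    n_prior = int(from_prior.sum())
    if n_prior:
        atoms[from_prior] = base_sampler(n_prior, rng)
    return weights, atoms

def posterior_risks(theta, draws):
    """Expected squared loss of theta under each sampled distribution."""
    risks = []
    for weights, atoms in draws:
        y, X = atoms[:, 0], atoms[:, 1:]
        risks.append(np.sum(weights * (y - X @ theta) ** 2))
    return np.array(risks)

def smooth_criterion(theta, draws, beta):
    """Monte Carlo average of phi(risk), with phi(t) = beta*exp(t/beta) - beta."""
    risks = posterior_risks(theta, draws)
    return float(np.mean(beta * np.exp(risks / beta) - beta))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 40, 3
    X = rng.normal(size=(n, d))
    theta_true = np.array([1.0, -2.0, 0.5])
    y = X @ theta_true + 0.1 * rng.normal(size=n)
    data = np.column_stack([y, X])  # each row is xi = (y, x)
    # Hypothetical prior center: standard normal on (y, x).
    base = lambda m, r: r.normal(size=(m, d + 1))
    draws = [sample_dp_posterior(data, 2.0, base, 400, rng) for _ in range(20)]
    print(smooth_criterion(theta_true, draws, beta=5.0))
    print(smooth_criterion(np.zeros(d), draws, beta=5.0))
```

Because the assumed $\phi$ is convex, the criterion penalizes parameter values whose risk varies strongly across posterior draws. In this sketch, letting $\beta$ grow makes $\phi$ nearly linear, which recovers an ambiguity-neutral average risk.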

Paper Structure

This paper contains 71 sections, 12 theorems, 109 equations, 10 figures, 6 tables, and 5 algorithms.

Key Result

Proposition 1

Assume each data point $\xi=(y,x)\in\mathbb R^{d+1}$ consists of a response $y\in\mathbb R$ and a vector of inputs $x\in\mathbb R^d$, with $h(\theta, \xi) = (y-x^\top \theta)^2$ for $\theta\in\mathbb R^d$. Choosing $Q_{\boldsymbol{\xi}^n}$ as a DP with concentration parameter $\alpha + n$ and center

Figures (10)

  • Figure 1: An illustration of smooth ambiguity-averse preferences. Although $f_1$ and $f_2$ yield the same expected utility $U_\star$ across the ambiguous distributions $Q_1$ and $Q_2$, the ambiguity-averse criterion $V_3(f)$ implies a strict preference for the act $f_2$, which delivers more stable utility levels across these distributions.
  • Figure 2: Key components of the HDP-based DRO method. The criterion for group $s$ is shaped by: (i) the empirical risk within group $s$ (right box); (ii) a borrowing strength component from the risk across groups (bottom box); (iii) a regularization component from the prior centering measure (left box); and (iv) a distributionally robust component governed by the curvature of $\phi$. Hyperparameters near the arrows adjust each component's influence.
  • Figure 3: Simulation results for the high-dimensional sparse linear regression experiment. Bars report the mean and standard deviation (across 200 sample simulations) of the test RMSE, $L^2$ distance of the estimated coefficient vector $\hat{\theta}$ from the data-generating one, and the $L^2$ norm of $\hat{\theta}$. Results are shown for the ambiguity-averse, ambiguity-neutral, and OLS procedures. Note: The left (blue) axis refers to mean values, the right (orange) axis to standard deviation values.
  • Figure 4: Comparison of out-of-sample performance and estimation accuracy of different methods in high-dimensional regression experiments. The robust HDP method (in bright red) outperforms the others both in terms of average performance and variability. Note: Distance from truth is measured as the squared $L^2$ distance between the estimated coefficients and the data-generating ones.
  • Figure 5: Loss dynamics of DP-based DRO and DORO routines.
  • ...and 5 more figures
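
The preference pattern described in the Figure 1 caption can be reproduced with a small numerical sketch. It assumes the criterion takes the standard smooth-ambiguity form, a prior-weighted average of $\phi$ applied to the expected utility under each candidate distribution. The utility levels, the equal prior weights, and the concave $\phi(u) = -e^{-\lambda u}$ are hypothetical values chosen for illustration, not taken from the paper.

```python
import math

def smooth_value(expected_utilities, prior_weights, phi):
    """Prior-weighted average of phi(expected utility) across distributions."""
    return sum(w * phi(u) for u, w in zip(expected_utilities, prior_weights))

def phi_concave(u, lam=2.0):
    """Concave, increasing transformation (ambiguity aversion over utilities)."""
    return -math.exp(-lam * u)

prior = [0.5, 0.5]  # hypothetical weights on Q1 and Q2
f1 = [0.2, 0.8]     # expected utility of act f1 under Q1 and Q2
f2 = [0.5, 0.5]     # act f2 delivers the same utility under both

# Both acts have the same average expected utility across Q1 and Q2.
mean_f1 = smooth_value(f1, prior, lambda u: u)
mean_f2 = smooth_value(f2, prior, lambda u: u)

# With a concave phi, the stable act f2 receives the strictly higher value.
v_f1 = smooth_value(f1, prior, phi_concave)
v_f2 = smooth_value(f2, prior, phi_concave)
print(mean_f1, mean_f2, v_f1, v_f2)
```

The strict preference for `f2` follows from Jensen's inequality: a concave $\phi$ penalizes dispersion of expected utility across the candidate distributions even when the average is unchanged.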

Theorems & Definitions (15)

  • Proposition 1: bariletto2024bayesian
  • Remark 2
  • Example 1
  • Lemma 3
  • Theorem 4
  • Theorem 5
  • Theorem 6
  • Proposition 7: camerlenghi2019distribution, Thms. 9 and 10, Ex. 5
  • Remark 8
  • Theorem 9
  • ...and 5 more