Data-Driven DRO and Economic Decision Theory: An Analytical Synthesis With Bayesian Nonparametric Advancements
Nicola Bariletto, Khai Nguyen, Nhat Ho
TL;DR
This work builds a rigorous bridge between data-driven Distributionally Robust Optimization (DRO) and Economic Decision Theory under Ambiguity (DTA) by showing that regularization and DRO arise as data-driven counterparts of ambiguity-averse decision models. It introduces a Bayesian nonparametric framework, grounded in Dirichlet Process (DP) posteriors and extended to Hierarchical Dirichlet Processes (HDPs), that models distributional uncertainty across heterogeneous data sources and borrows strength among them while preserving tractability. A smooth ambiguity-averse criterion provides a softer alternative to worst-case DRO, and the paper develops Monte Carlo approximations, performance guarantees, and gradient-based optimization methods for it. Experiments on synthetic and real data show favorable predictive accuracy and stability, while an outlier-robust variant (DORO) mitigates sensitivity to anomalous observations. The result is a principled, scalable approach to robust learning that combines decision-theoretic insight with flexible Bayesian nonparametric modeling.
Abstract
We develop an analytical synthesis that bridges data-driven Distributionally Robust Optimization (DRO) and Economic Decision Theory under Ambiguity (DTA). By reinterpreting standard regularization and DRO techniques as data-driven counterparts of ambiguity-averse decision models, we provide a unified framework that clarifies their intrinsic connections. Building on this synthesis, we propose a novel DRO approach that leverages a popular DTA model of smooth ambiguity-averse preferences together with tools from Bayesian nonparametric statistics. Our baseline framework employs Dirichlet Process (DP) posteriors, which naturally extend to heterogeneous data sources via Hierarchical Dirichlet Processes (HDPs), and can be further refined to induce outlier robustness through a procedure that selectively filters poorly-fitting observations during training. Theoretical performance guarantees and convergence results, together with extensive simulations and real-data experiments, illustrate the method's favorable performance in terms of prediction accuracy and stability.
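To make the combination of DP posteriors and smooth ambiguity aversion concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It relies on two standard facts: the posterior of a DP(alpha, P0) prior given n observations is a DP with concentration alpha + n whose base measure mixes P0 with the empirical distribution, and a smooth ambiguity-averse criterion averages a convex increasing transform phi of the expected loss over draws of the unknown distribution. Everything else is an illustrative assumption: the function names, the truncated stick-breaking approximation, the squared loss, the Gaussian base measure, the choice phi(t) = beta * (exp(t / beta) - 1), and the grid search standing in for gradient-based optimization.

```python
import math
import random


def sample_dp_posterior(data, alpha, base_sampler, n_atoms, rng):
    """Draw one truncated stick-breaking approximation of Q ~ DP posterior.

    Hypothetical helper: returns (atoms, weights) for a discrete distribution.
    """
    n = len(data)
    # Stick-breaking weights with posterior concentration alpha + n.
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        b = rng.betavariate(1.0, alpha + n)
        weights.append(remaining * b)
        remaining *= 1.0 - b
    total = sum(weights)
    weights = [w / total for w in weights]  # renormalize after truncation
    # Atoms come from the posterior base measure: the prior base P0 with
    # probability alpha / (alpha + n), otherwise a uniformly resampled data point.
    atoms = [
        base_sampler(rng) if rng.random() < alpha / (alpha + n) else rng.choice(data)
        for _ in range(n_atoms)
    ]
    return atoms, weights


def smooth_criterion(theta, draws, beta):
    """Monte Carlo average of phi(expected squared loss under each draw)."""
    values = []
    for atoms, weights in draws:
        risk = sum(w * (theta - a) ** 2 for a, w in zip(atoms, weights))
        # Assumed phi(t) = beta * (exp(t / beta) - 1): convex and increasing,
        # close to the identity for large beta, harsher on high risks for small beta.
        values.append(beta * math.expm1(risk / beta))
    return sum(values) / len(values)


rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(50)]  # synthetic observations
draws = [
    sample_dp_posterior(data, 1.0, lambda r: r.gauss(0.0, 3.0), 400, rng)
    for _ in range(50)
]
# Grid search over a location parameter, used here only for simplicity.
grid = [i / 10 for i in range(41)]
theta_hat = min(grid, key=lambda t: smooth_criterion(t, draws, beta=1.0))
print(theta_hat)
```

In this sketch, beta controls the degree of ambiguity aversion: large values make the criterion behave like an average of expected losses over posterior draws, while small values put more weight on the draws with the highest expected loss, moving toward a worst-case objective.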
