Non-parametric Bayesian inference via loss functions under model misspecification
Yu Luo, David A. Stephens, Daniel J. Graham, Emma J. McCoy
TL;DR
This work develops a non-parametric Bayesian framework for inference when the data-generating process may be misspecified, by formulating inference as loss minimization and constructing Gibbs/posterior distributions via Dirichlet-process–based Bayesian non-parametric updating. It presents two computationally efficient, optimization-based updating schemes (prior-to-posterior and predictive-to-posterior) that converge to the same posterior under exchangeability and are shown to be consistent and asymptotically normal when misspecification is present. A calibration procedure for the learning rate ensures valid uncertainty quantification, and the authors apply the methods to causal inference via propensity-score regression, demonstrating doubly robust properties and robust performance in simulations and an empirical UK speed camera study. The results offer scalable, robust alternatives to traditional likelihood-based Bayesian updating in settings where model misspecification is a concern, with broad applicability to causal inference and predictive decision-making.
Abstract
In the usual Bayesian setting, a full probabilistic model is required to link the data and parameters, and the form of this model and the inference and prediction mechanisms are specified via de Finetti's representation. In general, such a formulation is not robust to misspecification of its component parts. An alternative approach is to draw inference based on loss functions, where the quantity of interest is defined as a minimizer of some expected loss, and to construct posterior distributions from this loss-based formulation; this strategy underpins the construction of the Gibbs posterior. We develop a Bayesian non-parametric approach; specifically, we generalize the Bayesian bootstrap and specify a Dirichlet process model for the distribution of the observables. We implement this using direct prior-to-posterior calculations, and also using predictive sampling. We further study the assessment of posterior validity for non-standard Bayesian calculations. We show that the developed non-standard Bayesian updating procedures yield valid posterior distributions in terms of consistency and asymptotic normality under model misspecification. Simulation studies show that the proposed methods can recover the true value of the parameter under misspecification.
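
To make the loss-based Bayesian bootstrap idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes an absolute-error loss (so the target parameter is the median), uniform Dirichlet weights on the observed data (the limiting case of a Dirichlet process posterior as the prior concentration goes to zero), and no learning-rate calibration. The function names `weighted_median` and `loss_bayesian_bootstrap` are hypothetical and chosen only for illustration. Each posterior draw is obtained by sampling random weights and minimizing the weighted empirical loss.

```python
import numpy as np


def weighted_median(x, w):
    """Return a minimizer of sum_i w_i * |x_i - theta| over theta."""
    order = np.argsort(x)
    x_sorted, w_sorted = x[order], w[order]
    cum = np.cumsum(w_sorted)
    # First sorted index at which the cumulative weight reaches half the total.
    idx = np.searchsorted(cum, 0.5 * cum[-1])
    return x_sorted[idx]


def loss_bayesian_bootstrap(x, n_draws=2000, seed=0):
    """Draw from a loss-based posterior for the median (illustrative sketch).

    Assumption: weights are Dirichlet(1, ..., 1), i.e. the Bayesian bootstrap,
    rather than a Dirichlet process with an informative base measure.
    """
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = np.empty(n_draws)
    for b in range(n_draws):
        w = rng.dirichlet(np.ones(n))      # random distribution over the data
        draws[b] = weighted_median(x, w)   # minimize the weighted loss
    return draws


# Simulated heavy-tailed data with true median 2; no likelihood is specified.
data_rng = np.random.default_rng(1)
x = data_rng.standard_t(df=3, size=200) + 2.0
draws = loss_bayesian_bootstrap(x)
lower, upper = np.quantile(draws, [0.025, 0.975])
```

Here `draws` holds samples of the loss minimizer under random reweightings of the data, and `lower`, `upper` give an approximate 95% interval for the median. Swapping in a different loss only changes the inner minimization step; for losses without a closed-form minimizer, a numerical optimizer would take the place of `weighted_median`.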
