Bayesian optimal experimental design with Wasserstein information criteria
Tapio Helin, Youssef Marzouk, Jose Rodrigo Rojo-Garcia
TL;DR
This work introduces a Wasserstein-distance-based utility for Bayesian optimal experimental design, defining $U_p(\theta) = \mathbb{E}^{\pi(y;\theta)}[W_p^p(\mu, \mu^y)]$, the expected Wasserstein-$p$ transport cost between the prior $\mu$ and the posterior $\mu^y$, to guide design choices. In Gaussian-linear inverse problems, $U_2$ admits a closed form, linking design quality to the geometric mean of prior and posterior covariances, and the framework provides a transport-cost interpretation of information gain. The authors establish rigorous stability bounds for the Wasserstein-1 criterion under perturbations of the prior or likelihood, with partial results for Wasserstein-2, and they demonstrate the computability of the criterion and the predicted approximation rates through simulations and two illustrative examples. The approach is well suited to high-dimensional and infinite-dimensional settings, offering design criteria that remain meaningful when supports do not overlap and enabling scalable computations via optimal transport techniques.
Abstract
Bayesian optimal experimental design (OED) provides a principled framework for selecting the most informative observational settings in experiments. With rapid advances in computational power, Bayesian OED has become increasingly feasible for inference problems involving large-scale simulations, attracting growing interest in fields such as inverse problems. In this paper, we introduce a novel design criterion based on the expected Wasserstein-$p$ distance between the prior and posterior distributions. In particular, for $p=2$, this criterion shares key parallels with the widely used expected information gain (EIG), which relies on the Kullback--Leibler divergence instead. First, the Wasserstein-2 criterion admits a closed-form solution for Gaussian regression, a property that can also be leveraged for approximate schemes. Second, it can be interpreted as maximizing the information gain measured by the transport cost incurred when updating the prior to the posterior. Our main contribution is a stability analysis of the Wasserstein-1 criterion, where we provide a rigorous error analysis under perturbations of the prior or likelihood. We also partially extend this analysis to the Wasserstein-2 criterion. In particular, these results yield error rates when empirical approximations of priors are used. Finally, we demonstrate the computability of the Wasserstein-2 criterion and illustrate our approximation rates through simulations.
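As a rough illustration of the closed-form property for $p=2$, the sketch below assumes a linear-Gaussian model $y = Gx + \varepsilon$ with prior $N(m_0, C_0)$ and noise $\varepsilon \sim N(0, \Gamma)$. The symbols $G$, $C_0$, $\Gamma$ and all function names are illustrative assumptions, not the paper's notation. Combining the textbook formula for the Wasserstein-2 distance between Gaussians with the standard Gaussian posterior update gives $\mathbb{E}[W_2^2] = 2\,\mathrm{tr}(C_0) - 2\,\mathrm{tr}\big((C_0^{1/2} C_{\mathrm{post}} C_0^{1/2})^{1/2}\big)$; this expression is derived here from those formulas and may differ in form from the one stated in the paper.

```python
import numpy as np


def psd_sqrt(a):
    """Symmetric square root of a positive semidefinite matrix."""
    # Symmetrize to guard against round-off, then take the root of the spectrum.
    w, v = np.linalg.eigh((a + a.T) / 2.0)
    return (v * np.sqrt(np.clip(w, 0.0, None))) @ v.T


def posterior_cov(G, C0, Gamma):
    """Posterior covariance of x given y = G x + noise (linear-Gaussian model)."""
    S = G @ C0 @ G.T + Gamma           # marginal covariance of y
    K = np.linalg.solve(S, G @ C0).T   # gain C0 G^T S^{-1} (S is symmetric)
    return C0 - K @ G @ C0


def expected_w2_squared(G, C0, Gamma):
    """Expected squared W2 distance between prior and posterior.

    Assumes the linear-Gaussian setting described in the lead-in; the
    expectation over y of the squared mean shift equals tr(C0 - C_post),
    which collapses the utility to the trace expression below.
    """
    C_post = posterior_cov(G, C0, Gamma)
    R = psd_sqrt(C0)
    cross = psd_sqrt(R @ C_post @ R)
    return 2.0 * np.trace(C0) - 2.0 * np.trace(cross)


if __name__ == "__main__":
    # Hypothetical two-parameter problem: each design observes one coordinate.
    C0 = np.diag([1.0, 4.0])
    Gamma = np.array([[0.1]])
    designs = {
        "observe x1": np.array([[1.0, 0.0]]),
        "observe x2": np.array([[0.0, 1.0]]),
    }
    for name, G in designs.items():
        print(name, expected_w2_squared(G, C0, Gamma))
```

In this toy setup, the design that observes the coordinate with the larger prior variance yields the larger utility, which matches the transport-cost reading: the posterior moves further from the prior when the measurement removes more uncertainty.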
