Approximation of differential entropy in Bayesian optimal experimental design
Chuntao Chen, Tapio Helin, Nuutti Hyvönen, Yuya Suzuki
TL;DR
The authors tackle Bayesian optimal experimental design by focusing on the entropy of the evidence distribution, J(ξ) = Ent(π(·;ξ)), which is the only term of the expected information gain left to estimate when the likelihood's entropy is design-independent or can be evaluated explicitly. They propose a scalable two-step approach: first build a fast Gaussian-mixture surrogate π_M^K(y) from M prior samples and a surrogate forward map G_K, then estimate Ent(π_M^K) with standard Monte Carlo (MC) or quasi-Monte Carlo (QMC) methods. Theoretical results establish convergence rates for the root-mean-square error (RMSE) in terms of δ_K (the forward-model error) and the sample sizes M and N, with accelerated rates under QMC in the uniform-prior setting and extensions to Gaussian priors. Numerical experiments on deconvolution and on an elliptic PDE with a random diffusion coefficient confirm the predicted rates and show substantial reductions in the number of forward-model evaluations, illustrating that the approach scales to large inverse problems.
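For reference, the quantities named above can be written out as follows. This is only a sketch under the assumption of additive Gaussian noise with covariance Γ and equal mixture weights. The symbols Γ, μ_0 (the prior), x_i (prior samples or QMC points), and y_j are notation introduced here for illustration, and N is taken to be the number of samples used in the entropy estimate.

```latex
\begin{align*}
  \pi(y;\xi) &= \int \mathcal{N}\bigl(y;\, G(x;\xi),\, \Gamma\bigr)\, \mu_0(\mathrm{d}x), \\
  \pi_M^K(y;\xi) &= \frac{1}{M}\sum_{i=1}^{M}
      \mathcal{N}\bigl(y;\, G_K(x_i;\xi),\, \Gamma\bigr), \\
  \operatorname{Ent}\bigl(\pi_M^K\bigr)
    &= -\int \pi_M^K(y)\,\log \pi_M^K(y)\,\mathrm{d}y
    \;\approx\; -\frac{1}{N}\sum_{j=1}^{N} \log \pi_M^K(y_j),
    \qquad y_j \sim \pi_M^K .
\end{align*}
```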
Abstract
Bayesian optimal experimental design provides a principled framework for selecting experimental settings that maximize the information obtained. In this work, we focus on estimating the expected information gain in the setting where the differential entropy of the likelihood is either independent of the design or can be evaluated explicitly. This reduces the problem to estimating and maximizing a differential entropy, which alleviates several challenges inherent in computing the expected information gain. Our study is motivated by large-scale inference problems, such as inverse problems, where the computational cost is dominated by expensive likelihood evaluations. We propose a computational approach in which the evidence density is approximated by a Monte Carlo or quasi-Monte Carlo surrogate, while the differential entropy is evaluated using standard methods without additional likelihood evaluations. We prove that this strategy achieves convergence rates that are comparable to, or better than, those of state-of-the-art methods for full expected information gain estimation, particularly when the cost of entropy evaluation is negligible. Moreover, our approach relies only on mild smoothness of the forward map and avoids the stronger technical assumptions required in earlier work. We also present numerical experiments that confirm our theoretical findings.
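The two-step idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a scalar observation with additive Gaussian noise of standard deviation `sigma`, a uniform prior on [0, 1], plain Monte Carlo for both steps (QMC is not shown), and a hypothetical cheap map `toy_forward` standing in for the surrogate G_K. All function and parameter names are made up for illustration. The forward map is called only M times when the mixture is built, and the entropy step reuses the stored centers.

```python
import math
import random


def log_mixture_density(y, centers, sigma):
    """Log of the equal-weight Gaussian mixture (1/M) * sum_i N(y; c_i, sigma^2)."""
    log_terms = [-0.5 * ((y - c) / sigma) ** 2 for c in centers]
    top = max(log_terms)  # log-sum-exp shift for numerical stability
    log_sum = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return log_sum - math.log(len(centers)) - 0.5 * math.log(2.0 * math.pi * sigma**2)


def surrogate_entropy(forward, design, sigma, M, N, rng):
    """Two-step estimate of the entropy of the mixture surrogate (scalar data)."""
    # Step 1: M prior samples (uniform on [0, 1]) give the mixture centers.
    # These are the only calls to the forward map.
    centers = [forward(rng.random(), design) for _ in range(M)]
    # Step 2: plain Monte Carlo estimate of -E[log p(Y)] with Y drawn from
    # the mixture itself (pick a center uniformly, then add Gaussian noise).
    total = 0.0
    for _ in range(N):
        y = rng.choice(centers) + sigma * rng.gauss(0.0, 1.0)
        total -= log_mixture_density(y, centers, sigma)
    return total / N


def toy_forward(x, design):
    """Hypothetical stand-in for the surrogate forward map G_K."""
    return math.sin(design * x)


rng = random.Random(0)
for design in (0.0, 1.0, 3.0):
    estimate = surrogate_entropy(toy_forward, design, sigma=0.1, M=100, N=1000, rng=rng)
    print(f"design={design:.1f}  entropy estimate={estimate:.3f}")
```

A simple sanity check is the design value 0.0: the toy forward map is then constant, the mixture collapses to a single Gaussian, and the estimate should be close to the closed-form entropy 0.5 * log(2πe·sigma²).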
