Goal-Oriented Bayesian Optimal Experimental Design for Nonlinear Models using Markov Chain Monte Carlo

Shijie Zhong, Wanggang Shen, Tommie Catanach, Xun Huan

TL;DR

This paper develops a predictive goal-oriented Bayesian optimal experimental design (GO-OED) framework for nonlinear observation and prediction models, in which the design maximizes the expected information gain (EIG) on predictive quantities of interest (QoIs). It introduces a nested Monte Carlo estimator that uses MCMC-based posterior sampling and kernel density estimation to compute the posterior-predictive density of the QoIs, enabling evaluation of a KL-divergence-based utility. GO-OED designs are obtained via Bayesian optimization with a Gaussian process surrogate and a 99.5% upper confidence bound acquisition function. Through 1D and 2D synthetic tests and a convection-diffusion sensor-placement application, the work demonstrates that GO-OED can yield designs that differ substantially from parameter-focused OED, and it discusses computational trade-offs and potential extensions.
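
To make the optimization step concrete, the following is a minimal sketch of Bayesian optimization with a Gaussian process surrogate and an upper confidence bound (UCB) acquisition over a 1D design space. It is an illustration under stated assumptions, not the paper's implementation: the RBF kernel and its length scale, the grid search over the acquisition, the toy utility, and all function names (`rbf`, `gp_posterior`, `bayes_opt`) are hypothetical. In particular, reading "99.5%" as the multiplier `norm.ppf(0.995)` on the surrogate's standard deviation is one plausible interpretation, not something the summary confirms.

```python
import numpy as np
from scipy.stats import norm

def rbf(a, b, length=0.2):
    """Unit-variance squared-exponential kernel (assumed, not from the paper)."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / length) ** 2)

def gp_posterior(x_train, y_train, x_query, jitter=1e-4):
    """Zero-mean GP regression: posterior mean and std at x_query."""
    K = rbf(x_train, x_train) + jitter * np.eye(len(x_train))
    Ks = rbf(x_train, x_query)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    mean = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 0.0, None)
    return mean, np.sqrt(var)

def bayes_opt(utility, n_init=4, n_iter=15, seed=0):
    """Maximize utility(d) for d in [0, 1] using a GP surrogate and UCB."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(0.0, 1.0, 201)      # candidate designs
    kappa = norm.ppf(0.995)                # assumed meaning of "99.5% UCB"
    x = rng.uniform(0.0, 1.0, n_init)      # initial designs
    y = np.array([utility(xi) for xi in x])
    for _ in range(n_iter):
        offset = y.mean()                  # center data for the zero-mean GP
        mean, std = gp_posterior(x, y - offset, grid)
        ucb = mean + offset + kappa * std  # acquisition function
        x_next = grid[np.argmax(ucb)]
        x = np.append(x, x_next)
        y = np.append(y, utility(x_next))
    best = np.argmax(y)
    return x[best], y[best]

# Toy stand-in for an EIG estimate; a real run would call the EIG estimator here.
d_best, u_best = bayes_opt(lambda d: -(d - 0.7) ** 2)
print(d_best, u_best)
```

In practice each utility evaluation is a noisy and expensive EIG estimate, which is why a surrogate-based optimizer is attractive: it limits the number of designs at which the estimator has to be run.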

Abstract

Optimal experimental design (OED) provides a systematic approach to quantify and maximize the value of experimental data. Under a Bayesian approach, conventional OED maximizes the expected information gain (EIG) on model parameters. However, we are often interested in not the parameters themselves, but predictive quantities of interest (QoIs) that depend on the parameters in a nonlinear manner. We present a computational framework of predictive goal-oriented OED (GO-OED) suitable for nonlinear observation and prediction models, which seeks the experimental design providing the greatest EIG on the QoIs. In particular, we propose a nested Monte Carlo estimator for the QoI EIG, featuring Markov chain Monte Carlo for posterior sampling and kernel density estimation for evaluating the posterior-predictive density and its Kullback-Leibler divergence from the prior-predictive. The GO-OED design is then found by maximizing the EIG over the design space using Bayesian optimization. We demonstrate the effectiveness of the overall nonlinear GO-OED method, and illustrate its differences versus conventional non-GO-OED, through various test problems and an application of sensor placement for source inversion in a convection-diffusion field.
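
The nested Monte Carlo estimator described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the authors' code: the observation model `forward`, the QoI map `qoi`, the noise level `SIGMA`, the uniform prior on [0, 1], the random-walk Metropolis sampler, and the use of SciPy's default KDE bandwidth are all hypothetical choices. The outer loop draws a parameter and a simulated observation; the inner loop samples the posterior by MCMC, pushes the samples through the QoI map, and estimates the KL divergence between the KDE of the posterior-predictive and the KDE of the prior-predictive.

```python
import numpy as np
from scipy.stats import gaussian_kde

SIGMA = 0.1  # assumed observation-noise standard deviation

def forward(theta, d):
    """Hypothetical nonlinear observation model G(theta, d)."""
    return theta ** 3 * d ** 2 + theta * np.exp(-abs(0.2 - d))

def qoi(theta):
    """Hypothetical nonlinear QoI map z = H(theta)."""
    return np.exp(theta)

def metropolis(y, d, n_samples, rng, step=0.1, burn=200):
    """Random-walk Metropolis targeting p(theta | y, d), uniform prior on [0, 1]."""
    def log_post(t):
        if not 0.0 <= t <= 1.0:
            return -np.inf
        return -0.5 * ((y - forward(t, d)) / SIGMA) ** 2
    theta, lp = 0.5, log_post(0.5)
    out = np.empty(n_samples)
    for k in range(burn + n_samples):
        prop = theta + step * rng.standard_normal()
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept/reject step
            theta, lp = prop, lp_prop
        if k >= burn:
            out[k - burn] = theta
    return out

def qoi_eig(d, n_out=50, n_in=300, seed=0):
    """Nested Monte Carlo estimate of the expected information gain on z."""
    rng = np.random.default_rng(seed)
    # Prior-predictive density of the QoI, estimated once by KDE.
    prior_kde = gaussian_kde(qoi(rng.uniform(0.0, 1.0, 2000)))
    total = 0.0
    for _ in range(n_out):
        theta = rng.uniform()                               # theta ~ prior
        y = forward(theta, d) + SIGMA * rng.standard_normal()
        z_post = qoi(metropolis(y, d, n_in, rng))           # posterior-predictive samples
        post_kde = gaussian_kde(z_post)
        # Sample average of log p(z | y, d) - log p(z) approximates the KL divergence.
        total += np.mean(post_kde.logpdf(z_post) - prior_kde.logpdf(z_post))
    return total / n_out

print(qoi_eig(1.0, n_out=20, n_in=200))
```

Note that this sketch evaluates each posterior KDE at the same samples used to build it, which biases the KL estimate upward, and it relies on a default bandwidth rule. The figure captions below indicate that the estimate is sensitive to the KDE bandwidth, so a careful implementation would treat bandwidth selection explicitly.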

Paper Structure (24 sections, 37 equations, 19 figures, 1 algorithm)

This paper contains 24 sections, 37 equations, 19 figures, 1 algorithm.

Figures (19)

  • Figure 1: Overview of the relationships among different variables in the GO-OED framework.
  • Figure 2: Case BM: expected utility (left) and optimized KDE bandwidth in ADBW across $d$ (right). GRID uses the gridding method for discretizing $\boldsymbol{\Theta}$ and is treated as the reference solution. ADBW and BW are the GO-OED estimators proposed in this paper, where ADBW uses adaptive bandwidth and BW uses fixed bandwidth. All methods agree on the general trends although bias and variance are noticeable for the three GO-OED methods.
  • Figure 3: Case T1: expected utility (left) and optimized KDE bandwidth in ADBW across $d$ (right). A bandwidth that is too low leads to an overestimated EIG, whereas one that is too high leads to an underestimated EIG.
  • Figure 4: Cases BM, T1, T2, T3: expected utility comparisons. The benchmark case (BM, GRID) has $z=\theta$, so its EIG equals the parameter EIG. Case T1 has a nonlinear but bijective mapping from the parameter to the QoI and so has the same EIG as BM. Cases T2 and T3 have non-bijective QoIs, and their EIGs are lower than the parameter EIG, per \ref{e:z_bijective}. Each case using the proposed NMC estimator has three plot lines, corresponding to EIG estimates under different sample sizes: $(n_{\text{out}},n_{\text{in}}) = (1000,1000), (2000,2000), (3000,3000)$.
  • Figure 5: Case T3: example posterior distributions. The left plot conditions on $y$ simulated at $\theta=0.1$, where $d=0.2$ yields the narrower posterior; the right plot conditions on $y$ simulated at $\theta=0.9$, where $d=1.0$ yields the narrower posterior. The variability in information gain can thus alter the ranking of the two designs. When this is repeated for many samples of $\theta$ and $y$ and the expectation is taken, the expected information gain on $\theta$ is higher at $d=1.0$ than at $d=0.2$.
  • ...and 14 more figures