Optimal experimental design: Formulations and computations

Xun Huan, Jayanth Jagalur, Youssef Marzouk

TL;DR

Optimal experimental design (OED) provides principled strategies for data acquisition to improve modeling and prediction. The paper surveys foundational and modern approaches, including classical alphabetic criteria, Bayesian formulations, infinite-dimensional inverse problems, and information-theoretic utilities such as expected information gain, with a focus on computation via nested Monte Carlo, variational bounds, and density-transport methods. It reviews both batch and sequential designs, detailing optimization algorithms (greedy, convex relaxations, stochastic/gradient-based methods) and sequential decision-making frameworks (MDPs, ADP, actor-critic methods) for adaptive experimentation. The outlook highlights challenges ranging from model misspecification to risk-aware criteria and practical deployment, emphasizing robustness, scalability, and principled validation of OED procedures in complex scientific and engineering contexts.
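
For orientation, the expected information gain (EIG) named above is commonly written as the mutual information between the parameters $\Theta$ and the observations $Y$ at a design $\xi$, and the nested Monte Carlo (NMC) estimator replaces both expectations with sample averages. The notation below is a standard textbook form and is an assumption here, not a transcription of the paper's equations; in particular it assumes the prior $p(\theta)$ does not depend on $\xi$, and the sample sizes $N$ (outer) and $M$ (inner) are illustrative symbols.

$$
U(\xi) = \mathbb{E}_{Y, \Theta \mid \xi}\left[ \log \frac{p(Y \mid \Theta, \xi)}{p(Y \mid \xi)} \right],
\qquad
\widehat{U}_{\mathrm{NMC}}(\xi) = \frac{1}{N} \sum_{i=1}^{N} \left[ \log p\big(y^{(i)} \mid \theta^{(i)}, \xi\big) - \log \frac{1}{M} \sum_{j=1}^{M} p\big(y^{(i)} \mid \tilde{\theta}^{(i,j)}, \xi\big) \right]
$$

Here $\theta^{(i)} \sim p(\theta)$, $y^{(i)} \sim p(y \mid \theta^{(i)}, \xi)$, and $\tilde{\theta}^{(i,j)} \sim p(\theta)$ are fresh prior samples used to approximate the evidence $p(y^{(i)} \mid \xi)$. Because the logarithm is applied to a finite inner average, the estimator is biased for finite $M$.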

Abstract

Questions of "how best to acquire data" are essential to modeling and prediction in the natural and social sciences, engineering applications, and beyond. Optimal experimental design (OED) formalizes these questions and creates computational methods to answer them. This article presents a systematic survey of modern OED, from its foundations in classical design theory to current research involving OED for complex models. We begin by reviewing criteria used to formulate an OED problem and thus to encode the goal of performing an experiment. We emphasize the flexibility of the Bayesian and decision-theoretic approach, which encompasses information-based criteria that are well-suited to nonlinear and non-Gaussian statistical models. We then discuss methods for estimating or bounding the values of these design criteria; this endeavor can be quite challenging due to strong nonlinearities, high parameter dimension, large per-sample costs, or settings where the model is implicit. A complementary set of computational issues involves optimization methods used to find a design; we discuss such methods in the discrete (combinatorial) setting of observation selection and in settings where an exact design can be continuously parameterized. Finally, we present emerging methods for sequential OED that build non-myopic design policies, rather than explicit designs; these methods naturally adapt to the outcomes of past experiments in proposing new experiments, while seeking coordination among all experiments to be performed. Throughout, we highlight important open questions and challenges.
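
To make the idea of a design policy (as opposed to a fixed design) concrete, the following is a minimal Python sketch of one sequential rollout: a policy maps the current belief state to a design, an outcome is simulated, and the belief is updated by Bayes' rule before the next experiment. Everything specific here is an assumption chosen for illustration and is not taken from the paper: the scalar model $y = \sin(\xi\theta) + \text{noise}$, the grid-based posterior, the heuristic `spread_policy`, zero immediate rewards, and a terminal reward equal to the KL divergence from the prior to the final posterior.

```python
import numpy as np

def spread_policy(k, grid, weights):
    """Hypothetical heuristic policy: pick a larger design when the belief is narrower.

    This is only a stand-in for a learned or optimized policy; it is not optimal.
    """
    mean = np.sum(weights * grid)
    std = np.sqrt(np.sum(weights * (grid - mean) ** 2))
    return float(np.clip(1.0 / (std + 1e-3), 0.1, 5.0))

def rollout(policy, n_exp=3, sigma=0.3, seed=0):
    """Simulate n_exp sequential experiments under a given policy."""
    rng = np.random.default_rng(seed)

    # Belief state: posterior weights on a fixed grid of parameter values.
    grid = np.linspace(-3.0, 3.0, 601)
    prior = np.exp(-0.5 * grid**2)
    prior /= prior.sum()

    theta_true = rng.choice(grid, p=prior)  # "nature" draws the unknown parameter
    state = prior.copy()
    history = []

    for k in range(n_exp):
        xi = policy(k, grid, state)                          # design from current state
        y = np.sin(xi * theta_true) + sigma * rng.standard_normal()  # simulated outcome
        lik = np.exp(-0.5 * ((y - np.sin(xi * grid)) / sigma) ** 2)
        state = state * lik                                  # Bayes update (state transition)
        state /= state.sum()
        history.append((xi, y))

    # Terminal reward: KL divergence from the prior to the final posterior.
    mask = state > 0
    terminal_reward = float(np.sum(state[mask] * np.log(state[mask] / prior[mask])))
    return history, terminal_reward

if __name__ == "__main__":
    designs_and_outcomes, reward = rollout(spread_policy)
    print(designs_and_outcomes, reward)
```

Because the policy is evaluated on the updated belief at every stage, later designs depend on earlier outcomes. Finding a good policy, rather than using a fixed heuristic like the one above, is the optimization problem that sequential OED methods address.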

Paper Structure

This paper contains 68 sections, 1 theorem, 173 equations, 10 figures, and 1 table.

Key Result

Theorem 3.1

Let $p_{Y, \Theta}$ satisfy a subspace logarithmic Sobolev inequality with constant $\overline{C} < \infty$. Then, for any unitary matrices $V = [V_{1:r} \ V_{r+1:p}] \in \mathbb{R}^{p \times p}$ and $U = [U_{1:s} \ U_{s+1:n}] \in \mathbb{R}^{n \times n}$, we have

Figures (10)

  • Figure 2.1: Optimal sensor placement in a time-dependent advection-diffusion problem. Each figure shows a map of expected information gain (EIG) in a chosen quantity, as a function of the sensor location $\xi \in [0,1]^2$. Measurements are made at time $t_1 > 0$, and advection is towards the top right. (a) EIG in the unknown source location $\Theta$. (b) EIG for different quantities of interest $Z$, where each $Z$ is the predicted concentration at some future time $t_2 > t_1$, at the location marked by the red star. The optimal designs, maximizing EIG in each case, differ significantly.
  • Figure 3.1: Estimated EIG as a function of a scalar design parameter $\xi$ for a linear-Gaussian model, using vanilla NMC (red) or an improved multiple importance sampling scheme (green), compared to the true EIG (black). Shaded areas represent the interval containing 95% of 2000 independent estimates of EIG at each $\xi$; red dashed and solid green lines are the means of these estimates. Figure adapted from Feng_2019.
  • Figure 3.2: Variational upper (orange) and lower (blue) bounds on the EIG in a nonlinear design problem, compared to a biased estimate obtained via NMC (dashed red line). See the discussion in Section \ref{ss:densities}. Figure adapted from fengyi2024forthcoming.
  • Figure 5.1: In the MDP progression of sequential experimental design, we start with an initial state $s_0$ (i.e., initial prior and physical state), evaluate the policy function at the state $\mu_0(s_0)$ to obtain the design $\xi_0$ for experiment 0, conduct the experiment to obtain its outcome $y_0$ and immediate reward $r_0(s_0,\xi_0,y_0)$, update the state to the new state via the transition dynamics $s_1=\mathcal{F}_0(s_0,\xi_0,y_0)$ (i.e., updated posterior and physical state), and repeat for the next experiments. Once the last experiment $N-1$ is completed, the terminal state $s_N=\mathcal{F}_{N-1}(s_{N-1},\xi_{N-1},y_{N-1})$ can be computed along with the corresponding terminal reward $r_N(s_N)$. Figure adapted from Shen_2023.
  • Figure 5.2: A policy is a mapping from state to design. In this DNN representation of the policy, its input entails the current experiment stage, and designs of past experiments and their resulting observations. The designs and observations of future experiments that have not yet taken place are padded with zeros.
  • ...and 5 more figures
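
The comparison described in the Figure 3.1 caption (an NMC estimate of EIG against the true EIG for a linear-Gaussian model with scalar design $\xi$) can be reproduced in spirit with a short Python sketch. The specific model is an assumption made here for illustration and is not necessarily the one used in the figure: $y = \xi\theta + \varepsilon$ with $\theta \sim \mathcal{N}(0, 1)$ and $\varepsilon \sim \mathcal{N}(0, \sigma^2)$, for which the EIG has the closed form $\tfrac{1}{2}\log(1 + \xi^2/\sigma^2)$. Only the vanilla NMC estimator is shown; the multiple importance sampling scheme mentioned in the caption is not implemented.

```python
import numpy as np

def log_gauss(y, mean, sigma):
    """Log density of N(mean, sigma^2) evaluated at y (broadcasts over arrays)."""
    return -0.5 * np.log(2.0 * np.pi * sigma**2) - 0.5 * ((y - mean) / sigma) ** 2

def eig_nmc(xi, n_outer=1000, n_inner=1000, sigma=0.5, seed=0):
    """Vanilla nested Monte Carlo estimate of EIG for y = xi * theta + noise."""
    rng = np.random.default_rng(seed)

    # Outer loop: joint samples (theta_i, y_i) from prior and likelihood.
    theta = rng.standard_normal(n_outer)
    y = xi * theta + sigma * rng.standard_normal(n_outer)

    # Inner loop: fresh prior samples to estimate the evidence p(y_i | xi).
    theta_inner = rng.standard_normal((n_outer, n_inner))

    log_lik = log_gauss(y, xi * theta, sigma)
    inner = log_gauss(y[:, None], xi * theta_inner, sigma)
    # Numerically stable log of the inner sample mean (log-sum-exp along axis 1).
    log_evidence = np.logaddexp.reduce(inner, axis=1) - np.log(n_inner)

    return float(np.mean(log_lik - log_evidence))

def eig_exact(xi, sigma=0.5):
    """Closed-form EIG for the assumed linear-Gaussian model."""
    return 0.5 * np.log1p((xi / sigma) ** 2)

if __name__ == "__main__":
    for xi in (0.0, 0.5, 1.0):
        print(xi, eig_nmc(xi), eig_exact(xi))
```

Repeating `eig_nmc` over many seeds at each $\xi$ would give the kind of spread of estimates that the shaded bands in the figure summarize; the inner sample size `n_inner` controls the bias, while `n_outer` mainly controls the variance.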

Theorems & Definitions (1)

  • Theorem 3.1