Clusterization in D-optimal designs: the case against linearization

Yair Daon

TL;DR

The paper analyzes Bayesian D-optimal designs for linear inverse problems in Hilbert spaces and shows that measurement clusterization is a generic outcome when there is no model error and the prior covariance has rapidly decaying eigenvalues. It provides a tractable analytic framework, proves that adding model error mitigates clustering, and characterizes D-optimal designs as focusing uncertainty reduction onto a small set of leading prior-eigenvectors. Through a nonlinear eigenvalue perspective and Carathéodory-type arguments, it explains clustering via the pigeonhole principle and offers convergence guarantees for posterior uncertainty. The work also demonstrates, with a toy 1D heat equation and numerical experiments, how clusterization arises and how correlated errors can alleviate it, highlighting implications for prior choice and model linearization in optimal design practice.

Abstract

Estimation of parameters in physical processes often demands costly measurements, prompting the pursuit of an optimal measurement strategy. Finding such a strategy is termed the problem of optimal experimental design, abbreviated as optimal design. Remarkably, optimal designs can yield tightly clustered measurement locations, leading researchers to fundamentally revise the design problem just to circumvent this issue. Some authors introduce correlation among error terms that are initially independent, while others restrict measurement locations to a finite set. While both approaches may prevent clusterization, they also fundamentally alter the optimal design problem. In this study, we consider Bayesian D-optimal designs, i.e., designs that maximize the expected Kullback-Leibler divergence between posterior and prior. We propose an analytically tractable model for D-optimal designs over Hilbert spaces. In this framework, we make several key contributions: (a) We establish that measurement clusterization is a generic trait of D-optimal designs for linear inverse problems with independent Gaussian measurement errors and a Gaussian prior. (b) We prove that introducing correlations among measurement error terms mitigates clusterization. (c) We characterize D-optimal designs as reducing uncertainty across a subset of prior covariance eigenvectors. (d) We leverage this characterization to argue that measurement clusterization arises as a consequence of the pigeonhole principle: when more measurements are taken than there are locations where the select eigenvectors are large and the others are small, clusterization occurs. Finally, we use our analysis to argue against the use of Gaussian priors with linearized physical models when seeking a D-optimal design.
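
To make the D-optimality criterion in the abstract concrete, here is a minimal finite-dimensional sketch, not the paper's Hilbert-space formulation. It assumes data of the form y = G u + eps, with a Gaussian prior on u and independent Gaussian errors of variance `noise_var`. Under those assumptions the expected Kullback-Leibler divergence between posterior and prior equals (1/2) log det(I + G Γ Gᵀ / σ²). The names `d_optimality`, `G`, `prior_cov` and `noise_var` are illustrative and do not come from the paper.

```python
import numpy as np


def d_optimality(G, prior_cov, noise_var):
    """Expected KL divergence between posterior and prior for the
    finite-dimensional linear Gaussian model y = G u + eps, where
    u ~ N(m, prior_cov) and eps ~ N(0, noise_var * I).

    Each row of G is one measurement (an assumption of this sketch).
    """
    G = np.atleast_2d(np.asarray(G, dtype=float))
    prior_cov = np.atleast_2d(np.asarray(prior_cov, dtype=float))
    m = G.shape[0]
    # Data covariance scaled by the noise variance: I + G Gamma G^T / sigma^2.
    M = np.eye(m) + G @ prior_cov @ G.T / noise_var
    # slogdet is numerically safer than log(det(M)).
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * logdet


# One scalar parameter with unit prior variance and unit noise variance.
single = d_optimality([[1.0]], [[1.0]], 1.0)           # one measurement
repeated = d_optimality([[1.0], [1.0]], [[1.0]], 1.0)  # same measurement twice
```

With independent errors, repeating a measurement still increases the criterion (here from (1/2) log 2 to (1/2) log 3), which is why a clustered design can remain competitive in this setting.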

Paper Structure

This paper contains 24 sections, 15 theorems, 45 equations, 5 figures.

Key Result

Theorem 2.1

Let $\mu_{\textup{pr}} = \mathcal{N}(\mathbf{m}_{\textup{pr}},\Gamma_{\textup{pr}})$ be a Gaussian prior on $\mathcal{H}_p$ and let $\mu_{\textup{post}} = \mathcal{N}(\mathbf{m}_{\textup{post}},\Gamma_{\textup{post}})$ be the posterior measure on $\mathcal{H}_p$ for the Bayesian linear inverse problem

Figures (5)

  • Figure 1: Measurement clusterization in D-optimal designs for the inverse problem of the 1D heat equation. Measurement locations were chosen according to the Bayesian D-optimality criterion of Theorem 2.1. Measurement locations are plotted over the computational domain $\Omega = [0, 1]$ (x-axis), for varying numbers of measurements (y-axis). The colored numbers are measurement indices, plotted for visual clarity. Measurement clusterization already occurs for three measurements: the second measurement (red) is overlaid on the third (green). For five measurements, the first (blue) and second (red) measurements are clustered, as are the fourth (black) and fifth (magenta).
  • Figure 2: A comparison of the eigenvalues of the pushforward posterior precision $(\mathcal{F}\Gamma_{\textup{pr}}\mathcal{F}^*)^{-1} + \sigma^{-2}\mathcal{O}^*\mathcal{O}$ for a D-optimal design (left) and a sub-optimal design (right). Both designs are allowed $m=3$ measurements. We assume $\sigma^2=1$, so the blue area has a total height of $\sigma^{-2}m = 3$ in both panels. The D-optimal design (left) increases precision where it is lowest. The sub-optimal design (right) does not.
  • Figure 3: D-optimal measurement locations ($m=4$ measurements) and weighted eigenvectors for finding the initial condition of the 1D heat equation. Measurement locations and weighted eigenvectors are plotted over the computational domain $\Omega = [0, 1]$ (x-axis). Measurement clusterization occurs approximately at $0.31$ and $0.69$. These two locations are a compromise between the amplitudes of the first and second eigenvectors, which are the eigenvectors that a D-optimal design aims to measure. Allocating $m=4$ measurements into two locations results in clusterization, according to the pigeonhole principle.
  • Figure 4: Fraction of clustered $A$ for $AA^t = M$ and $M$ generated randomly (see text and repository for details on generating $M$). It is evident that when $m-k > 1$ clusterization is prevalent, whereas for lower $m-k$ clusterization is not.
  • Figure 5: Model correlation mitigates clusterization. We add a model correlation term to the error terms in the 1D heat equation inverse problem. As expected, the model error term pushes the measurements apart.
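
The heat-equation experiments described in the captions above can be imitated with a small brute-force search. The sketch below is a toy under stated assumptions, not the paper's code: homogeneous Dirichlet boundary conditions on $[0, 1]$ (so the eigenvectors are $\sqrt{2}\sin(k\pi x)$), a truncation to 20 modes, hypothetical prior eigenvalues $k^{-4}$, and hypothetical values for the diffusivity, final time and noise variance. All function names are illustrative.

```python
import itertools

import numpy as np


def heat_observation_rows(x, n_modes=20, diffusivity=1.0, final_time=0.01):
    """Rows of the map from initial-condition coefficients (in the sine
    basis) to pointwise values of the heat-equation solution at final_time.
    Dirichlet boundary conditions are an assumption of this sketch."""
    k = np.arange(1, n_modes + 1)
    decay = np.exp(-diffusivity * (k * np.pi) ** 2 * final_time)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.sqrt(2.0) * np.sin(np.pi * np.outer(x, k)) * decay


def d_criterion(G, prior_eigs, noise_var=1.0):
    """(1/2) log det(I + G diag(prior_eigs) G^T / noise_var)."""
    M = np.eye(G.shape[0]) + (G * prior_eigs) @ G.T / noise_var
    return 0.5 * np.linalg.slogdet(M)[1]


def brute_force_design(m, grid, prior_eigs, noise_var=1.0):
    """Search all multisets of m grid points, so that repeated (clustered)
    locations are allowed, and keep the one with the largest criterion."""
    rows = heat_observation_rows(grid, n_modes=len(prior_eigs))
    best_val, best_idx = -np.inf, None
    for idx in itertools.combinations_with_replacement(range(len(grid)), m):
        val = d_criterion(rows[list(idx)], prior_eigs, noise_var)
        if val > best_val:
            best_val, best_idx = val, idx
    return grid[list(best_idx)], best_val


grid = np.linspace(0.0, 1.0, 21)
prior_eigs = 1.0 / np.arange(1, 21) ** 4  # hypothetical prior spectrum
design, value = brute_force_design(3, grid, prior_eigs)
print("design:", design, "criterion:", value)
```

Because the search allows a grid point to be chosen more than once, a design with repeated entries would correspond to clusterization on this grid. Whether repeats actually appear depends on the assumed spectrum, noise level and grid, so the sketch should be read as a way to experiment rather than as a reproduction of the figures.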

Theorems & Definitions (22)

  • Theorem 2.1: Alexanderian, Gloor, Ghattas [AlexanderianGloorGhattas14]
  • Definition 2.2
  • Definition 3.1
  • Proposition 3.2
  • Proposition 3.3
  • proof
  • Proposition 3.4
  • Theorem 3.5: Necessary conditions for D-Optimality
  • Proposition 4.1: Increase due to a measurement
  • Corollary 4.2
  • ...and 12 more