A Distributions-based Approach for Data-Consistent Inversion

Kirana Bergstrom, Troy Butler, Tim Wildey

TL;DR

This paper addresses stochastic inverse problems by seeking a pullback distribution on model parameters whose QoI distribution matches an observed distribution. It introduces a distributions-based data-consistent inversion (DCI) framework that replaces density-based updates with an optimal-weighted empirical distribution function (EDF) approach and a novel binning scheme to distribute input-space weights along QoI pre-images, ensuring contour-consistent updates. The authors prove convergence results: multivariate CDF convergence implies weak convergence of measures, and the EDF-based push-forward converges to the data-observed distribution while the parameter-space pullback converges to the data-consistent update under a predictability condition. Numerical experiments on heat conduction and porous-media flow demonstrate the method’s robustness in low-data and non-density scenarios, with the binning approach achieving accurate push-forwards where density-based methods struggle. The work provides practical algorithms, theoretical guarantees, and open-source code to facilitate data-consistent inversion in complex stochastic systems.
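The idea of spreading weight along QoI pre-images can be illustrated with a small sketch. This is one plausible reading for a scalar QoI, not the paper's Definition 3.1, which is not reproduced in this summary. The function name `binned_contour_weights`, the equal-width bins, the bin count, and the toy map $Q(\lambda) = \lambda_1$ are all assumptions made for illustration. Each bin of QoI values stands in for a contour set: it receives the observed probability of that bin, shared equally among the predicted samples that land in it.

```python
import numpy as np

def binned_contour_weights(pred_q, obs_q, n_bins=10):
    """Illustrative binning re-weighting for a scalar QoI (assumed scheme).

    pred_q: QoI values Q(lambda_i) of the initial parameter samples.
    obs_q:  observed QoI samples.
    Returns normalized weights on the parameter samples and their bin indices.
    """
    pred_q = np.asarray(pred_q, dtype=float)
    obs_q = np.asarray(obs_q, dtype=float)
    # Equal-width bins spanning both sample sets, so every value has a bin.
    lo = min(pred_q.min(), obs_q.min())
    hi = max(pred_q.max(), obs_q.max())
    edges = np.linspace(lo, hi, n_bins + 1)
    # Bin index in 0..n_bins-1; the right end point is folded into the last bin.
    pred_bin = np.clip(np.digitize(pred_q, edges) - 1, 0, n_bins - 1)
    obs_bin = np.clip(np.digitize(obs_q, edges) - 1, 0, n_bins - 1)
    pred_count = np.bincount(pred_bin, minlength=n_bins)
    obs_prob = np.bincount(obs_bin, minlength=n_bins) / obs_q.size
    # Observed mass of a bin is split equally among predicted samples in it.
    weights = obs_prob[pred_bin] / pred_count[pred_bin]
    # Observed mass in bins with no predicted samples is dropped, so renormalize.
    return weights / weights.sum(), pred_bin

rng = np.random.default_rng(0)
lam = rng.uniform(0.0, 1.0, size=(500, 2))   # initial parameter samples
pred_q = lam[:, 0]                           # hypothetical QoI map Q(lambda) = lambda_1
obs_q = rng.beta(2.0, 5.0, size=300)         # synthetic observed QoI data
weights, pred_bin = binned_contour_weights(pred_q, obs_q)
```

In this sketch the weights depend on a parameter sample only through the bin its QoI value falls in, so the observed data never decide how mass is arranged within a bin. The renormalization step silently discards observed mass that the predicted samples cannot reach, and it fails outright if the two sample sets share no occupied bin, which is the kind of situation a predictability condition rules out.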

Abstract

We formulate a novel approach to solve a class of stochastic problems, referred to as data-consistent inverse (DCI) problems, which involve the characterization of a probability measure on the parameters of a computational model whose subsequent push-forward matches an observed probability measure on specified quantities of interest (QoI) typically associated with the outputs from the computational model. Whereas prior DCI solution methodologies focused on either constructing non-parametric estimates of the densities or the probabilities of events associated with the pre-image of the QoI map, we develop and analyze a constrained quadratic optimization approach based on estimating push-forward measures using weighted empirical distribution functions. The method proposed here is more suitable for low-data regimes or high-dimensional problems than the density-based method, as well as for problems where the probability measure does not admit a density. Numerical examples are included to demonstrate the performance of the method and to compare with the density-based approach where applicable.
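A minimal sketch of the kind of constrained quadratic problem the abstract describes, for a scalar QoI. It is an illustration under stated assumptions, not the authors' algorithm: the function names `edf_matrix` and `fit_edf_weights`, the choice of observed quantiles as evaluation points, and the use of SciPy's SLSQP solver are all assumptions. The weights on the predicted samples are constrained to be non-negative and to sum to one, and they are chosen to minimize the squared misfit between the weighted EDF of the predicted QoI values and the EDF of the observed data.

```python
import numpy as np
from scipy.optimize import minimize

def edf_matrix(samples, points):
    """Indicator matrix A with A[k, i] = 1 if samples[i] <= points[k]."""
    return (samples[None, :] <= points[:, None]).astype(float)

def fit_edf_weights(pred_q, obs_q, n_eval=50):
    """Weights w >= 0 with sum(w) = 1 minimizing ||A w - b||^2 (assumed setup).

    A @ w is the weighted EDF of the predicted QoI samples and b is the
    observed EDF, both evaluated at quantiles of the observed data.
    """
    n = pred_q.size
    points = np.quantile(obs_q, np.linspace(0.0, 1.0, n_eval))
    A = edf_matrix(pred_q, points)
    b = edf_matrix(obs_q, points).mean(axis=1)

    def objective(w):
        r = A @ w - b
        return r @ r

    def gradient(w):
        return 2.0 * A.T @ (A @ w - b)

    w0 = np.full(n, 1.0 / n)  # start from the unweighted EDF
    result = minimize(
        objective, w0, jac=gradient, method="SLSQP",
        bounds=[(0.0, None)] * n,
        constraints=[{"type": "eq",
                      "fun": lambda w: w.sum() - 1.0,
                      "jac": lambda w: np.ones_like(w)}],
    )
    # Guard against tiny solver round-off before returning.
    w = np.clip(result.x, 0.0, None)
    return w / w.sum(), objective

rng_edf = np.random.default_rng(1)
pred_samples = rng_edf.normal(0.0, 1.0, size=100)   # synthetic predicted QoI values
obs_samples = rng_edf.normal(0.5, 0.5, size=400)    # synthetic observed QoI values
w_opt, misfit = fit_edf_weights(pred_samples, obs_samples)
w_uniform = np.full(pred_samples.size, 1.0 / pred_samples.size)
```

Comparing `misfit(w_opt)` with `misfit(w_uniform)` shows how much closer the re-weighted push-forward EDF is to the observed EDF than the unweighted one. No density is estimated at any point, which is the reason the abstract gives for preferring this formulation in low-data regimes and when the measure admits no density.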

Paper Structure (26 sections, 8 theorems, 29 equations, 19 figures, 3 algorithms)

Key Result

Theorem 2.1

[Disintegration Theorem] Assume $Q : \Lambda \rightarrow \mathcal{D}$ is $\mathcal{B}_\Lambda$-measurable, $P_\Lambda$ is a probability measure on $(\Lambda, \mathcal{B}_\Lambda)$, and $P_\mathcal{D}$ is the push-forward measure of $P_\Lambda$ on $(\mathcal{D}, \mathcal{B}_\mathcal{D})$ [...] so $P_{\bm{q}}(\Lambda \setminus Q^{-1}(\bm{q})) = 0$, and there exists the following disintegration [...]
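
The statement above is cut off in this listing before its displayed identity. For orientation only, the textbook form of a disintegration, written with the same symbols and assuming a family of conditional probability measures $\{P_{\bm{q}}\}_{\bm{q} \in \mathcal{D}}$ on $(\Lambda, \mathcal{B}_\Lambda)$, reads as follows. The paper's own display may differ in notation or in the class of sets it covers.

$$P_\Lambda(A) = \int_{\mathcal{D}} P_{\bm{q}}(A) \, dP_\mathcal{D}(\bm{q}) \quad \text{for all } A \in \mathcal{B}_\Lambda.$$

Since each $P_{\bm{q}}$ assigns zero probability outside the contour $Q^{-1}(\bm{q})$, this form separates the two roles: $P_\mathcal{D}$ fixes how much probability each contour carries, while $P_{\bm{q}}$ describes how that probability is spread within the contour.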

Figures (19)

  • Figure 1: Contours for the map $Q$ in the illustrative example. Note the distinct "contour sets" whose probabilities are uniquely defined by $P_{\text{obs}}$. However, the probabilities within these contour sets cannot be determined by $P_{\text{obs}}$.
  • Figure 1: Data space comparisons of the optimal $L^2$ re-weighting scheme with the density-based approach for the illustrative example when $n=2E3$, $m=1E4$. On the left we plot the EDFs, on the right we plot the estimated densities.
  • Figure 1: Sets $A$ and $Q^{-1}(B)$ in $\Lambda$ on the left. Set $B$ on the right.
  • Figure 2: Predicted and observed histograms for the illustrative example and their estimated KDEs. We also show an alternate possible observed distribution, labeled "Pred. violation observed" because, for this example, it violates Assumption \ref{ass:predictability} (predictability): this observed distribution is not absolutely continuous with respect to the predicted distribution.
  • Figure 2: Weights resulting from naïve distributions-based method applied to initial samples on the left. Direct comparison to density-based weights on the right with the line indicating where perfect agreement occurs.
  • ...and 14 more figures

Theorems & Definitions (18)

  • Theorem 2.1
  • Definition 3.1: Binning Distribution
  • Definition 4.1
  • Lemma 4.2
  • Proof 1
  • Lemma 4.3
  • Proof 2
  • Theorem 4.4
  • Proof 3
  • Corollary 4.5
  • ...and 8 more