A Distributions-based Approach for Data-Consistent Inversion
Kirana Bergstrom, Troy Butler, Tim Wildey
TL;DR
This paper addresses stochastic inverse problems by seeking a pullback distribution on model parameters whose push-forward through the quantities of interest (QoI) map matches an observed distribution. It introduces a distributions-based data-consistent inversion (DCI) framework that replaces density-based updates with an optimally weighted empirical distribution function (EDF) approach and a novel binning scheme that distributes input-space weights along QoI pre-images, ensuring contour-consistent updates. The authors prove convergence results: convergence of multivariate CDFs implies weak convergence of the associated measures, and the EDF-based push-forward converges to the observed distribution while the parameter-space pullback converges to the data-consistent update under a predictability condition. Numerical experiments on heat conduction and porous-media flow demonstrate the method’s robustness in low-data and non-density scenarios, with the binning approach achieving accurate push-forwards where density-based methods struggle. The work provides practical algorithms, theoretical guarantees, and open-source code to facilitate data-consistent inversion in complex stochastic systems.
Abstract
We formulate a novel approach to solve a class of stochastic problems, referred to as data-consistent inverse (DCI) problems, which involve the characterization of a probability measure on the parameters of a computational model whose subsequent push-forward matches an observed probability measure on specified quantities of interest (QoI) typically associated with the outputs from the computational model. Whereas prior DCI solution methodologies focused on either constructing non-parametric estimates of the densities or the probabilities of events associated with the pre-image of the QoI map, we develop and analyze a constrained quadratic optimization approach based on estimating push-forward measures using weighted empirical distribution functions. The method proposed here is more suitable for low-data regimes or high-dimensional problems than the density-based method, as well as for problems where the probability measure does not admit a density. Numerical examples are included to demonstrate the performance of the method and to compare with the density-based approach where applicable.
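The core idea of matching a weighted empirical distribution function to an observed one can be illustrated with a minimal one-dimensional sketch. This is not the authors' algorithm or code. The toy map `Q`, the sample sizes, the Beta-distributed observations, and all function names are hypothetical. The solver is a plain projected-gradient iteration chosen for brevity, and the binning scheme along QoI pre-images is omitted. The sketch assumes the observed QoI values lie within the range reachable from the initial samples.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # sort descending
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index satisfying the condition
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def edf_matrix(samples, thresholds):
    """A[k, i] = 1 if samples[i] <= thresholds[k], so A @ w is the weighted EDF."""
    return (samples[None, :] <= thresholds[:, None]).astype(float)

def fit_weights(q_init, q_obs, n_iter=2000):
    """Minimize 0.5 * ||A w - b||^2 over the probability simplex."""
    thresholds = np.sort(q_obs)
    A = edf_matrix(q_init, thresholds)
    # Observed EDF evaluated at the thresholds.
    b = np.searchsorted(thresholds, thresholds, side="right") / q_obs.size
    w = np.full(q_init.size, 1.0 / q_init.size)   # start from uniform weights
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        w = project_to_simplex(w - step * (A.T @ (A @ w - b)))
    return w, A, b

# Hypothetical toy problem: scalar parameter, scalar QoI.
rng = np.random.default_rng(0)

def Q(lam):
    return lam ** 2                               # assumed QoI map, for illustration only

lam = rng.uniform(0.0, 1.0, size=500)             # samples from an assumed initial distribution
q_obs = Q(rng.beta(2.0, 5.0, size=100))           # synthetic "observed" QoI data

w, A, b = fit_weights(Q(lam), q_obs)

uniform = np.full(lam.size, 1.0 / lam.size)
err_before = np.linalg.norm(A @ uniform - b)      # EDF mismatch with uniform weights
err_after = np.linalg.norm(A @ w - b)             # EDF mismatch with fitted weights
weighted_mean = np.sum(w * lam)                   # example statistic under the weighted samples
```

The weights `w` sit on the parameter samples `lam`, so any statistic of the updated parameter distribution can be estimated as a weighted sum, as in `weighted_mean`. Because only distribution functions are compared, no density estimate of the QoI is formed at any point in the sketch.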
