Multi-fidelity Parameter Estimation Using Conditional Diffusion Models

Caroline Tatsuoka, Minglei Yang, Dongbin Xiu, Guannan Zhang

TL;DR

The paper addresses the challenge of Bayesian parameter estimation for computationally expensive forward models by introducing a multi-fidelity framework that combines a low-fidelity conditional diffusion model for amortized posterior sampling with a high-fidelity refinement step. The low-fidelity model $G^{\rm low}(\boldsymbol{y},\boldsymbol{z})$ delivers fast posterior approximations across data, while the high-fidelity model $G^{\rm high}(\boldsymbol{z})$ provides accurate density refinements for specific observations, guided by KDE-based density estimates and selective solving of the expensive forward model. Both models leverage a training-free score-based diffusion approach to map between a standard Gaussian latent space and the target posterior $p_{\boldsymbol{\theta}|\boldsymbol{Y}}(\boldsymbol{\theta}|\boldsymbol{y})$, enabling efficient sampling without repeated high-cost forward solves. The method is demonstrated on diverse problems, including multi-modal and symmetric posteriors, a Burgers PDE, the chaotic Lorenz system, a linear SDE, and a plasma physics runaway electron model, showing substantial computational speedups with accurate posterior representations. Overall, the framework offers a practical, adaptive, and scalable route for efficient uncertainty quantification in high-fidelity, multi-physics applications.

Abstract

We present a multi-fidelity method for uncertainty quantification of parameter estimates in complex systems, leveraging generative models trained to sample the target conditional distribution. In the Bayesian inference setting, traditional parameter estimation methods rely on repeated simulations of potentially expensive forward models to determine the posterior distribution of the parameter values, which may result in computationally intractable workflows. Furthermore, methods such as Markov Chain Monte Carlo (MCMC) necessitate rerunning the entire algorithm for each new data observation, further increasing the computational burden. Hence, we propose a novel method for efficiently obtaining posterior distributions of parameter estimates for high-fidelity models given data observations of interest. The method first constructs a low-fidelity, conditional generative model capable of amortized Bayesian inference and hence rapid posterior density approximation over a wide range of data observations. When higher accuracy is needed for a specific data observation, the method employs adaptive refinement of the density approximation. It uses outputs from the low-fidelity generative model to refine the parameter sampling space, ensuring efficient use of the computationally expensive high-fidelity solver. Subsequently, a high-fidelity, unconditional generative model is trained to achieve greater accuracy in the target posterior distribution. Both low- and high-fidelity generative models enable efficient sampling from the target posterior and do not require repeated simulation of the high-fidelity forward model. We demonstrate the effectiveness of the proposed method on several numerical examples, including cases with multi-modal densities, as well as an application in plasma physics for a runaway electron simulation model.
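
As a rough illustration of the refinement step described above, the following Python sketch uses a kernel density estimate built from low-fidelity samples to pick out the region of parameter space worth passing to an expensive solver. Everything here is an assumption made for illustration: the function names `gaussian_kde` and `refine_domain`, the fixed bandwidth, the relative density threshold, and the synthetic bimodal samples standing in for draws from $G^{\rm low}$ are not taken from the paper, whose actual refinement criterion may differ.

```python
import math
import random


def gaussian_kde(samples, bandwidth):
    """Return a 1D Gaussian kernel density estimate as a callable."""
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def refine_domain(samples, lo, hi, bandwidth=0.1, n_grid=200, rel_threshold=0.05):
    """Return grid points in [lo, hi] whose KDE density is at least
    rel_threshold times the maximum density on the grid.

    Hypothetical selection rule: the retained points are the candidate
    parameters at which a high-fidelity solver would be evaluated.
    """
    kde = gaussian_kde(samples, bandwidth)
    step = (hi - lo) / (n_grid - 1)
    grid = [lo + i * step for i in range(n_grid)]
    dens = [kde(x) for x in grid]
    cutoff = rel_threshold * max(dens)
    return [x for x, d in zip(grid, dens) if d >= cutoff]


# Synthetic stand-in for low-fidelity posterior samples (bimodal at -1 and +1).
rng = random.Random(0)
low_fidelity_samples = [
    rng.gauss(rng.choice([-1.0, 1.0]), 0.1) for _ in range(400)
]

candidates = refine_domain(low_fidelity_samples, lo=-2.0, hi=2.0)
print(f"kept {len(candidates)} of 200 grid points for high-fidelity solves")
```

In this toy setting the retained grid points cluster around the two modes, so the low-probability region between them is never sent to the expensive solver. A threshold on the estimated density is only one plausible way to express "high-probability region"; a quantile-based rule would serve the same purpose.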

Paper Structure

This paper contains 16 sections, 38 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 3.1: Illustration of the proposed multi-fidelity parameter estimation framework. The low-fidelity generative model efficiently approximates the posterior distribution $p_{\Theta|Y}(\theta | y)$ for any observation $y \sim p_Y(y)$, offering substantial computational advantages over traditional MCMC sampling which requires new runs for each observation. When $G^{\rm low}(y,z)$ cannot achieve the desired accuracy for a specific observation, it can still guide the sampling process by identifying high-probability regions of $p_{\Theta|Y}(\theta | y)$. These samples then serve as additional training data for constructing the more accurate high-fidelity generative model $G^{\rm high}(z)$.
  • Figure 4.1: The conditional distribution $p_{\Theta|Y}(\theta|y)$ for $y=1$. Left panel shows the density approximation obtained via the low-fidelity generative model $G^{\rm low}$. The KL divergence, computed according to Eq. \ref{eq_kl} using a uniform mesh (1000 samples over $[-2,2]$), is $2.118$. The middle panel displays the density approximation derived from labeled data by solving the reverse-time ODE model, yielding a KL divergence of $2.32 \times 10^{-3}$. The right panel presents the density approximation generated by the high-fidelity generative model $G^{\rm high}$, which achieves a KL divergence of $2.23 \times 10^{-3}$. The small KL divergence values for both the ODE solution and $G^{\rm high}$ indicate excellent approximation performance, with $G^{\rm high}$ demonstrating particularly strong fidelity to the true density.
  • Figure 4.2: The conditional distribution $p_{\Theta|Y}(\theta|y)$ for $y=9$. Left panel shows the density approximation obtained via the low-fidelity generative model $G^{\rm low}$. The KL divergence, computed according to Eq. \ref{eq_kl} using a uniform mesh (1000 samples over $[-4,4]$), is $3.341$. The middle panel displays the density approximation derived from labeled data by solving the reverse-time ODE model, yielding a KL divergence of $1.22 \times 10^{-2}$. The right panel presents the density approximation generated by the high-fidelity generative model $G^{\rm high}$, which achieves a KL divergence of $2.78 \times 10^{-2}$. The small KL divergence values for both the ODE solution and $G^{\rm high}$ indicate that both closely approximate the true density.
  • Figure 4.3: The conditional distribution $p_{\Theta|Y}(\theta|\bm{y})$ for $\bm{y}_{|\theta = 0.05}$ and $\bm{y}_{|\theta = 0.07}$. For observation $\bm{y}_{|\theta = 0.05}$ (top row): The left panel shows the density approximation obtained via the low-fidelity generative model $G^{\rm low}$. The KL divergence, computed using a uniform mesh (1000 samples over $[0.04,0.06]$), is $0.96$. The middle panel displays the density approximation derived from labeled data generated by solving the reverse-time ODE model, yielding a KL divergence of $3.11 \times 10^{-2}$. The right panel presents the density approximation generated by the high-fidelity model $G^{\rm high}$, which achieves a KL divergence of $3.07 \times 10^{-2}$. For observation $\bm{y}_{|\theta = 0.07}$ (bottom row): The left panel shows the density approximation from $G^{\rm low}$ with a KL divergence of $1.11$ (1000 samples over $[0.06,0.08]$). The middle panel shows the ODE-based approximation with a KL divergence of $1.38\times 10^{-2}$, and the right panel presents the $G^{\rm high}$ approximation with a KL divergence of $1.39\times 10^{-2}$. The results demonstrate that both the ODE-based solution and $G^{\rm high}$ provide significantly more accurate approximations than $G^{\rm low}$, as evidenced by their substantially lower KL divergence values.
  • Figure 4.4: Density approximations for observation $\bm{y}_1$, generated using parameters $(\gamma,\rho) = (\sqrt{7},25)$. The bimodal structure centers at $(-\sqrt{7},25)$ and $(\sqrt{7},25)$, reflecting the symmetry of the Lorenz system where both $(\gamma,\rho)$ and $(-\gamma,\rho)$ produce equivalent dynamics. Top left: Density approximation obtained via the low-fidelity generative model $G^{\rm low}(y,z)$, trained using $81 \times 81$ parameter pairs over domain $D = [-5,5] \times [20,30]$ (6561 total samples), showing clear identification of the bimodal structure. Top right: Density approximation using the labeled data set, demonstrating refined accuracy of the bimodal distribution. Bottom left: High-fidelity density approximation generated by $G^{\rm high}(z)$, exhibiting balanced and well-separated modes that capture the system's inherent symmetry. Bottom right: MCMC results using 10 walkers within the refined domain with 20,000 burn-in steps, showing a less balanced representation of the bimodal structure than the proposed method.
  • ...and 6 more figures
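
The captions above report KL divergences evaluated on a uniform mesh. The Python sketch below shows one way such a number could be computed, under loudly stated assumptions: the defining equation (Eq. eq_kl) is not reproduced in this excerpt, so the direction $\mathrm{KL}(p \,\|\, q) = \int p \log(p/q)\,d\theta$ with $p$ the reference density and $q$ the approximation, the reading of "1000 samples" as 1000 mesh points, and the simple Riemann-sum quadrature are all illustrative choices. The function name `kl_on_uniform_mesh` and the two Gaussian densities are hypothetical and do not come from the paper.

```python
import math
from statistics import NormalDist


def kl_on_uniform_mesh(p, q, lo, hi, n=1000, eps=1e-300):
    """Approximate KL(p || q) on [lo, hi] with a Riemann sum over n
    equally spaced mesh points. p and q are callables returning densities."""
    h = (hi - lo) / (n - 1)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        px, qx = p(x), q(x)
        if px > 0.0:  # the term p*log(p/q) vanishes where p is zero
            total += px * math.log(px / max(qx, eps)) * h
    return total


# Hypothetical reference density and a slightly shifted approximation,
# evaluated with the mesh settings quoted for Figure 4.1 (1000 points on [-2, 2]).
reference = NormalDist(mu=0.0, sigma=0.3)
approximation = NormalDist(mu=0.1, sigma=0.3)

kl_value = kl_on_uniform_mesh(reference.pdf, approximation.pdf, -2.0, 2.0)
print(f"KL on mesh: {kl_value:.4e}")
```

For two Gaussians with equal variance the divergence has the closed form $(\mu_p - \mu_q)^2 / (2\sigma^2)$, which gives a convenient sanity check for the mesh-based estimate before applying it to densities without an analytic reference.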