Conditional Pseudo-Reversible Normalizing Flow for Surrogate Modeling in Quantifying Uncertainty Propagation

Minglei Yang, Pengjun Wang, Ming Fan, Dan Lu, Yanzhao Cao, Guannan Zhang

TL;DR

This work introduces a conditional pseudo-reversible normalizing flow (PR-NF) as a data-driven surrogate for directly learning conditional PDFs $p(\mathbf{Y}|\mathbf{X})$ and $p(\mathbf{X}|\mathbf{Y})$ in uncertain physical models of the form $\mathbf{Y}=\mathbf{f}(\mathbf{X})+\boldsymbol{\varepsilon}(\mathbf{X})$. The PR-NF uses a forward generator $\mathbf{Y}|\mathbf{X}$ via $\mathbf{G}(\mathbf{Z},\mathbf{X})$ and a pseudo-reversible architecture built from two independent neural nets $\mathbf{h}$ and $\mathbf{g}$ with a soft reversibility constraint, enabling sampling from conditional PDFs without explicit noise models. The paper provides a rigorous convergence analysis in terms of the loss functions $\mathcal{L}_1,\mathcal{L}_2$ and the KL divergence, including continuous-form and Monte Carlo error bounds, showing that with sufficient training the conditional distribution estimates converge to the true ones. Numerical experiments cover forward density learning, bimodal inverse distributions, high-dimensional regression, and a geologic carbon storage application, where PR-NF achieves accurate, scalable uncertainty quantification and fast forecasts. This approach offers a practical, GPU-accelerated, data-driven alternative for efficient forward/inverse UQ in complex systems.
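The two-network construction and the loss terms $\mathcal{L}_1,\mathcal{L}_2$ mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that $\mathcal{L}_1$ is the usual change-of-variables negative log-likelihood under a standard normal latent variable and that $\mathcal{L}_2$ is a mean-squared reconstruction penalty; the paper's exact definitions may differ. The neural networks $\mathbf{h}$ and $\mathbf{g}$ are replaced by one-dimensional affine maps with independent parameters, and the names `theta`, `phi`, and `lam` are hypothetical.

```python
import math
import random


def h(y, x, theta):
    """Toy forward map y -> z, conditioned on x (affine stand-in for a network)."""
    a, b, log_s = theta
    return (y - (a * x + b)) / math.exp(log_s)


def g(z, x, phi):
    """Toy backward map z -> y with its OWN parameters phi, not tied to theta."""
    a, b, log_s = phi
    return a * x + b + math.exp(log_s) * z


def loss_nll(data, theta):
    """Assumed L1-style term: negative log-likelihood via change of variables."""
    _, _, log_s = theta
    total = 0.0
    for x, y in data:
        z = h(y, x, theta)
        log_pz = -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)
        log_det = -log_s  # log|dh/dy| for the affine map above
        total -= log_pz + log_det
    return total / len(data)


def loss_reversibility(data, theta, phi):
    """Assumed L2-style term: soft constraint pushing g towards the inverse of h."""
    total = 0.0
    for x, y in data:
        total += (g(h(y, x, theta), x, phi) - y) ** 2
    return total / len(data)


def total_loss(data, theta, phi, lam=1.0):
    """Weighted sum of both terms; lam is a hypothetical penalty weight."""
    return loss_nll(data, theta) + lam * loss_reversibility(data, theta, phi)


# Synthetic pairs from y = 2x + 1 + 0.5 * eps (illustrative data only).
rng = random.Random(0)
data = []
for _ in range(2000):
    x = rng.uniform(0.0, 1.0)
    data.append((x, 2.0 * x + 1.0 + 0.5 * rng.gauss(0.0, 1.0)))

theta_true = (2.0, 1.0, math.log(0.5))  # parameters matching the data
theta_bad = (0.0, 0.0, 0.0)             # identity map, a poor fit
```

In this toy setting the reversibility term vanishes only when `phi` makes `g` invert `h`, which is the sense in which reversibility is a soft constraint rather than a property built into the architecture.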

Abstract

We introduce a conditional pseudo-reversible normalizing flow for constructing surrogate models of a physical model polluted by additive noise to efficiently quantify forward and inverse uncertainty propagation. Existing surrogate modeling approaches usually focus on approximating the deterministic component of the physical model. However, this strategy requires knowledge of the noise and resorts to auxiliary sampling methods to quantify inverse uncertainty propagation. In this work, we develop the conditional pseudo-reversible normalizing flow model to directly learn, and efficiently generate samples from, the conditional probability density functions. The training process uses a dataset of input-output pairs and requires no prior knowledge of the noise or the function. Once trained, our model can generate samples from any conditional probability density function whose high-probability regions are covered by the training set. Moreover, the pseudo-reversibility feature allows the use of fully connected neural network architectures, which simplifies the implementation and enables theoretical analysis. We provide a rigorous convergence analysis of the conditional pseudo-reversible normalizing flow model, showing that it converges to the target conditional probability density function in the sense of the Kullback-Leibler divergence. To demonstrate the effectiveness of our method, we apply it to several benchmark tests and a real-world geologic carbon storage problem.
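As an illustration of the kind of training set described above, the following sketch draws input-output pairs from a model of the form $\mathbf{Y}=\mathbf{f}(\mathbf{X})+\boldsymbol{\varepsilon}(\mathbf{X})$. The function $f(x)=\sin(2\pi x)$ and the input range $(0,1)$ are taken from the benchmark figures of the paper. The noise model (zero-mean Gaussian with an input-dependent scale) and the helper names `noise` and `make_dataset` are assumptions made only for this example.

```python
import math
import random


def f(x):
    """Deterministic component, as in the benchmark f(x) = sin(2*pi*x)."""
    return math.sin(2.0 * math.pi * x)


def noise(x, rng):
    """Assumed noise: zero-mean Gaussian whose scale grows with x."""
    return (0.05 + 0.1 * x) * rng.gauss(0.0, 1.0)


def make_dataset(n, rng, lo=0.0, hi=1.0):
    """Draw n pairs (x, y) with x uniform on [lo, hi) and y = f(x) + noise."""
    pairs = []
    for _ in range(n):
        x = rng.uniform(lo, hi)
        pairs.append((x, f(x) + noise(x, rng)))
    return pairs


rng = random.Random(42)
train = make_dataset(5000, rng)
```

Only the pairs in `train` would be passed to training; neither `f` nor `noise` is visible to the model, which matches the data-driven setting of the abstract.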

Paper Structure (18 sections, 5 theorems, 50 equations, 14 figures, 1 table)

Key Result

Lemma 4.3

Under Assumptions ass1 and ass2, for an arbitrarily small $\varepsilon>0$, there exist two independent single-hidden-layer neural networks $\bm h$ and $\bm g$ such that

Figures (14)

  • Figure 1: The architecture of the proposed conditional pseudo-reversible normalizing flow (PR-NF) model. The novelty is that the input vector $\bm x$ of the physics model in Eq. \ref{eq:problem} is incorporated into the hidden neurons of $\bm h(\cdot)$ and $\bm g(\cdot)$. In other words, both $\bm h(\cdot)$ and $\bm g(\cdot)$ are parameterized by $\bm x$, which is the main reason why the transport maps can learn the conditional PDF $p(\bm y|\bm x)$. The pseudo-reversibility $\bm g \approx {\bm h}^{-1}$ is enforced as a soft constraint in the loss function.
  • Figure 1: The Kullback–Leibler (KL) divergence between the ground-truth density $p({ y} | { x})$ and the approximation $p(\widehat{ y} | {x})$ from the PR-NF model for $x\in (-1,2)$. The top row shows the case $f({ x}) = \sin{(2\pi { x})}$ with four different additive noises, and the bottom row shows $f({ x}) = 4({ x}-0.5)^2$. The KL divergence is small within the training range $x\in (0,1)$ and increases when the input $x$ falls outside it, which means that the PR-NF model does not extrapolate beyond its training domain.
  • Figure 2: The computational cost (wall-clock time) of PyTorch backpropagation and of the Jacobian determinant calculation with 5000 samples, comparing a CUDA GPU (blue) with a CPU (red). The timings were obtained by running our code on a workstation with an Nvidia RTX A5000 GPU. The GPU is effective in accelerating the PR-NF model, particularly for backpropagation.
  • Figure 2: The accuracy of the well-trained PR-NF model for $f(x) = \sin(2\pi x)$ with four different additive noises. Each row represents a different noise and each column corresponds to a different point $x = -0.8, 0.2, 0.8, 1.8$ (left to right). Consistent with the conclusion from Fig. \ref{fig_test1_kl}, very good agreement is observed for points inside the domain, $x \in D$ (2nd and 3rd columns). However, the model is not accurate for points outside the domain $D$ (1st and 4th columns), which means that the PR-NF model does not extrapolate beyond its training domain.
  • Figure 3: The accuracy of the well-trained PR-NF model for $f(x) = 4(x-0.5)^2$ with four different additive noises. Each row represents a different noise and each column corresponds to a different point $x = -0.8, 0.2, 0.8, 1.8$ (left to right). Consistent with the conclusion from Fig. \ref{fig_test1_kl}, very good agreement is observed for points inside the domain, $x \in D$ (2nd and 3rd columns).
  • ...and 9 more figures
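The KL divergence used as the accuracy measure in the figure captions above can be illustrated with a small sketch. The excerpt does not say how the paper evaluates it, so this example is an assumption. It treats both the ground-truth density $p(y|x)$ and its approximation at a fixed $x$ as Gaussians with made-up means and standard deviations, and compares the closed-form KL divergence with a Monte Carlo estimate of $\mathbb{E}_p[\log p - \log q]$.

```python
import math
import random


def gauss_logpdf(y, mu, sigma):
    """Log-density of N(mu, sigma^2) evaluated at y."""
    return (-0.5 * ((y - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2.0 * math.pi))


def kl_gauss_closed_form(mu_p, s_p, mu_q, s_q):
    """KL(p || q) for two univariate Gaussians."""
    return (math.log(s_q / s_p)
            + (s_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * s_q ** 2) - 0.5)


def kl_monte_carlo(mu_p, s_p, mu_q, s_q, n, rng):
    """Estimate KL(p || q) as the sample mean of log p(y) - log q(y), y ~ p."""
    total = 0.0
    for _ in range(n):
        y = rng.gauss(mu_p, s_p)
        total += gauss_logpdf(y, mu_p, s_p) - gauss_logpdf(y, mu_q, s_q)
    return total / n


rng = random.Random(1)
# Illustrative numbers only: "truth" N(0, 0.1^2) vs. "surrogate" N(0.02, 0.12^2).
exact = kl_gauss_closed_form(0.0, 0.1, 0.02, 0.12)
approx = kl_monte_carlo(0.0, 0.1, 0.02, 0.12, 20000, rng)
```

The Monte Carlo form is the one that carries over when the densities are not Gaussian, provided both log-densities can be evaluated at samples drawn from the ground truth.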

Theorems & Definitions (9)

  • Lemma 4.3: Theorems 4.6 & 4.7 in yang2023pseudo
  • Lemma 4.4
  • Proof 1
  • Lemma 4.5
  • Proof 2
  • Theorem 4.6
  • Proof 3
  • Theorem 4.8
  • Proof 4