Efficient reconstruction of multidimensional random field models with heterogeneous data using stochastic neural networks

Mingtao Xia, Qijing Shen

TL;DR

This work analyzes the scalability of Wasserstein-distance training for stochastic neural networks tasked with reconstructing multidimensional random field models under heterogeneous noise. The authors derive refined generalization bounds showing that, unlike in the homogeneous-noise case, the convergence rate may not depend explicitly on the dimensionality when the noise is heterogeneous across dimensions, and they introduce an improved local $W_2$ loss to bolster robustness in sparse data regimes. In numerical experiments on high-dimensional uncertainty quantification (UQ) tasks and a 96-dimensional ODE system, the approach accurately reconstructs multidimensional uncertainty models and is robust to parameter perturbations, often outperforming benchmark generative methods. The results advance scalable, uncertainty-aware learning for complex, high-dimensional random fields and suggest future integration with physics-informed constraints and entropic regularization for efficiency.

Abstract

In this paper, we analyze the scalability of a recent Wasserstein-distance approach for training stochastic neural networks (SNNs) to reconstruct multidimensional random field models. We prove a generalization error bound for reconstructing multidimensional random field models when stochastic neural networks are trained on a limited amount of training data. Our results indicate that when noise is heterogeneous across dimensions, the convergence rate of the generalization error may not depend explicitly on the model's dimensionality, partially alleviating the "curse of dimensionality" for learning multidimensional random field models from a finite number of data points. Additionally, we improve the previous Wasserstein-distance SNN training approach and showcase the robustness of the SNN. Through numerical experiments on different multidimensional uncertainty quantification tasks, we show that our Wasserstein-distance approach can successfully train stochastic neural networks to learn multidimensional uncertainty models.
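
To make the Wasserstein distance concrete, the following is a minimal sketch of the squared $W_2$ distance between two empirical distributions in the one-dimensional, equal-sample-size special case, where the optimal coupling pairs the sorted samples. It is an illustration only and is not the paper's local $W_2$ loss; the function name `squared_w2_1d` and the sample values are assumptions made for this example.

```python
def squared_w2_1d(u, v):
    """Squared 2-Wasserstein distance between two 1D empirical measures.

    Assumes both samples have the same size and uniform weights, in which
    case the optimal transport plan matches the sorted samples in order.
    """
    if len(u) != len(v):
        raise ValueError("this sketch requires equal sample sizes")
    us, vs = sorted(u), sorted(v)
    return sum((a - b) ** 2 for a, b in zip(us, vs)) / len(us)


# Hypothetical samples: v is u shifted by 1, listed in a different order.
u = [0.0, 1.0, 2.0]
v = [2.0, 3.0, 1.0]
shifted = squared_w2_1d(u, v)              # every sorted pair differs by 1
same = squared_w2_1d(u, list(reversed(u)))  # same multiset of points
```

Because the distance depends only on the sample values and not on their order, it compares distributions rather than paired observations, which is the property a distribution-matching loss relies on.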

Paper Structure

This paper contains 7 sections, 5 theorems, 56 equations, 3 figures, 1 table, 1 algorithm.

Key Result

Theorem 2.2

Let $\mu\in\mathcal{P}(\mathbb{R}^d)$. We assume that: where $\|\cdot\|_6$ denotes the $\ell^6$ norm of a vector. We denote $\sigma_i\coloneqq (\int_{\mathbb{R}^d}y_i^6\mu(\hbox{d}\bm{y}))^{\frac{1}{6}}$, and we assume that $\sigma_1\geq\ldots\geq\sigma_d>0$. There exists a constant $C$ depending only on $d$ such that, in the $N\rightarrow\infty$ limit,
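
The per-coordinate scale $\sigma_i$ defined above can be estimated from samples by replacing the integral with a sample mean. The sketch below does this and then reorders coordinates so that the estimates are non-increasing, mirroring the assumed ordering $\sigma_1\geq\ldots\geq\sigma_d$. The function name `sixth_moment_scales` and the toy samples are assumptions for illustration and do not come from the paper.

```python
def sixth_moment_scales(samples):
    """Empirical sigma_i = (mean of y_i^6)^(1/6) for each coordinate i.

    Returns the scales sorted in non-increasing order together with the
    coordinate permutation that produces that ordering.
    """
    n = len(samples)
    d = len(samples[0])
    raw = [(sum(y[i] ** 6 for y in samples) / n) ** (1.0 / 6.0) for i in range(d)]
    order = sorted(range(d), key=lambda i: raw[i], reverse=True)
    return [raw[i] for i in order], order


# Hypothetical 2D samples: the second coordinate has twice the magnitude.
samples = [(1.0, 2.0), (-1.0, -2.0), (1.0, 2.0), (-1.0, -2.0)]
sigma, order = sixth_moment_scales(samples)
# sigma is approximately [2.0, 1.0]; order is [1, 0]
```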

Figures (3)

  • Figure 1: The structure of the SNN model used in this work. For each input $x$, the weights $w_{i, j, k}\sim\mathcal{N}(a_{i, j, k}, \sigma_{i, j, k}^2)$ are sampled independently. ReLU denotes the ReLU activation function, which may be replaced with other activation functions. Such an SNN, with appropriate width and height, has been proven capable of approximating the random field model Eq. \ref{model_objective} to any accuracy in the $W_2$ metric [xia2025generalized, xia2025new]. Either the standard feedforward structure or the ResNet [he2016deep] structure is used for forward propagation. When using the ResNet structure, the additional weights $\tilde{w}_{i, j, k}$ are deterministic.
  • Figure 2: (a)(b)(c) the joint distributions of the ground truth $(y_{\bm{x}, 1}, y_{\bm{x}, 2})$, $(y_{\bm{x}, 1}, y_{\bm{x}, 3})$, and $(y_{\bm{x}, 2}, y_{\bm{x}, 3})$ versus the predicted $(\hat{y}_{\bm{x}, 1}, \hat{y}_{\bm{x}, 2})$, $(\hat{y}_{\bm{x}, 1},\hat{y}_{\bm{x}, 3})$, and $(\hat{y}_{\bm{x}, 2}, \hat{y}_{\bm{x}, 3})$ for 10 different $\bm{x}$ in the testing set (at each $\bm{x}$, there are 20 independently generated $\bm{y}_{\bm{x}}$ and $\hat{\bm{y}}_{\bm{x}}$, respectively). In (a), (b), and (c), $\sigma_j\equiv 0.1$. (d) the runtime and RAM usage w.r.t. the dimensionality $d$ of the random field model Eq. \ref{example1_model} (case 1). "ours" refers to directly minimizing the local squared $W_2$ loss function Eq. \ref{updated_loss} to train the SNN in Fig. \ref{fig:snn}. "VAE" denotes the conditional variational autoencoder approach, while "flow" denotes the conditional normalizing flow method. (e) errors in the mean and SD of the predicted $\hat{\bm{y}}_{\bm{x}}$ w.r.t. the dimensionality $d$ of the random field model Eq. \ref{example1_model} (case 2). (f) errors in the mean and SD of the predicted $\hat{\bm{y}}_{\bm{x}}$ w.r.t. the dimensionality $d_0=d$ of the random field model Eq. \ref{example1_model} (case 3). (g) errors in the mean and SD of the predicted $\hat{\bm{y}}_{\bm{x}}$ w.r.t. the dimensionality $d_0$ of noise Eq. \ref{example1_model} (case 3).
  • Figure 3: (a)(b) the trajectories of the reconstructed position and velocity $\hat{x}_{48}(t)$ and $\hat{v}_{48}(t)$ versus the ground truth position and velocity $x_{48}(t)$ and $v_{48}(t)$ when $d=5, \sigma=1, \sigma_0=0.01$. (c)(d) the trajectories of the reconstructed position and velocity $\hat{x}_{48}(t)$ and $\hat{v}_{48}(t)$ versus the ground truth position and velocity $x_{48}(t)$ and $v_{48}(t)$ when $d=5, \sigma=2.5, \sigma_0=0.01$. (e) errors in $(\hat{\bm{x}}(t), \hat{\bm{v}}(t))^T$ and in $\hat{\bm{f}}$ w.r.t. the dimensionality of uncertain parameters $d$. (f) errors in $(\hat{\bm{x}}(t), \hat{\bm{v}}(t))^T$ and in $\hat{\bm{f}}$ w.r.t. the strength of noise in the dynamics $\sigma$. (g) errors in $(\hat{\bm{x}}(t), \hat{\bm{v}}(t))^T$ and in $\hat{\bm{f}}$ w.r.t. the noise in the initial condition $\sigma_0$.
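
The weight-sampling mechanism described in the Figure 1 caption can be sketched as follows: every forward pass draws fresh weights $w\sim\mathcal{N}(a, \sigma^2)$, so repeated evaluations at the same input yield different outputs. This is a simplified illustration, not the authors' implementation. It assumes a plain feedforward structure without biases or ResNet connections and a linear final layer, and the name `snn_forward` and all parameter values are hypothetical.

```python
import random


def relu(z):
    return z if z > 0.0 else 0.0


def snn_forward(x, means, stds, rng):
    """One stochastic forward pass through a small feedforward network.

    means[l][j][k] and stds[l][j][k] hold the mean a and standard deviation
    sigma of the weight linking input k to unit j in layer l. New weights are
    sampled on every call. Assumption: hidden layers use ReLU, the last layer
    is linear, and there are no bias terms.
    """
    h = list(x)
    last = len(means) - 1
    for l, (a_layer, s_layer) in enumerate(zip(means, stds)):
        out = []
        for a_row, s_row in zip(a_layer, s_layer):
            z = sum(rng.gauss(a, s) * v for a, s, v in zip(a_row, s_row, h))
            out.append(z if l == last else relu(z))
        h = out
    return h


# Hypothetical 2-2-1 network.
means = [[[1.0, -1.0], [0.5, 0.5]], [[1.0, 1.0]]]
zero_stds = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0]]]
small_stds = [[[0.1, 0.1], [0.1, 0.1]], [[0.1, 0.1]]]

rng = random.Random(0)
x = [2.0, 1.0]
deterministic = snn_forward(x, means, zero_stds, rng)  # sigma = 0: ordinary network
draws = [snn_forward(x, means, small_stds, rng)[0] for _ in range(5)]  # random outputs
```

With all standard deviations set to zero the model reduces to a deterministic ReLU network, while nonzero standard deviations turn the output at a fixed input into a random variable, which is what allows such a network to represent a random field.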

Theorems & Definitions (9)

  • Definition 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Corollary 2.4
  • Example 3.1
  • Example 3.2
  • Lemma A.1
  • Theorem C.1
  • Proof 1