A Generalized Unified Skew-Normal Process with Neural Bayes Inference

Kesen Wang, Marc G. Genton

TL;DR

This work tackles non-Gaussian spatial data exhibiting skewness and heavy tails by introducing the Generalized Unified Skew-Normal (GSUN) process, a flexible extension of SUN that yields vanishing correlations at large distances and a tractable kriging framework. A concise SUN re-parameterization with a diagonal skewness structure improves numerical stability and interpretability, while the GSUN construction integrates latent skewness from both observed and latent processes via two independent Gaussian components. To enable scalable inference in high dimensions, the authors develop neural Bayes estimators based on Graph Attention Networks and an encoder–transformer, trained through amortized simulation under uniform priors, and augmented with dropout and on-the-fly data generation. The approach is validated through extensive simulations, uncertainty quantification, and an application to Pb-contaminated soils, showing favorable PIT behavior and improved fit over Gaussian and Tukey g‑and‑h models, with SUGLG serving as a baseline competitor. Overall, the combination of a flexible GSUN spatial model and neural Bayes inference offers a principled, scalable framework for non-Gaussian spatial analysis with practical interpolation capabilities.
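
The amortized training mentioned above can be written as a Bayes risk minimization. The display below is a generic sketch of the neural Bayes estimator objective, not a formula taken from the paper: the loss $L$, the prior $\pi$, the number of simulated pairs $K$, and the reading of $\bm{\beta}$ as the network weights are all assumptions made for illustration.

$$
\bm{\beta}^{*} = \arg\min_{\bm{\beta}} \; \mathbb{E}_{\bm{\Theta} \sim \pi}\, \mathbb{E}_{\mathbf{z} \mid \bm{\Theta}} \left[ L\!\left(\bm{\Theta}, \hat{\bm{\Theta}}(\mathbf{z}; \bm{\beta})\right) \right] \;\approx\; \arg\min_{\bm{\beta}} \; \frac{1}{K} \sum_{k=1}^{K} L\!\left(\bm{\Theta}^{(k)}, \hat{\bm{\Theta}}(\mathbf{z}^{(k)}; \bm{\beta})\right), \qquad \bm{\Theta}^{(k)} \sim \pi, \quad \mathbf{z}^{(k)} \sim p(\mathbf{z} \mid \bm{\Theta}^{(k)}).
$$

Under this reading, on-the-fly data generation means that the pairs $(\bm{\Theta}^{(k)}, \mathbf{z}^{(k)})$ are redrawn during training rather than fixed in advance, and a uniform $\pi$ weights all parameter values in the prior's support equally.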

Abstract

In recent decades, statisticians have increasingly encountered spatial data that exhibit non-Gaussian behaviors such as asymmetry and heavy-tailedness. As a result, the assumptions of symmetry and fixed tail weight in Gaussian processes have become restrictive and may fail to capture the intrinsic properties of the data. To address the limitations of Gaussian models, a variety of skewed models have been proposed, and their popularity has grown rapidly. These skewed models introduce parameters that govern skewness and tail weight. Among the various proposals in the literature, unified skewed distributions, such as the Unified Skew-Normal (SUN), have received considerable attention. In this work, we revisit a more concise and interpretable re-parameterization of the SUN distribution and apply the distribution to random fields by constructing a generalized unified skew-normal (GSUN) spatial process. We demonstrate that the GSUN is a valid spatial process by showing that its correlation vanishes at large distances, and we provide the corresponding spatial interpolation method. In addition, we develop an inference mechanism for the GSUN process using the concept of neural Bayes estimators with deep graph attention networks (GATs) and an encoder transformer. By means of a simulation study and an application to Pb-contaminated soil data, we show that our proposed estimator is more stable and more accurate than conventional CNN-based architectures. Furthermore, we show through the probability integral transform (PIT) that the GSUN process differs from conventional Gaussian processes and Tukey g-and-h processes.
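
The PIT comparison rests on a standard fact: if data come from a continuous distribution with CDF $F$, then $F(Y)$ is uniform on $(0,1)$, so departures from uniformity signal a misspecified model. The following Python sketch illustrates this in a deliberately simplified, independent (non-spatial) setting and is not the authors' procedure. The univariate skew-normal sampler, the moment-fitted Gaussian, and all function names are assumptions introduced for illustration.

```python
import math
import random
import statistics

def normal_cdf(x, mu, sigma):
    """CDF of N(mu, sigma^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def fit_normal_cdf(samples):
    """Fit a Gaussian by moments and return its CDF as a function."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    return lambda x: normal_cdf(x, mu, sigma)

def pit_values(samples, cdf):
    """Probability integral transform: push each observation through the model CDF."""
    return [cdf(y) for y in samples]

def ks_distance_to_uniform(u):
    """Kolmogorov-Smirnov distance between the empirical CDF of u and Uniform(0, 1)."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - ui, ui - i / n) for i, ui in enumerate(u))

rng = random.Random(0)
n = 5000

# Symmetric data: a Gaussian model is correctly specified here.
gauss = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Skewed data from the additive skew-normal representation
# Y = delta * |Z0| + sqrt(1 - delta^2) * Z1 (toy univariate case, illustrative only).
delta = 0.99
skewed = [
    delta * abs(rng.gauss(0.0, 1.0)) + math.sqrt(1.0 - delta**2) * rng.gauss(0.0, 1.0)
    for _ in range(n)
]

# PIT under a fitted Gaussian model for both data sets.
ks_gauss = ks_distance_to_uniform(pit_values(gauss, fit_normal_cdf(gauss)))
ks_skewed = ks_distance_to_uniform(pit_values(skewed, fit_normal_cdf(skewed)))

print(f"KS distance, Gaussian data: {ks_gauss:.4f}")
print(f"KS distance, skewed data:   {ks_skewed:.4f}")
```

The Gaussian fit yields near-uniform PIT values on the symmetric data and visibly non-uniform ones on the skewed data. In a spatial application the CDF would instead be the predictive distribution at held-out locations.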

Paper Structure

This paper contains 23 sections, 4 theorems, 36 equations, 10 figures, 2 algorithms.

Key Result

Proposition 1

If $\mathbf{Y} \sim {\cal SUN}_{d,m}(\bm{\xi},\bm{\Psi},\mathbf{H},\bm{\tau},\Bar{\bm{\Gamma}})$, then

Figures (10)

  • Figure 1: Graphical representation of the spatial data with undirected edges. The circle in black denotes a pre-specified radius $R$ to define the neighboring nodes. The red arrow denotes the radius, $R$.
  • Figure 2: GAT- and Encoder-based neural Bayes estimator for the GSUN spatial process. $n$-heads denotes the number of attention heads for the multi-head attention mechanism. dim denotes the output dimension of each GAT layer. $n$-layers represents the number of encoder blocks. $d$ is the dimension of the input embeddings of the encoder block. The numbers in parentheses for each FFN layer give its output dimension.
  • Figure 3: Hyper-parameter tuning for the proposed neural Bayes estimator. The red line denotes the threshold $10^{-5}$. The learning rate starts at $10^{-3}$ and is manually adjusted (multiplied by 0.1) at $(100,500,1000,3000) \times 10^4$ epochs. The radius $R$ is adapted to the number of GAT layers so that the aggregations sufficiently cover the study region within the specified number of layers.
  • Figure 4: The boxplots of the parameter estimates obtained from the proposed neural Bayes estimator (marked in Blue and denoted as GAT-$n$) and the CNN-based Bayes estimator (marked in Green and denoted as CNN-$n$) for $N = 500$ replicates of $n = 400,900, 1600$ realizations of the GSUN process in $[0,1]^2$ simulated from $\bm{\Theta} = (1, 0.15, 1, 0.1, 0.5, 0.55, -0.3)^\top$. The red lines denote the true values.
  • Figure 5: The empirical densities of $\hat{\bm{\Theta}}_{(\mathbf{z}_n,\bm{\beta})}^k$ (marked in Cyan and denoted as E-$n$) and $\hat{\bm{\Theta}}_{(\Tilde{\mathbf{z}}_n,\bm{\beta})}^k$ (marked in Blue and denoted as A-$n$) obtained from the proposed neural Bayes estimator with $N = 500$ replicates of $n = 400,900,1600$ realizations of the GSUN process in $[0,1]^2$ simulated from $\bm{\Theta} = (2, 0.03, 0.5, 0.3, 1, -0.7, 0.5)^\top$ and Algorithm \ref{alg:uc}, respectively.
  • ...and 5 more figures
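
The radius-based neighborhood described for Figure 1 can be sketched as follows. This is a minimal illustration in plain Python, assuming Euclidean distance and an inclusive threshold; the function name `radius_graph` and the toy coordinates are hypothetical and not taken from the paper.

```python
import math

def radius_graph(coords, radius):
    """Return undirected edges (i, j) with i < j for all pairs of
    locations whose Euclidean distance is at most `radius`."""
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) <= radius:
                edges.append((i, j))
    return edges

# Four toy locations in [0, 1]^2: three close together, one far away.
coords = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.9, 0.9)]
edges = radius_graph(coords, radius=0.15)
print(edges)

# Nodes left without neighbors receive no information from message passing.
isolated = [i for i in range(len(coords)) if all(i not in e for e in edges)]
print(isolated)
```

The isolated node in this toy example shows why the caption of Figure 3 ties $R$ to the number of GAT layers: each layer aggregates over one hop, so a small radius needs more layers for information to spread across the study region. The double loop is quadratic in the number of locations, so a spatial index would be preferable for large $n$.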

Theorems & Definitions (10)

  • Definition 1
  • proof
  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Proposition 4
  • proof