Spatial Bayesian Neural Networks

Andrew Zammit-Mangion; Michael D. Kaminski; Ba-Hien Tran; Maurizio Filippone; Noel Cressie

TL;DR

The paper introduces Spatial Bayesian Neural Networks (SBNNs), a flexible, calibration-based approach to modelling spatial processes that embeds spatial structure in the network and allows its parameters to vary across space. SBNNs are calibrated to a target process using the Wasserstein-1 distance, via a two-stage optimization that first updates a differentiable surrogate for the objective and then updates the hyperparameters of the network. This enables accurate replication of a diverse set of spatial processes, including Gaussian, non-Gaussian, and max-stable models. Through simulation studies and case studies, the authors demonstrate that SBNNs with embedding layers and spatially varying parameters outperform vanilla BNNs and can serve as efficient surrogates for complex stochastic processes, while highlighting computational demands and interpretability limitations. The framework provides a versatile tool for spatial prediction and uncertainty quantification, with potential applicability to a wide range of replicated-realization settings and stochastic simulators, although open questions remain about predictive inference with spatially varying parameters and about covariate integration.
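To make the calibration criterion more concrete, the following is a minimal sketch of the Wasserstein-1 distance in the simplest possible setting: two equal-size univariate samples, where the distance reduces to the mean absolute difference between sorted values. This is only an illustration of the distance itself. The paper calibrates multivariate finite-dimensional distributions and relies on a differentiable surrogate, neither of which is reproduced here. The function name `wasserstein1_1d` and the two Gaussian stand-in samples are assumptions made for illustration.

```python
import random


def wasserstein1_1d(x, y):
    """Empirical Wasserstein-1 distance between two equal-size 1-D samples.

    For univariate samples of equal size the optimal coupling pairs the
    order statistics, so the distance is the mean absolute difference
    between the sorted samples.
    """
    xs, ys = sorted(x), sorted(y)
    if len(xs) != len(ys) or not xs:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)


rng = random.Random(0)
# Hypothetical stand-ins: draws from a "target" and from a mean-shifted "candidate"
# at a single spatial location.
target = [rng.gauss(0.0, 1.0) for _ in range(5000)]
candidate = [rng.gauss(0.5, 1.0) for _ in range(5000)]

w1_same = wasserstein1_1d(target, target)        # identical samples
w1_shifted = wasserstein1_1d(target, candidate)  # close to the mean shift of 0.5
```

A pure location shift between two otherwise identical distributions gives a Wasserstein-1 distance equal to the size of the shift, which is why `w1_shifted` lands near 0.5. Driving a distance of this kind towards zero is the sense in which the network is "calibrated" to the target.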

Abstract

Statistical models for spatial processes play a central role in analyses of spatial data. Yet, it is the simple, interpretable, and well understood models that are routinely employed even though, as is revealed through prior and posterior predictive checks, these can poorly characterise the spatial heterogeneity in the underlying process of interest. Here, we propose a new, flexible class of spatial-process models, which we refer to as spatial Bayesian neural networks (SBNNs). An SBNN leverages the representational capacity of a Bayesian neural network; it is tailored to a spatial setting by incorporating a spatial "embedding layer" into the network and, possibly, spatially-varying network parameters. An SBNN is calibrated by matching its finite-dimensional distribution at locations on a fine gridding of space to that of a target process of interest. That process could be easy to simulate from or we may have many realisations from it. We propose several variants of SBNNs, most of which are able to match the finite-dimensional distribution of the target process at the selected grid better than conventional BNNs of similar complexity. We also show that an SBNN can be used to represent a variety of spatial processes often used in practice, such as Gaussian processes, lognormal processes, and max-stable processes. We briefly discuss the tools that could be used to make inference with SBNNs, and we conclude with a discussion of their advantages and limitations.

Paper Structure (22 sections, 26 equations, 21 figures, 4 tables)

Introduction
Methodology
Bayesian neural networks for spatial data
Spatial Bayesian neural networks
The embedding layer in an SBNN
Spatially varying network parameters
Model specification of BNNs and SBNNs
SBNN calibration using the Wasserstein distance
Simulation studies
Calibration to stationary Gaussian spatial process
Calibration to non-stationary Gaussian spatial process
Calibration to stationary lognormal spatial process
Making inference with SBNNs
Case study 1: GP target process
Case study 2: Max-stable target process
...and 7 more sections

Figures (21)

Figure 1: Sample paths (black) of the vanilla Bayesian neural network of the form \ref{eq:composition} and \ref{eq:forwardpass} with $\textbf{f}_{0}(\textbf{s}; \theta_0) \equiv \textbf{s}$, where here each hidden layer is of dimension $d_l = 40,\, l = 1,\dots,(L-1)$, the activation functions in $\bm{\varphi}_{l-1}(\cdot), l = 1,\dots,L,$ are $\textrm{tanh}(\cdot)$ functions, $D \equiv [-4,4]$, and the weights and biases collected in ${\boldsymbol{\theta}}$ are all drawn independently from a standard normal distribution with zero mean and unit variance. The empirical-process mean computed from 4000 sample paths is also shown (red). (Top-left panel) $L = 1$. (Top-right panel) $L = 2$. (Bottom-left panel) $L = 4$. (Bottom-right panel) $L = 8$.
Figure 2: Schematic of an SBNN with $L = 3$ layers, and with spatially-varying parameters (SBNN-V). The SBNN-V contains an embedding layer and skip connections that feed the output of the embedding layer into each subsequent layer of the network.
Figure 3: Empirical covariogram of the (S)BNN at different stages during the optimisation (different line-styles denote the empirical covariogram after 100, 200, 400, 2000, and 4000 gradient steps, respectively), and the true covariogram of the target Gaussian process (red). Left: BNN-IL. Right: SBNN-IL.
Figure 4: (Top-left panel) Covariance between the target process (stationary GP) at 16 grid points (crosses), with coordinates as indicated by the labels at the top and the right of the sub-panels, and the target process on a fine gridding $(64 \times 64)$ of $D = [-4, 4] \times [-4, 4]$. (Bottom-left panel) Eight realisations of the underlying target process on the same fine gridding of $D$. (Right panels) Same as left panels but for the calibrated SBNN-IL.
Figure 5: Kernel density plots from 1000 samples of the SBNN-IL and the target process (stationary GP). (Top panel) Univariate densities of the two processes at eight spatial locations arranged on a $2 \times 4$ grid in $D$ (with coordinates as indicated by the labels of the sub-panels). (Bottom panel) Overlaid bivariate densities of the two processes at $\tilde{\textbf{s}}_0 = (-1.33, -0.06)'$ and three other locations in $D$ (with coordinates as indicated by the labels of the sub-panels).
...and 16 more figures
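
The prior construction described in the Figure 1 caption (hidden layers of width 40, tanh activations, weights and biases drawn independently from a standard normal distribution, inputs in $D = [-4, 4]$) can be sketched roughly as follows. The exact forward pass is given by equations that are not reproduced on this page, so the recursion used here, $f_l = W_l \tanh(f_{l-1}) + b_l$ with $f_0(s) = s$ and no width-dependent scaling, is an assumption, as are the function name `sample_bnn_paths`, the grid of 33 locations, and the use of 1000 rather than 4000 paths (chosen only to keep memory use small).

```python
import numpy as np


def sample_bnn_paths(s, n_layers, n_paths, width=40, seed=0):
    """Draw prior sample paths of a tanh network with i.i.d. N(0, 1) parameters.

    Assumed forward pass (not confirmed by the caption): f_0(s) = s and
    f_l = W_l tanh(f_{l-1}) + b_l for l = 1, ..., n_layers, with a scalar output.
    """
    rng = np.random.default_rng(seed)
    s = np.asarray(s, dtype=float)
    # Layer widths: 1-D input, (n_layers - 1) hidden layers, scalar output.
    dims = [1] + [width] * (n_layers - 1) + [1]
    # f has shape (n_paths, n_locations, current_width); all paths share the inputs.
    f = np.broadcast_to(s[None, :, None], (n_paths, s.size, 1))
    for d_in, d_out in zip(dims[:-1], dims[1:]):
        W = rng.standard_normal((n_paths, d_in, d_out))  # one weight matrix per path
        b = rng.standard_normal((n_paths, 1, d_out))     # one bias vector per path
        f = np.tanh(f) @ W + b                           # batched matrix product
    return f[..., 0]  # shape (n_paths, n_locations)


s_grid = np.linspace(-4.0, 4.0, 33)                   # locations in D = [-4, 4]
paths = sample_bnn_paths(s_grid, n_layers=4, n_paths=1000)
mean_path = paths.mean(axis=0)                        # empirical-process mean
cov_matrix = np.cov(paths, rowvar=False)              # empirical covariance between locations
```

Each row of `paths` plays the role of one black curve in the figure and `mean_path` the role of the red curve. Changing `n_layers` to 1, 2, 4, or 8 mirrors the four panels, under the assumed forward pass.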
