The Connection between Kriging and Large Neural Networks

Marius Marinescu

TL;DR

This paper investigates the relationship between Kriging, Gaussian process (GP) regression, and neural networks. It shows that Kriging predictions coincide with the MAP estimate in GP regression, and that a single-hidden-layer multilayer perceptron (MLP) converges to a Gaussian process in the infinite-width limit, with a kernel given by $K(\mathbf{x}, \mathbf{x}') = \mathbb{E}_{\mathbf{a}}[h(\mathbf{x}; \mathbf{a}) h(\mathbf{x}'; \mathbf{a})]$, where $h$ is the transfer function and the expectation is taken over the random hidden-unit parameters $\mathbf{a}$. It catalogs the concrete kernels that arise from common transfer functions (e.g., linear, squared exponential, arc-cosine) and discusses non-stationarity, while outlining extensions to deeper networks via neural network Gaussian process (NNGP) kernels and to training dynamics via the neural tangent kernel (NTK). The work offers a unified probabilistic-kernel viewpoint that blends spatial statistics with kernel methods and deep learning, with the aim of making ML models more interpretable, uncertainty-aware, and spatially aware.
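
As an illustration of the kernel formula above, the following minimal sketch estimates $K(\mathbf{x}, \mathbf{x}')$ by Monte Carlo and compares it with a known closed form. It rests on assumptions that are not taken from the paper: the hidden unit is $h(\mathbf{x}; \mathbf{a}) = \max(0, \mathbf{a}^\top \mathbf{x})$ with no bias, the parameters are drawn as $\mathbf{a} \sim \mathcal{N}(\mathbf{0}, I)$, and the function names `mc_kernel`, `relu`, and `arccos_kernel_deg1` are hypothetical. Under these assumptions the expectation equals one half of the degree-1 arc-cosine kernel of Cho and Saul.

```python
import numpy as np


def mc_kernel(x, x_prime, h, n_samples=200_000, seed=0):
    """Monte Carlo estimate of K(x, x') = E_a[h(a.x) h(a.x')].

    Assumption (not from the paper): a ~ N(0, I) and no bias term.
    """
    rng = np.random.default_rng(seed)
    # One row per sampled hidden unit, one column per input dimension.
    a = rng.standard_normal((n_samples, x.shape[0]))
    return float(np.mean(h(a @ x) * h(a @ x_prime)))


def relu(z):
    """ReLU transfer function applied to the pre-activation z."""
    return np.maximum(z, 0.0)


def arccos_kernel_deg1(x, x_prime):
    """Closed form of E[relu(a.x) relu(a.x')] for a ~ N(0, I).

    This is half of the degree-1 arc-cosine kernel:
    |x||x'| (sin(t) + (pi - t) cos(t)) / (2 pi), t = angle(x, x').
    """
    nx, nxp = np.linalg.norm(x), np.linalg.norm(x_prime)
    # Clip guards against round-off pushing the cosine outside [-1, 1].
    cos_theta = np.clip(x @ x_prime / (nx * nxp), -1.0, 1.0)
    theta = np.arccos(cos_theta)
    return float(
        nx * nxp * (np.sin(theta) + (np.pi - theta) * np.cos(theta)) / (2 * np.pi)
    )


x = np.array([1.0, 0.5])
x_prime = np.array([-0.3, 2.0])
print("Monte Carlo estimate:", mc_kernel(x, x_prime, relu))
print("Closed form:         ", arccos_kernel_deg1(x, x_prime))
```

The two printed values should agree up to Monte Carlo error, which shrinks as `n_samples` grows. Swapping `relu` for another transfer function gives an estimate of the corresponding kernel under the same sampling assumption.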

Abstract

AI has impacted many disciplines and is now ubiquitous. Spatial statistics in particular is at a pivotal moment in which it will increasingly intertwine with AI. In this scenario, a relevant question is what relationship, if any, spatial statistics models have with machine learning (ML) models. In this paper, we explore the connections between Kriging and neural networks. At first glance, they may appear unrelated: Kriging and its ML counterpart, Gaussian process regression, are grounded in probability theory and stochastic processes, whereas many ML models are widely regarded as black-box models. Nevertheless, they are strongly related. We study their connections and revisit the relevant literature. Understanding these relations and combining both perspectives may enhance ML techniques by making them more interpretable, reliable, and spatially aware.

Paper Structure (10 sections, 16 equations, 2 figures, 1 table)

Figures (2)

  • Figure 1: An MLP architecture with one hidden layer and a transfer function $h$.
  • Figure 2: Samples from GP and NN. The length scale $\sigma$ is chosen to be 1.