Scalable and Robust Spatial Prediction via Multi-Resolution Ensembles of Predictive Processes

Nicolas Bianco; Nadja Klein

Abstract

Gaussian processes provide a flexible framework for spatial prediction, but their computational cost limits their applicability to data with a large sample size $n$. Predictive processes (PPs), a popular low-rank approximation, mitigate this burden by projecting the original process onto a reduced set of $m\ll n$ inducing points. However, existing theory requires $m$ to grow with $n$, creating a trade-off between accuracy and computational efficiency. We address this challenge by introducing an ensemble of PPs based on spatial partitioning, and propose a novel partitioning and patching scheme with desirable properties. By generalizing the convergence results of PPs, we make it possible to explicitly balance scalability and accuracy: increasing the number of ensemble components slows convergence but substantially improves computational efficiency. We further show theoretically that, despite the limited approximation accuracy of PPs with fixed $m$, they are asymptotically robust to data contamination. Motivated by this insight, we finally introduce a multi-resolution ensemble that combines PPs with fixed $m$ with multiple ensembles defined over possibly overlapping coarse-to-fine partitions. Simulations and large-scale geostatistical applications demonstrate that our approach delivers accurate, robust predictions with computational gains, providing a practical and broadly applicable solution for spatial prediction.
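To make the low-rank projection in the abstract concrete, the following is a minimal sketch of the standard predictive-process posterior mean in one spatial dimension, $\mu(s_0) = c(s_0, S^*)\,\big(\tau^2 C^* + C_{*n} C_{n*}\big)^{-1} C_{*n}\, y$, where $S^*$ denotes the $m$ inducing points. The exponential covariance, the noise variance `noise_var`, the jitter term, and all function names are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np


def exp_kernel(a, b, length_scale=0.2, variance=1.0):
    """Exponential (Matern nu = 1/2) covariance between 1-D location arrays a and b."""
    d = np.abs(a[:, None] - b[None, :])
    return variance * np.exp(-d / length_scale)


def pp_predict_mean(s_train, y, s_new, knots, noise_var=0.1):
    """Predictive-process posterior mean at s_new, using m = len(knots) inducing points.

    Only an m x m linear system is solved, so the cost is O(n m^2)
    instead of the O(n^3) of a full Gaussian process.
    """
    C_mm = exp_kernel(knots, knots)      # m x m covariance among inducing points
    C_mn = exp_kernel(knots, s_train)    # m x n cross-covariance
    C_0m = exp_kernel(s_new, knots)      # n_new x m cross-covariance
    A = noise_var * C_mm + C_mn @ C_mn.T
    A += 1e-8 * np.eye(len(knots))       # small jitter for numerical stability (assumption)
    return C_0m @ np.linalg.solve(A, C_mn @ y)


# Illustrative usage on synthetic data (hypothetical values).
rng = np.random.default_rng(0)
s = np.linspace(0.0, 1.0, 200)
y = np.sin(6.0 * s) + 0.1 * rng.standard_normal(s.size)
s0 = np.array([0.25, 0.5, 0.75])
mu_pp = pp_predict_mean(s, y, s0, knots=np.linspace(0.0, 1.0, 10))
```

When the inducing points coincide with the observed locations ($m = n$), this expression reduces to the full Gaussian process posterior mean, which gives a convenient sanity check for the sketch.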

Paper Structure (26 sections, 4 theorems, 18 equations, 5 figures, 5 tables)


Introduction
Spatial predictive processes
Choosing the inducing points in predictive processes
Limitations of PPs with SPs
Ensemble of predictive processes
Spatial partitioning, local weights, and continuity
Theoretical properties
Theoretical and practical implications.
Multi-resolution ensemble of predictive processes
Robustness of PPs with fixed $m$
Multi-resolution ensemble of PPs
Tuning MREPP
Empirical evaluation
Simulation study
Data generation.
...and 11 more sections

Key Result

Theorem 1

Under Assumptions ass:regularity--ass:partitioning, if $\gamma>2$, the predictive mean $\mu_{K,m}$ of the EPP satisfies:

Figures (5)

Figure 1: Graphical illustration of the PP with fixed $m$ (blue box), the EPP (orange boxes for two distinct spatial partitions), and the MREPP (ensemble over EPPs of coarse-to-fine partitions in green box), together with their properties investigated in this paper. The PP consists of a single resolution ($L=1$) and a single ensemble member ($K_1=1$). The spatial domain is approximated based on a fixed number $m_1=m$ of inducing points. While the PP is scalable and more robust, it has limited prediction accuracy. The EPP, in contrast, partitions the space into $K$ subregions with $m$ inducing points each, where each subregion forms an ensemble member. This improves prediction accuracy compared to the PP at the price of a lack of robustness. As a solution, we propose MREPP, which leverages both the PP and the EPP by combining inference over $L>1$ resolutions, with a coarse-to-fine structure of spatial domain partitions, where $K_L>K_{L-1}>\ldots>K_1$. At each resolution $l$, for $l=1,\ldots,L$, a corresponding EPP on $K_l$ subregions is fitted. Computations can be parallelized, yielding an approach that is simultaneously accurate, scalable, and robust.
Figure 2: Sensitivity of the smoothness parameter $\gamma$. Estimates of the smoothness parameter $\gamma$ are shown for different values of the Matérn parameter $\nu$ and increasing separation radius $r_S$.
Figure 3: Scenarios 1 and 2: Point and probabilistic predictions. The RMSE (top panels) and the LPS (bottom panels) are shown for a fixed spatial domain (Scenario 1) and an enlarging domain with fixed separation radius (Scenario 2), for increasing sample sizes $n\in\{1000,5000,10000\}$. For both metrics, a smaller value indicates better performance.
Figure 4: Scenario 3: Point and probabilistic predictions. The RMSE (top panels) and the logarithm of the LPS (bottom panels) are shown for increasing levels of contamination ($x$-axis) and sample sizes $n\in\{5000,7500,10000\}$.
Figure 5: Scenario 3: Estimated resolution weights for MREPP($L=6$). The estimated distribution of weights $p(l)$ in MREPP is shown for increasing levels of contamination and sample sizes. Each boxplot shows the distribution of $p(l)$ over the replicates for $l=1,\ldots,L$; for a given level of contamination, the boxplots are sorted from $l=1$ (coarsest resolution, left) to $l=L$ (finest resolution, right).
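The partition-based ensemble described in the Figure 1 caption (one PP with $m$ inducing points in each of $K$ subregions) can be illustrated with a deliberately simplified sketch. It assumes a one-dimensional domain split into $K$ equal intervals, an exponential covariance, and a hard assignment of each prediction location to the subregion that contains it. The paper's local weights, patching scheme, and multi-resolution weights $p(l)$ are not reproduced here, and every function name and default value below is hypothetical.

```python
import numpy as np


def exp_kernel(a, b, length_scale=0.2):
    """Exponential covariance between 1-D location arrays a and b (assumed kernel)."""
    return np.exp(-np.abs(a[:, None] - b[None, :]) / length_scale)


def pp_mean(s, y, s_new, knots, noise_var=0.1):
    """Standard predictive-process posterior mean with inducing points `knots`."""
    C_mm = exp_kernel(knots, knots)
    C_mn = exp_kernel(knots, s)
    A = noise_var * C_mm + C_mn @ C_mn.T + 1e-8 * np.eye(len(knots))
    return exp_kernel(s_new, knots) @ np.linalg.solve(A, C_mn @ y)


def epp_mean_hard(s, y, s_new, K=4, m=5, lo=0.0, hi=1.0, noise_var=0.1):
    """Hard-partition ensemble: one local PP per interval, no patching between regions."""
    edges = np.linspace(lo, hi, K + 1)
    # Index of the interval containing each location; clip so s == hi lands in the last one.
    region = np.clip(np.searchsorted(edges, s, side="right") - 1, 0, K - 1)
    region_new = np.clip(np.searchsorted(edges, s_new, side="right") - 1, 0, K - 1)
    out = np.zeros(len(s_new))  # regions without training data keep the prior mean 0
    for k in range(K):
        train, test = region == k, region_new == k
        if not train.any() or not test.any():
            continue
        knots = np.linspace(edges[k], edges[k + 1], m)  # m inducing points per region
        out[test] = pp_mean(s[train], y[train], s_new[test], knots, noise_var)
    return out


# Illustrative usage on synthetic data (hypothetical values).
rng = np.random.default_rng(0)
s = np.sort(rng.uniform(0.0, 1.0, 400))
y = np.sin(12.0 * s) + 0.1 * rng.standard_normal(s.size)
s0 = np.linspace(0.05, 0.95, 7)
mu_epp = epp_mean_hard(s, y, s0, K=4, m=5)
```

Because each local fit only touches the data in its own interval, the $K$ fits are independent and could be run in parallel. A hard assignment like this one is generally discontinuous at the interval boundaries, which is the kind of issue a patching scheme is meant to address.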

Theorems & Definitions (8)

Definition 1: Ensemble of Predictive Processes
Theorem 1
Corollary 1
Definition 2: Predictive robustness
Proposition 1: GP influence
Proposition 2: PP influence
Remark 1
Definition 3: Multi-Resolution Ensemble of Predictive Processes
