Bayesian Transfer Learning for Artificially Intelligent Geospatial Systems: A Predictive Stacking Approach
Luca Presicce, Sudipto Banerjee
TL;DR
The paper introduces a scalable GeoAI framework that automates probabilistic spatial inference on massive datasets via double Bayesian predictive stacking (dbps). The data are partitioned into subsets that stream into the analysis. Within each subset, conjugate matrix-normal inverse-Wishart (MNIW) posteriors, with the spatial parameters (α, φ) held fixed at values on a grid, give closed-form inference without MCMC; a second stacking step then aggregates the results across subsets. The approach delivers predictive densities in closed form, supports disagreement tempering to curb variance inflation, and achieves predictive performance comparable to full Gaussian processes at substantially lower computational cost, including an amortized inference pathway through neural networks. An application to MODIS vegetation indices shows accurate spatial surfaces and uncertainty quantification at global scales, which points to a practical, automated, and scalable GeoAI solution.
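The two stacking steps can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that log predictive densities on a common set of held-out points are already available for every subset and every grid value of (α, φ), and it replaces the closed-form MNIW predictive densities with simple Gaussian stand-ins. The function names (`stacking_weights`, `mixture_log_density`, `double_stacking`) are hypothetical, and the weights are found with a plain fixed-point (EM-style) update that maximizes the log score over the simplex, which is one of several ways to solve the stacking problem.

```python
import numpy as np

def normal_logpdf(y, mean, sd):
    # Gaussian log density; a stand-in (assumption) for a closed-form predictive density.
    return -0.5 * np.log(2.0 * np.pi) - np.log(sd) - 0.5 * ((y - mean) / sd) ** 2

def stacking_weights(log_pred, n_iter=500, tol=1e-10):
    """Simplex weights maximizing sum_i log(sum_j w_j * p_ij).

    log_pred[i, j] is the log predictive density of held-out point i
    under candidate j (a grid value or a subset).
    """
    log_pred = np.asarray(log_pred, dtype=float)
    # Rescaling each row only shifts the objective by a constant.
    dens = np.exp(log_pred - log_pred.max(axis=1, keepdims=True))
    w = np.full(dens.shape[1], 1.0 / dens.shape[1])
    for _ in range(n_iter):
        mix = dens @ w                                  # mixture density per point
        w_new = w * (dens / mix[:, None]).mean(axis=0)  # EM-style update, stays on the simplex
        converged = np.abs(w_new - w).max() < tol
        w = w_new
        if converged:
            break
    return w

def mixture_log_density(log_pred, w):
    # Log density of the weighted mixture at each held-out point.
    m = log_pred.max(axis=1, keepdims=True)
    return m[:, 0] + np.log(np.exp(log_pred - m) @ w)

def double_stacking(log_pred_by_subset):
    # Step 1: within each subset, stack over the grid of fixed spatial parameters.
    within = [stacking_weights(lp) for lp in log_pred_by_subset]
    # Step 2: stack the subset-level stacked densities across subsets.
    subset_log = np.column_stack(
        [mixture_log_density(lp, w) for lp, w in zip(log_pred_by_subset, within)]
    )
    across = stacking_weights(subset_log)
    return within, across

# Toy usage: three "subsets" (different predictive means) and three "grid values"
# (different predictive scales), all synthetic.
rng = np.random.default_rng(0)
y_holdout = rng.normal(size=200)
toy = [
    np.column_stack([normal_logpdf(y_holdout, mean, sd) for sd in (0.5, 1.0, 2.0)])
    for mean in (0.0, 1.5, 3.0)
]
within_w, across_w = double_stacking(toy)
print("within-subset weights:", [np.round(w, 3) for w in within_w])
print("across-subset weights:", np.round(across_w, 3))
```

In this toy setup the held-out values are drawn from a standard normal, so the subset whose candidates are centered at zero should receive most of the across-subset weight. Evaluating every candidate on one shared held-out set is a simplification made for brevity.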
Abstract
Building artificially intelligent geospatial systems requires rapid delivery of spatial data analysis on massive scales with minimal human intervention. Depending upon its intended use, the data analysis can also involve model assessment and uncertainty quantification. This article devises transfer learning frameworks for deployment in artificially intelligent systems, where a massive data set is split into smaller data sets that stream into the analytical framework to propagate learning and assimilate inference for the entire data set. Specifically, we introduce Bayesian predictive stacking for multivariate spatial data and demonstrate rapid and automated analysis of massive data sets. Furthermore, inference is delivered without human intervention and without excessively demanding hardware settings. We illustrate the effectiveness of our approach through extensive simulation experiments and by producing inference from a massive data set on vegetation indices that is indistinguishable from that obtained with traditional (and more expensive) statistical approaches.
