DeepRV: Accelerating spatiotemporal inference with pre-trained neural priors
Jhonathan Navott, Daniel Jenson, Seth Flaxman, Elizaveta Semenova
TL;DR
DeepRV addresses the cubic scaling of Gaussian processes (GPs) for spatiotemporal data by learning a decoder-only neural surrogate that maps kernel parameters and a latent draw to GP-like realizations, achieving $O(N^2)$ inference while preserving probabilistic fidelity. The surrogate is trained to reproduce GP draws and is deployed with architectures including an MLP, a gMLP, and a transformer with kernel-attention bias. It attains GP-level predictive accuracy and hyperparameter recovery with substantial speedups (up to roughly 25x) on large datasets. The approach supports non-separable spatiotemporal kernels and city-scale applications (e.g., London LSOA, $n\approx 5{,}000$), functioning as a drop-in GP prior in probabilistic programming frameworks. Ablations show that decoder-only designs and the gMLP offer favorable accuracy-efficiency trade-offs, while transformer-based variants extend to variable-location inputs at higher compute cost. Limitations include pretraining cost and a deterministic emulator assumption; future work aims to reduce pretraining time and broaden applicability.
Abstract
Gaussian Processes (GPs) provide a flexible and statistically principled foundation for modelling spatiotemporal phenomena, but their $O(N^3)$ scaling makes them intractable for large datasets. Approximate methods such as variational inference (VI), inducing points (sparse GPs), low-rank factorizations (RFFs), and local factorizations and approximations (INLA) improve scalability but trade off accuracy or flexibility. We introduce DeepRV, a neural-network surrogate that closely matches full GP accuracy, including hyperparameter estimates, while reducing computational complexity to $O(N^2)$, which increases scalability and inference speed. DeepRV serves as a drop-in replacement for GP prior realisations in, for example, MCMC-based probabilistic programming pipelines, preserving full model flexibility. Across simulated benchmarks, non-separable spatiotemporal GPs, and a real-world application to education deprivation in London ($n = 4{,}994$ locations), DeepRV achieves the highest fidelity to exact GPs while substantially accelerating inference. Code is provided in the accompanying ZIP archive, and all experiments were run on a single consumer-grade GPU to ensure accessibility for practitioners.
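To make the surrogate idea concrete, the sketch below builds one training pair in NumPy: an exact GP draw $f = Lz$, where $L$ is the Cholesky factor of the kernel matrix (the $O(N^3)$ step), and the output of a small decoder that receives the same latent draw $z$ together with the kernel parameter. It is illustrative only and is not taken from the accompanying code. The RBF kernel, the 1-D grid, the one-hidden-layer MLP, the mean-squared-error loss, and all function names (`rbf_kernel`, `gp_draw`, `init_decoder`, `decoder`) are assumptions made for this example.

```python
import numpy as np

def rbf_kernel(x, lengthscale, jitter=1e-5):
    """Squared-exponential kernel matrix on 1-D locations x (shape [N])."""
    d2 = (x[:, None] - x[None, :]) ** 2
    return np.exp(-0.5 * d2 / lengthscale**2) + jitter * np.eye(len(x))

def gp_draw(x, lengthscale, z):
    """Exact GP draw f = L z; the Cholesky factorisation costs O(N^3)."""
    L = np.linalg.cholesky(rbf_kernel(x, lengthscale))
    return L @ z

def init_decoder(n, hidden, rng):
    """Hypothetical one-hidden-layer MLP whose input is [z, lengthscale]."""
    return {
        "W1": rng.normal(0.0, 1.0 / np.sqrt(n + 1), (n + 1, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(hidden), (hidden, n)),
        "b2": np.zeros(n),
    }

def decoder(params, z, lengthscale):
    """Surrogate draw using only dense matrix-vector products.

    With hidden width proportional to N, the cost is O(N^2) per draw.
    """
    inp = np.concatenate([z, [lengthscale]])
    h = np.tanh(inp @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

rng = np.random.default_rng(0)
n = 64
x = np.linspace(0.0, 1.0, n)
params = init_decoder(n, hidden=n, rng=rng)

# One training pair: sample a kernel parameter and a latent draw,
# compute the exact GP realisation, and compare with the decoder output.
lengthscale = rng.uniform(0.1, 0.5)
z = rng.standard_normal(n)
f_target = gp_draw(x, lengthscale, z)
f_hat = decoder(params, z, lengthscale)

# Assumed training objective: mean squared error between the two draws.
loss = np.mean((f_hat - f_target) ** 2)
```

In a setup like this, the Cholesky factorisation is needed only while generating training targets. Once the decoder is trained, a probabilistic program would sample $z$ and the kernel parameters and call `decoder` in place of `gp_draw`, which is where the reduction from $O(N^3)$ to $O(N^2)$ per evaluation would come from.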
