Sparse Variational Contaminated Noise Gaussian Process Regression with Applications in Geomagnetic Perturbations Forecasting
Daniel Iong, Matthew McAnear, Yuezhou Qu, Shasha Zou, Gabor Toth, Yang Chen
TL;DR
This work addresses robust Gaussian process regression on large-scale data subject to outliers and heteroscedastic noise by introducing a contaminated normal (CN) noise model within the Sparse Variational GP (SVGP) framework. The authors develop a stochastic generalized alternating maximization (SGAM) inference algorithm that jointly estimates the CN and GP hyperparameters, which makes inference scalable to large datasets. In simulations, they demonstrate parameter recovery and greater robustness of the CN-SVGPM model compared with Gaussian, Student-t, and Laplace noise models, particularly as outlier prevalence increases. They validate the approach on real-world datasets, including flight delays and ground magnetic perturbations, and report more informative and well-calibrated predictive intervals without sacrificing accuracy relative to neural network baselines. The method offers a practical, scalable solution for reliable uncertainty quantification in domains with frequent outliers, such as space weather forecasting and transportation systems.
Abstract
Gaussian processes (GPs) have become popular machine-learning methods for kernel-based learning on datasets with complicated covariance structures. In this paper, we present a novel extension to the GP framework that uses a contaminated normal likelihood function to better account for heteroscedastic variance and outlier noise. We propose a scalable inference algorithm based on the Sparse Variational Gaussian Process (SVGP) method for fitting sparse Gaussian process regression models with contaminated normal noise on large datasets. We examine an application to geomagnetic ground perturbations, where the state-of-the-art prediction model is based on neural networks. We show that our approach yields shorter prediction intervals at similar coverage and accuracy when compared to a dense artificial neural network baseline.
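To make the contaminated normal likelihood concrete, the following is a minimal sketch of its log-density in the textbook two-component form: with probability 1 - alpha the residual is drawn from N(0, sigma^2), and with probability alpha from an inflated-variance component N(0, tau * sigma^2) with tau > 1. The function names and the (sigma, alpha, tau) parameterization are assumptions made for illustration; the paper's own notation and parameterization may differ.

```python
import math


def normal_logpdf(r, sd):
    """Log-density of a zero-mean normal with standard deviation sd at r."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(sd) - 0.5 * (r / sd) ** 2


def cn_logpdf(r, sigma, alpha, tau):
    """Log-density of a contaminated normal residual (illustrative sketch).

    Assumed parameterization (not taken from the paper):
      sigma : standard deviation of the regular noise component
      alpha : probability that an observation is an outlier, in [0, 1]
      tau   : variance inflation factor of the outlier component, > 1
    """
    if sigma <= 0.0 or tau <= 1.0 or not (0.0 <= alpha <= 1.0):
        raise ValueError("need sigma > 0, tau > 1 and 0 <= alpha <= 1")

    # Weighted log-densities of the components with non-zero weight.
    terms = []
    if alpha < 1.0:
        terms.append(math.log1p(-alpha) + normal_logpdf(r, sigma))
    if alpha > 0.0:
        terms.append(math.log(alpha) + normal_logpdf(r, sigma * math.sqrt(tau)))

    # Log-sum-exp for numerical stability far out in the tails.
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

Under these assumptions, a residual six standard deviations from the mean is penalized far less by `cn_logpdf(6.0, 1.0, 0.1, 25.0)` than by `normal_logpdf(6.0, 1.0)`, which is the mechanism that keeps outliers from dominating the fit. Setting `alpha = 0` recovers the ordinary Gaussian likelihood.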
