Robust Gaussian Processes via Relevance Pursuit
Sebastian Ament, Elizabeth Santorella, David Eriksson, Ben Letham, Maximilian Balandat, Eytan Bakshy
TL;DR
This work makes Gaussian processes (GPs) robust to sparse label corruptions by introducing Robust Gaussian Processes via Relevance Pursuit (RRP), which learns data-point-specific noise variances ρ and uses a greedy relevance-pursuit strategy to identify a sparse set of outliers. A key theoretical contribution is a reparameterization ρ(s) under which the negative log marginal likelihood is strongly convex and smooth, which enables approximation guarantees for the subset-selection process through generalized orthogonal matching pursuit. The framework supports automatic outlier detection via Bayesian model selection and remains compatible with arbitrary kernels and mean functions, providing competitive performance in both regression and Bayesian optimization (BO) under sparse corruptions. Empirically, RRP demonstrates robustness to diverse corruption regimes (constant, uniform, asymmetric, focused) and offers favorable computation times relative to heavy-tailed alternatives, while delivering principled uncertainty estimates. Overall, the method combines theoretical guarantees for robust GP learning with practical applicability to BO and related tasks where data integrity is imperfect.
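To make the summary concrete, the model it describes can be written roughly as follows. This is a sketch under stated assumptions, not the paper's own notation: a zero mean function is assumed for brevity, and the symbols σ² (shared noise variance), K (kernel matrix on the training inputs), C_ρ, and m (the budget on the number of outliers) are introduced here for illustration. The reparameterization ρ(s) is not reproduced.

```latex
\begin{aligned}
y_i &= f(x_i) + \varepsilon_i, \qquad
  \varepsilon_i \sim \mathcal{N}\!\left(0,\; \sigma^2 + \rho_i\right), \qquad \rho_i \ge 0, \\
C_\rho &= K + \sigma^2 I + \operatorname{diag}(\rho), \\
\mathcal{L}(\rho) &= -\tfrac{1}{2}\, y^\top C_\rho^{-1} y
  \;-\; \tfrac{1}{2} \log\det C_\rho
  \;-\; \tfrac{n}{2} \log 2\pi, \\
\hat{\rho} &\in \arg\max_{\rho \ge 0} \; \mathcal{L}(\rho)
  \quad \text{subject to} \quad \|\rho\|_0 \le m .
\end{aligned}
```

A point with ρ_i = 0 is treated as an inlier with the shared noise level, while a large ρ_i inflates that point's noise variance and so reduces its influence on the posterior. The sparsity constraint is what turns hyperparameter fitting into a subset-selection problem.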
Abstract
Gaussian processes (GPs) are non-parametric probabilistic regression models that are popular due to their flexibility, data efficiency, and well-calibrated uncertainty estimates. However, standard GP models assume homoskedastic Gaussian noise, while many real-world applications are subject to non-Gaussian corruptions. Variants of GPs that are more robust to alternative noise models have been proposed, but they entail significant trade-offs between accuracy and robustness, and between computational requirements and theoretical guarantees. In this work, we propose and study a GP model that achieves robustness against sparse outliers by inferring data-point-specific noise levels with a sequential selection procedure, which we refer to as relevance pursuit, that maximizes the log marginal likelihood. We show, surprisingly, that the model can be parameterized such that the associated log marginal likelihood is strongly concave in the data-point-specific noise variances, a property rarely found in either robust regression objectives or GP marginal likelihoods. This in turn implies the weak submodularity of the corresponding subset selection problem, and thereby yields approximation guarantees for the proposed algorithm. We compare the model's performance to that of other approaches on diverse regression and Bayesian optimization tasks, including the challenging but common setting of sparse corruptions of the labels within or close to the function range.
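The sequential selection idea can be illustrated with a minimal NumPy sketch. It is a simplified stand-in, not the authors' implementation. It assumes a zero-mean GP with a fixed RBF kernel and a fixed shared noise variance, caps the number of flagged points with a hypothetical `max_outliers` argument instead of using Bayesian model selection, and replaces joint re-optimization of all hyperparameters with a closed-form update of one noise variance at a time. All function and variable names are illustrative.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=0.2, outputscale=1.0):
    """Squared-exponential kernel matrix between two sets of inputs."""
    sq_dist = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return outputscale * np.exp(-0.5 * sq_dist / lengthscale**2)

def log_marginal_likelihood(K, y, noise, rho):
    """Zero-mean GP log marginal likelihood with per-point variances noise + rho."""
    C = K + np.diag(noise + rho)
    L = np.linalg.cholesky(C)
    alpha = np.linalg.solve(C, y)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (y @ alpha) - 0.5 * log_det - 0.5 * len(y) * np.log(2 * np.pi)

def greedy_relevance_pursuit(K, y, noise, max_outliers):
    """Greedily grow the set of points that receive an extra noise variance.

    With A = C^{-1}, a = diag(A) and q = A @ y, the derivative of the log
    marginal likelihood with respect to rho_i is 0.5 * (q_i**2 - a_i).
    Starting from rho_i = 0 and holding everything else fixed, the
    maximizer is rho_i = (q_i**2 - a_i) / a_i**2 whenever that is positive.
    """
    n = len(y)
    rho = np.zeros(n)
    in_support = np.zeros(n, dtype=bool)
    support = []
    for _ in range(max_outliers):
        A = np.linalg.inv(K + np.diag(noise + rho))
        a = np.diag(A)
        q = A @ y
        grad = 0.5 * (q**2 - a)
        grad[in_support] = -np.inf      # never reselect a flagged point
        i = int(np.argmax(grad))
        if grad[i] <= 0:                # no remaining point improves the objective
            break
        rho[i] = (q[i] ** 2 - a[i]) / a[i] ** 2
        in_support[i] = True
        support.append(i)
    return rho, support

# Toy data: a smooth function with three sparsely corrupted labels.
rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 40)[:, None]
y = np.sin(2 * np.pi * X[:, 0]) + 0.05 * rng.standard_normal(40)
corrupted = [5, 20, 33]
y[corrupted] += np.array([3.0, -3.0, 3.0])

noise = 0.05**2
K = rbf_kernel(X, X)
rho, support = greedy_relevance_pursuit(K, y, noise, max_outliers=3)
print("flagged indices:", sorted(support))
```

Since q_i / a_i is the leave-one-out residual and 1 / a_i the leave-one-out predictive variance, the update sets rho_i to the amount by which the squared residual exceeds the variance the model already predicts. A fuller treatment would re-fit the kernel hyperparameters and all selected variances after each addition and would choose the support size by comparing models rather than fixing it in advance.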
