Residual Deep Gaussian Processes on Manifolds
Kacper Wyrwal, Andreas Krause, Viacheslav Borovitskiy
TL;DR
The paper develops residual deep Gaussian processes on Riemannian manifolds. Each manifold-to-manifold hidden layer is built from a Gaussian vector field (GVF) composed with the exponential map, and the last layer can be chosen freely, so a single architecture covers manifold-valued, scalar-valued, and vector-valued outputs. The approach combines three GVF constructions (Projected, Coordinate-frame, and Hodge) with two scalable inference techniques, doubly stochastic variational inference and interdomain inducing variables. This lets it capture complex, irregular patterns in data that lives on a manifold while remaining robust to overfitting and well calibrated in its uncertainty estimates. Experiments cover synthetic benchmarks, geometry-aware Bayesian optimization tasks, wind-field interpolation on the globe, and accelerated inference for Euclidean data. They show that residual deep GPs can outperform shallow geometry-aware GPs, especially in nonstationary or highly structured settings, and can offer practical speedups when Euclidean data can be mapped well enough to a proxy manifold. The work highlights potential applications in climate modelling, robotics, and beyond, and points to future work on optimization and on proxy-manifold mappings for Euclidean data.
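To make the layer construction concrete, the following is a minimal sketch of a residual layer on the unit sphere: an ambient vector field is projected onto the tangent spaces and then pushed through the exponential map. It is an illustration under stated assumptions, not the authors' implementation. The ambient field is a random Fourier feature draw standing in for a Gaussian process sample, no variational inference is performed, and all function names and parameter values (`sample_ambient_field`, `residual_layer`, `scale=0.3`, and so on) are hypothetical.

```python
import numpy as np


def project_to_tangent(x, u):
    """Project ambient vectors u onto the tangent spaces of the unit sphere at x."""
    return u - np.sum(u * x, axis=-1, keepdims=True) * x


def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along v."""
    norm = np.linalg.norm(v, axis=-1, keepdims=True)
    safe = np.where(norm > 1e-12, norm, 1.0)  # avoid division by zero when v = 0
    return np.cos(norm) * x + np.sin(norm) * v / safe


def sample_ambient_field(rng, num_features=64, lengthscale=1.0, scale=0.3):
    """Assumption: a random Fourier feature draw that approximates a sample
    from an R^3-valued GP with an RBF kernel (a stand-in for a real GVF)."""
    W = rng.normal(size=(num_features, 3)) / lengthscale
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    A = rng.normal(size=(num_features, 3))

    def field(x):
        phi = np.sqrt(2.0 / num_features) * np.cos(x @ W.T + b)
        return scale * (phi @ A)

    return field


def residual_layer(x, field):
    """One hidden layer: x -> exp_x(v(x)), with v the projected vector field."""
    return sphere_exp(x, project_to_tangent(x, field(x)))


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))
x /= np.linalg.norm(x, axis=-1, keepdims=True)  # five points on the sphere

h = x
for _ in range(3):  # compose three hidden layers
    h = residual_layer(h, sample_ambient_field(rng))
```

Because the projected vector is tangent at `x`, every layer output stays on the sphere, so layers can be stacked. When the vector field is zero the layer reduces to the identity map, which is the sense in which the construction is residual.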
Abstract
We propose practical deep Gaussian process models on Riemannian manifolds, similar in spirit to residual neural networks. With manifold-to-manifold hidden layers and an arbitrary last layer, they can model manifold- and scalar-valued functions, as well as vector fields. We target data inherently supported on manifolds, which is too complex for shallow Gaussian processes thereon. For example, while the latter perform well on high-altitude wind data, they struggle with the more intricate, nonstationary patterns at low altitudes. Our models significantly improve performance in these settings, enhancing prediction quality and uncertainty calibration, and remain robust to overfitting, reverting to shallow models when additional complexity is unneeded. We further showcase our models on Bayesian optimisation problems on manifolds, using stylised examples motivated by robotics, and obtain substantial improvements in later stages of the optimisation process. Finally, we show our models to have potential for speeding up inference for non-manifold data, when, and if, it can be mapped to a proxy manifold well enough.
