Residual Deep Gaussian Processes on Manifolds

Kacper Wyrwal, Andreas Krause, Viacheslav Borovitskiy

TL;DR

The paper develops residual deep Gaussian processes on Riemannian manifolds. Each manifold-to-manifold hidden layer is built from a Gaussian vector field (GVF) composed with the exponential map, and the choice of last layer yields manifold-valued, scalar-valued, or vector-field outputs within a single architecture. The approach combines three GVF constructions (projected, coordinate-frame, and Hodge) with scalable inference techniques, namely doubly stochastic variational inference and interdomain inducing variables, to model complex, irregular patterns in manifold data while remaining robust to overfitting and well calibrated. Experiments on synthetic benchmarks, geometry-aware Bayesian optimisation, wind-field interpolation on the globe, and inference speed-ups for Euclidean data show that residual deep GPs can outperform shallow geometry-aware GPs, especially in nonstationary or highly structured settings, and can speed up inference when Euclidean data can be mapped to a proxy manifold well enough. The work points to applications in climate modelling, robotics, and beyond, and to future work on optimisation and on proxy-manifold mappings for Euclidean data.

Abstract

We propose practical deep Gaussian process models on Riemannian manifolds, similar in spirit to residual neural networks. With manifold-to-manifold hidden layers and an arbitrary last layer, they can model manifold- and scalar-valued functions, as well as vector fields. We target data inherently supported on manifolds, which is too complex for shallow Gaussian processes thereon. For example, while the latter perform well on high-altitude wind data, they struggle with the more intricate, nonstationary patterns at low altitudes. Our models significantly improve performance in these settings, enhancing prediction quality and uncertainty calibration, and remain robust to overfitting, reverting to shallow models when additional complexity is unneeded. We further showcase our models on Bayesian optimisation problems on manifolds, using stylised examples motivated by robotics, and obtain substantial improvements in later stages of the optimisation process. Finally, we show our models to have potential for speeding up inference for non-manifold data, when, and if, it can be mapped to a proxy manifold well enough.
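
To make the idea of a manifold-to-manifold hidden layer concrete, here is a minimal sketch on the unit sphere. It is not the authors' implementation. It assumes a layer of the form x ↦ exp_x(v(x)), where v is a tangent vector field obtained by projecting an ambient R^3-valued function onto the tangent space (in the spirit of the projected construction). The random-Fourier-feature function `make_ambient_field` is a hypothetical stand-in for a Gaussian process sample, and all helper names are illustrative.

```python
import math
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def project_to_tangent(x, g):
    """Project an ambient vector g in R^3 onto the tangent space of S^2 at x."""
    c = dot(g, x)
    return [gi - c * xi for gi, xi in zip(g, x)]

def sphere_exp(x, v):
    """Exponential map on the unit sphere: follow the geodesic from x along v."""
    r = norm(v)
    if r < 1e-12:
        return list(x)  # zero tangent vector: the point does not move
    y = [math.cos(r) * xi + math.sin(r) * vi / r for xi, vi in zip(x, v)]
    n = norm(y)  # renormalise to counter floating-point drift
    return [yi / n for yi in y]

def make_ambient_field(rng, num_features=32, scale=0.3):
    """Hypothetical stand-in for a GP sample: random Fourier features R^3 -> R^3."""
    freqs = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(num_features)]
    phases = [rng.uniform(0, 2 * math.pi) for _ in range(num_features)]
    weights = [[rng.gauss(0, 1) for _ in range(num_features)] for _ in range(3)]
    amp = scale * math.sqrt(2.0 / num_features)

    def field(x):
        feats = [math.cos(dot(w, x) + b) for w, b in zip(freqs, phases)]
        return [amp * dot(row, feats) for row in weights]

    return field

def residual_layer(x, field):
    """One hidden layer: move x along the projected vector field via the exponential map."""
    v = project_to_tangent(x, field(x))
    return sphere_exp(x, v)

def deep_map(x, fields):
    """Compose hidden layers; the output stays on the sphere."""
    for field in fields:
        x = residual_layer(x, field)
    return x

rng = random.Random(0)
fields = [make_ambient_field(rng) for _ in range(3)]
x0 = [0.0, 0.0, 1.0]
xL = deep_map(x0, fields)
```

When every vector field is zero, each layer in this sketch is the identity map, which mirrors the residual behaviour of falling back to a shallow model when extra depth is not needed.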

Paper Structure

This paper contains 62 sections, 30 equations, 21 figures.

Figures (21)

  • Figure 1: Schematic illustration of a scalar-valued residual deep GP with $L$ hidden layers. The last layer is a scalar-valued GP on the manifold. If it is not present, the model is manifold-valued. If it is replaced with a Gaussian vector field (GVF), the model is a vector field on the manifold.
  • Figure 2: Gaussian vector field constructions on the sphere. In the coordinate-frame panel, orange vectors depict the frame.
  • Figure 3: NLPD of different residual deep GP variants and the baseline model, on the regression problem for the synthetic benchmark function visualised in the target-function figure. Different subplots correspond to different training set sizes $N$. The solid lines represent the mean, while the shaded areas represent the $\pm 1$ standard deviation region around it. All statistics are computed over $5$ randomised runs.
  • Figure 4: The irregular benchmark function, and Bayesian optimisation performance comparison. The target functions for Bayesian optimisation are: the aforementioned benchmark function, modified to have a single global minimum ($\mathbb{S}_2$ Irregular), and the smooth Ackley function on the $3$-sphere ($\mathbb{S}_3$ Ackley). In the log-regret panel, the solid lines represent the median regret, while the shaded areas around them span $\pm 1$ standard deviation. The statistics are computed over $15$ randomised runs.
  • Figure 5: Using residual deep GPs for probabilistic wind velocity modelling on the surface of Earth.
  • ...and 16 more figures