Table of Contents
Fetching ...

Sparse Gaussian Neural Processes

Tommy Rochussen, Vincent Fortuin

TL;DR

Sparse Gaussian Neural Processes (SGNP) address the interpretability and scalability gap in probabilistic meta-learning by meta-learning sparse Gaussian process inference. The framework uses neural-set-function encoders to amortise GP variational parameters across tasks, yielding posteriors that are interpretable through inducing points and GP priors. A collapsed variant (SGNP) recovers closed-form inducing-point posteriors for Gaussian likelihoods, while classification can use a regression-relabelling approach (s-ConvSGNP). Across synthetic and real data, SGNPs outperform standard neural processes, especially when the number of observed tasks is small or when prior GP knowledge is available, demonstrating strong data efficiency and task-specific priors with practical inference speed.

Abstract

Despite significant recent advances in probabilistic meta-learning, it is common for practitioners to avoid using deep learning models due to a comparative lack of interpretability. Instead, many practitioners simply use non-meta-models such as Gaussian processes with interpretable priors, and conduct the tedious procedure of training their model from scratch for each task they encounter. While this is justifiable for tasks with a limited number of data points, the cubic computational cost of exact Gaussian process inference renders this prohibitive when each task has many observations. To remedy this, we introduce a family of models that meta-learn sparse Gaussian process inference. Not only does this enable rapid prediction on new tasks with sparse Gaussian processes, but since our models have clear interpretations as members of the neural process family, it also allows manual elicitation of priors in a neural process for the first time. In meta-learning regimes for which the number of observed tasks is small or for which expert domain knowledge is available, this offers a crucial advantage.

Sparse Gaussian Neural Processes

TL;DR

Sparse Gaussian Neural Processes (SGNP) address the interpretability and scalability gap in probabilistic meta-learning by meta-learning sparse Gaussian process inference. The framework uses neural-set-function encoders to amortise GP variational parameters across tasks, yielding posteriors that are interpretable through inducing points and GP priors. A collapsed variant (SGNP) recovers closed-form inducing-point posteriors for Gaussian likelihoods, while classification can use a regression-relabelling approach (s-ConvSGNP). Across synthetic and real data, SGNPs outperform standard neural processes, especially when the number of observed tasks is small or when prior GP knowledge is available, demonstrating strong data efficiency and task-specific priors with practical inference speed.

Abstract

Despite significant recent advances in probabilistic meta-learning, it is common for practitioners to avoid using deep learning models due to a comparative lack of interpretability. Instead, many practitioners simply use non-meta-models such as Gaussian processes with interpretable priors, and conduct the tedious procedure of training their model from scratch for each task they encounter. While this is justifiable for tasks with a limited number of data points, the cubic computational cost of exact Gaussian process inference renders this prohibitive when each task has many observations. To remedy this, we introduce a family of models that meta-learn sparse Gaussian process inference. Not only does this enable rapid prediction on new tasks with sparse Gaussian processes, but since our models have clear interpretations as members of the neural process family, it also allows manual elicitation of priors in a neural process for the first time. In meta-learning regimes for which the number of observed tasks is small or for which expert domain knowledge is available, this offers a crucial advantage.

Paper Structure

This paper contains 41 sections, 19 equations, 6 figures, 6 tables.

Figures (6)

  • Figure 1: Computational diagrams of the ConvGNP markou2022practical and the SGNP (ours). $\boldsymbol{\mu}_t$ and $\boldsymbol{\Sigma}_t$ denote the mean and covariance matrix that parameterise the multivariate Gaussian predictive distribution over the target outputs, $\mathbf{y}_t$.
  • Figure 2: Model predictions on a synthetic regression dataset. Orange dots represent data points and the black shaded areas represent predictive distribution densities. The meta-models were trained on just five tasks.
  • Figure 3: Model predictions on a synthetic classification dataset. Red and blue dots represent data points corresponding to opposite classes. The blue and red surface represents the predictive distribution, where each colour indicates a high probability of points belonging to that class. The meta-models were trained on just five tasks.
  • Figure 4: Four example datasets generated in the same way as those used in the meta-dataset used in the synthetic 1D regression experiment.
  • Figure 5: Four example datasets generated in the same way as those used in the meta-dataset used in the synthetic 2D classification experiment.
  • ...and 1 more figures