Approximately Equivariant Neural Processes

Matthew Ashman; Cristiana Diaconu; Adrian Weller; Wessel Bruinsma; Richard E. Turner

TL;DR

This paper considers the use of approximately equivariant architectures in neural processes (NPs), a popular family of meta-learning models, and shows that approximately equivariant NP models can outperform both their non-equivariant and strictly equivariant counterparts.

Abstract

Equivariant deep learning architectures exploit symmetries in learning problems to improve the sample efficiency of neural-network-based models and their ability to generalise. However, when modelling real-world data, learning problems are often not exactly equivariant, but only approximately. For example, when estimating the global temperature field from weather station observations, local topographical features like mountains break translation equivariance. In these scenarios, it is desirable to construct architectures that can flexibly depart from exact equivariance in a data-driven way. Current approaches to achieving this cannot usually be applied out-of-the-box to any architecture and symmetry group. In this paper, we develop a general approach to achieving this using existing equivariant architectures. Our approach is agnostic to both the choice of symmetry group and model architecture, making it widely applicable. We consider the use of approximately equivariant architectures in neural processes (NPs), a popular family of meta-learning models. We demonstrate the effectiveness of our approach on a number of synthetic and real-world regression experiments, showing that approximately equivariant NP models can outperform both their non-equivariant and strictly equivariant counterparts.
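To make the idea of departing from exact equivariance concrete, the following is a minimal sketch and not the authors' implementation. It assumes, based on the figure captions in this summary, that an approximately equivariant model can be built by giving an equivariant layer some extra fixed inputs, and that dropping those inputs yields a strictly equivariant model. The circular convolution, the helper names (`circular_conv`, `shift`, `make_model`, `equivariance_error`), and all numerical values are hypothetical choices made for illustration only.

```python
import math

# Illustrative sketch only: names and structure are assumptions, not the paper's code.

def circular_conv(signal, kernel):
    """Circular 1-D convolution; it commutes with cyclic shifts of `signal`."""
    n = len(signal)
    return [
        sum(w * signal[(i - k) % n] for k, w in enumerate(kernel))
        for i in range(n)
    ]

def shift(signal, s):
    """Cyclically translate a signal by `s` grid positions."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]

def make_model(kernel, fixed_input, fixed_kernel):
    """Equivariant layer plus a term driven by a fixed, data-independent input."""
    # The offset is computed once and does not move when the data is shifted.
    offset = circular_conv(fixed_input, fixed_kernel)

    def model(signal):
        return [a + b for a, b in zip(circular_conv(signal, kernel), offset)]

    return model

def equivariance_error(model, signal, s):
    """Largest gap between model(shift(x)) and shift(model(x))."""
    lhs = model(shift(signal, s))
    rhs = shift(model(signal), s)
    return max(abs(a - b) for a, b in zip(lhs, rhs))

n = 16
signal = [math.sin(0.4 * i) for i in range(n)]
kernel = [0.5, -1.0, 0.25]
fixed_input = [float(i % 5) for i in range(n)]  # stands in for learned fixed inputs
zeros = [0.0] * n                               # "without using the fixed inputs"

approx_model = make_model(kernel, fixed_input, [1.0, 1.0])
strict_model = make_model(kernel, zeros, [1.0, 1.0])

err_approx = equivariance_error(approx_model, signal, s=3)
err_strict = equivariance_error(strict_model, signal, s=3)
print("equivariance error with fixed inputs:", err_approx)
print("equivariance error without fixed inputs:", err_strict)
```

In this toy setting the model with non-zero fixed inputs violates translation equivariance, while zeroing them gives a model that commutes with shifts. In a learned model the fixed inputs would presumably be trained from data, which is what would let the departure from equivariance be data-driven.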

Paper Structure (68 sections, 7 theorems, 37 equations, 6 figures, 9 tables, 6 algorithms)

Introduction
Background
Neural Processes
Group Equivariance
Group-Equivariant Conditional Neural Processes
Equivariant Decomposition of Non-Equivariant Functions
Setup.
Approximately Equivariant Neural Processes
Recovering equivariance out-of-distribution.
Related Work
Group equivariance and equivariant neural processes.
Approximate group equivariance.
Experiments
Synthetic 1-D Regression With the Gibbs Kernel
Smoke Plumes
...and 53 more sections

Key Result

Proposition 1

The ground-truth stochastic process $P$ is $G$-stationary and $\pi^{\prime}_P$ is $G$-invariant if, and only if, $\pi_{P}$ is $G$-equivariant.
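For readers unfamiliar with the terminology in Proposition 1, the standard notions of equivariance and invariance can be written as follows. This is a sketch using generic textbook notation; the spaces $\mathcal{X}$ and $\mathcal{Y}$, the map $f$, and the translation example are assumptions for illustration, and the paper's own Definition 1 may be phrased differently.

```latex
% Standard definitions, assuming a group G acting on spaces X and Y.
\begin{align*}
  &f \colon \mathcal{X} \to \mathcal{Y} \text{ is } G\text{-equivariant if }
    f(g \cdot x) = g \cdot f(x)
    \quad \text{for all } g \in G,\ x \in \mathcal{X}, \\
  &f \colon \mathcal{X} \to \mathcal{Y} \text{ is } G\text{-invariant if }
    f(g \cdot x) = f(x)
    \quad \text{for all } g \in G,\ x \in \mathcal{X}.
\end{align*}
% Example (translations): for \tau \in \mathbb{R}^d acting on inputs by
% \tau \cdot x = x + \tau, a dataset D = \{(x_n, y_n)\}_{n=1}^{N} is shifted as
% \tau \cdot D = \{(x_n + \tau, y_n)\}_{n=1}^{N}.
```

Invariance is the special case of equivariance in which the group acts trivially on the output space.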

Figures (6)

Figure 1: A comparison between the predictive distributions on a single synthetic 1-D regression dataset of the TNP-, ConvCNP-, and EquivCNP-based models. For the approximately equivariant models, we plot both the model's predictive distribution (blue), as well as the predictive distributions obtained without using the fixed inputs (red). The dotted black lines indicate the target range.
Figure 2: A comparison between the predictive distributions of the equivariant (left column) and approximately equivariant (middle column) components of the PT-TNP (${\widetilde{T}}$) and EquivCNP (${\widetilde{E}}$) models on a single (cropped) test dataset from the 2-D environmental data experiment.
Figure 3: A comparison between the predictive distributions on a single synthetic 1-D regression dataset of the TNP-, ConvCNP-, and EquivCNP-based models with different inductive biases (non-equivariant, equivariant, or approximately equivariant). Unlike in Figure 1, the context range only spans the low-lengthscale region. For the approximately equivariant models, we plot both the model prediction (blue), as well as the predictions obtained without using the fixed inputs, which results in a strictly equivariant model (red). The approximately equivariant models are the only ones able to correctly capture the uncertainties around the lengthscale change point ($x=0$).
Figure 4: A comparison between the predictive distributions on a single synthetic 1-D regression dataset of the TNP-, ConvCNP-, and EquivCNP-based models with different inductive biases (non-equivariant, equivariant, or approximately equivariant). The context range only spans the high-lengthscale region. For the approximately equivariant models, we plot both the model prediction (blue), as well as the predictions obtained without using the fixed inputs, which results in a strictly equivariant model (red). Both the strictly and approximately equivariant models output predictions that closely resemble the ground truth, but the non-equivariant TNP model completely fails to generalise.
Figure 5: Examples of smoke simulations from the smoke plume dataset for six different combinations of smoke radius $r$ and buoyancy $B$. For each such combination, we show the resulting state for all of the three possible x-axis locations. The inputs to our models are randomly sampled $32 \times 32$ patches (indicated in red) from the $128 \times 128$ states.
...and 1 more figure

Theorems & Definitions (14)

Definition 1: $G$-equivariance
Definition 2: $G$-stationary stochastic process
Proposition 1: $G$-stationarity and $G$-equivariance
proof
Definition 3: $G$-equivariant CNP
Theorem 1: Representation of $G$-equivariant CNPs (Theorem 2 of Kawano et al., 2021)
Proposition 2: Finite-rank approximation of compact operators (e.g., Corollary 6.2 of Brezis, 2011)
Theorem 2: Approximation of non-equivariant linear operators
proof
Theorem 3: Approximation of non-equivariant operators
...and 4 more
