Transformer Neural Processes: Uncertainty-Aware Meta Learning Via Sequence Modeling
Tung Nguyen, Aditya Grover
TL;DR
Transformer Neural Processes (TNPs) recast uncertainty-aware meta-learning as sequence modeling: a transformer backbone, trained with an autoregressive likelihood objective, predicts target values conditioned on context points. The architecture is invariant to the ordering of context points and equivariant to the ordering of target points, and it comes in variants whose predictive covariance is either diagonal or non-diagonal (Cholesky or low-rank). Training uses this likelihood directly rather than a latent-variable ELBO. TNPs achieve state-of-the-art results on benchmarks in meta-regression, image completion, contextual bandits, and Bayesian optimization, with favorable scalability and calibration. The result is a unified framework for uncertainty-aware meta-learning that applies to both function-learning and sequential decision-making tasks.
Abstract
Neural Processes (NPs) are a popular class of approaches for meta-learning. Similar to Gaussian Processes (GPs), NPs define distributions over functions and can estimate uncertainty in their predictions. However, unlike GPs, NPs and their variants suffer from underfitting and often have intractable likelihoods, which limit their applications in sequential decision making. We propose Transformer Neural Processes (TNPs), a new member of the NP family that casts uncertainty-aware meta-learning as a sequence modeling problem. We learn TNPs via an autoregressive likelihood-based objective and instantiate it with a novel transformer-based architecture. The model architecture respects the inductive biases inherent to the problem structure, such as invariance to the observed data points and equivariance to the unobserved points. We further investigate knobs within the TNP framework that trade off expressivity of the decoding distribution against extra computation. Empirically, we show that TNPs achieve state-of-the-art performance on various benchmark problems, outperforming all previous NP variants on meta-regression, image completion, contextual multi-armed bandits, and Bayesian optimization.
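To make the autoregressive objective and the invariance to observed points concrete, the following is a minimal sketch, not the authors' implementation. It assumes a Gaussian predictive distribution and replaces the transformer with a hypothetical kernel-weighted predictor (`toy_predictive`); the function names, the length scale, and the toy data are all illustrative assumptions. Each target's log-density is conditioned on the context plus the targets already processed, and the per-target terms are summed.

```python
import math
import random


def toy_predictive(x_query, seen, length_scale=0.5):
    """Hypothetical stand-in for the transformer (an assumption, not TNP itself).

    Returns a Gaussian mean and standard deviation for x_query. The result
    depends on `seen` only through sums, so the order of `seen` does not matter.
    """
    weights = [math.exp(-0.5 * ((x_query - x) / length_scale) ** 2) for x, _ in seen]
    total = sum(weights)
    mean = sum(w * y for w, (_, y) in zip(weights, seen)) / (total + 1e-8)
    std = 0.1 + 1.0 / (1.0 + total)  # less nearby evidence gives a wider prediction
    return mean, std


def gaussian_log_pdf(y, mean, std):
    """Log-density of a univariate Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - 0.5 * ((y - mean) / std) ** 2


def autoregressive_log_likelihood(context, targets):
    """Sum of log p(y_i | context, earlier targets) over the targets, in order."""
    seen = list(context)
    total = 0.0
    for x_i, y_i in targets:
        mean, std = toy_predictive(x_i, seen)
        total += gaussian_log_pdf(y_i, mean, std)
        seen.append((x_i, y_i))  # this target becomes conditioning data for the next
    return total


# Toy data: noiseless samples of sin(x), split into context and target points.
random.seed(0)
points = [(x, math.sin(x)) for x in (random.uniform(-2, 2) for _ in range(8))]
context, targets = points[:5], points[5:]

ll = autoregressive_log_likelihood(context, targets)

# Shuffle the context to check invariance to the order of observed points.
shuffled = context[:]
random.shuffle(shuffled)
ll_shuffled = autoregressive_log_likelihood(shuffled, targets)

print(math.isclose(ll, ll_shuffled, rel_tol=1e-9, abs_tol=1e-9))
```

In this sketch the two log-likelihoods agree up to floating-point rounding because the stand-in predictor aggregates the conditioning set with sums. A transformer can obtain the same property by design, for example by omitting positional encodings over the context points, though that detail is an assumption here rather than something the abstract states.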
