Amortized Active Learning for Nonparametric Functions
Cen-You Li, Marc Toussaint, Barbara Rakitsch, Christoph Zimmer
TL;DR
This work tackles efficient active learning (AL) for nonparametric regression by introducing an amortized learning framework. A neural policy is trained offline on tasks simulated from a Gaussian process (GP) prior to propose the next query points, enabling zero-shot, real-time AL deployment without iterative GP updates or acquisition optimization. The authors formulate differentiable AL objectives under a GP prior, employ Fourier features to approximate GP samples, and demonstrate that nonmyopic amortized AL achieves accuracy comparable to traditional GP-based AL while substantially reducing query time across multiple benchmarks and real datasets. The approach delivers practical speedups for low-data regression tasks and showcases robust generalization across 1D and 2D problems with real-world data.
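To make the Fourier-feature step concrete, the following is a minimal sketch of drawing an approximate GP prior sample with random Fourier features. It assumes an RBF kernel and NumPy; the function name `sample_gp_prior_rff` and all hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def sample_gp_prior_rff(num_features, input_dim, lengthscale, variance, rng):
    """Return a function f(x) that approximates one sample from a
    zero-mean GP prior with an RBF kernel (assumed kernel choice)."""
    # The spectral density of the RBF kernel is Gaussian with std 1/lengthscale.
    omega = rng.normal(0.0, 1.0 / lengthscale, size=(num_features, input_dim))
    phase = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    # Standard-normal weights turn the feature map into a random function.
    weights = rng.normal(0.0, 1.0, size=num_features)
    scale = np.sqrt(2.0 * variance / num_features)

    def f(x):
        # x is expected to have shape (n, input_dim).
        x = np.asarray(x, dtype=float).reshape(-1, input_dim)
        features = scale * np.cos(x @ omega.T + phase)
        return features @ weights

    return f

rng = np.random.default_rng(0)
f = sample_gp_prior_rff(num_features=200, input_dim=1,
                        lengthscale=0.3, variance=1.0, rng=rng)
x_grid = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
y_grid = f(x_grid)  # one simulated function evaluated on a grid
```

Because each sample is a finite weighted sum of cosines, it can be evaluated at arbitrary inputs without any matrix factorization, which is what makes such samples convenient for simulating many training tasks.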
Abstract
Active learning (AL) is a sequential learning scheme that aims to select the most informative data. AL reduces data consumption and avoids the cost of labeling large amounts of data. However, AL retrains the model and solves an acquisition optimization for each selection, which becomes expensive when either the model training or the acquisition optimization is challenging. In this paper, we focus on active nonparametric function learning, where the gold-standard Gaussian process (GP) approaches suffer from cubic time complexity in the number of data points. We propose an amortized AL method, where new data are suggested by a neural network that is trained up-front without any real data (Figure 1). Our method avoids repeated model training and requires no acquisition optimization during AL deployment. We (i) utilize GPs as function priors to construct an AL simulator, (ii) train an AL policy that can zero-shot generalize from simulation to real learning problems of nonparametric functions, and (iii) achieve real-time data selection and learning performance comparable to time-consuming baseline methods.
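For contrast, the per-query cost that amortization avoids can be sketched with a conventional GP AL loop. This is only an illustrative baseline under stated assumptions, not the authors' method: it uses an RBF kernel with fixed hyperparameters, a maximum-predictive-variance acquisition, and a finite candidate grid in place of continuous acquisition optimization. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2, variance=1.0):
    # Pairwise squared distances between rows of a and rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return variance * np.exp(-0.5 * np.maximum(sq, 0.0) / lengthscale**2)

def posterior_variance(x_train, x_cand, noise=1e-2, variance=1.0):
    K = rbf_kernel(x_train, x_train, variance=variance) + noise * np.eye(len(x_train))
    L = np.linalg.cholesky(K)  # cubic in the number of training points
    v = np.linalg.solve(L, rbf_kernel(x_train, x_cand, variance=variance))
    # Prior variance minus the reduction explained by the training data.
    return variance - np.sum(v**2, axis=0)

def gp_active_learning(oracle, x_init, x_cand, num_queries):
    x_train = x_init.copy()
    y_train = oracle(x_train)
    for _ in range(num_queries):
        # The factorization is recomputed at every step as the data set grows.
        var = posterior_variance(x_train, x_cand)
        x_new = x_cand[np.argmax(var)][None, :]
        x_train = np.vstack([x_train, x_new])
        y_train = np.concatenate([y_train, oracle(x_new)])
    return x_train, y_train

oracle = lambda x: np.sin(6.0 * x[:, 0])  # stand-in for an expensive labeling process
x_cand = np.linspace(0.0, 1.0, 101).reshape(-1, 1)
x_train, y_train = gp_active_learning(oracle, np.array([[0.5]]), x_cand, num_queries=5)
```

In the amortized setting described above, the body of the loop would instead be a single forward pass of the pretrained policy network, with no factorization or acquisition search at deployment time.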
