Batch Active Learning in Gaussian Process Regression using Derivatives

Hon Sum Alec Yu, Christoph Zimmer, Duy Nguyen-Tuong

TL;DR

This work investigates the use of derivative information for Batch Active Learning in Gaussian Process regression models and employs the predictive covariance matrix for selection of data batches to exploit full correlation of samples.

Abstract

We investigate the use of derivative information for Batch Active Learning in Gaussian Process regression models. The proposed approach employs the predictive covariance matrix for selection of data batches to exploit full correlation of samples. We theoretically analyse our proposed algorithm taking different optimality criteria into consideration and provide empirical comparisons highlighting the advantage of incorporating derivative information. Our results show the effectiveness of our approach across diverse applications.

Paper Structure (13 sections, 4 theorems, 15 equations, 7 figures, 1 algorithm)

Key Result

Proposition

Given the same set of points $\mathbf{x}_{1:n}$, the Information Gain under BALGPD is no less than that under BALGP.
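
The inequality can be checked numerically in a small setting. The sketch below is illustrative and is not the paper's implementation: it assumes a 1D squared-exponential kernel, the common definition of information gain $\frac{1}{2}\log\det(I + \sigma^{-2}K)$, and the same noise variance for function and derivative observations. The helper names, input points and hyperparameters are hypothetical.

```python
import numpy as np


def rbf_blocks(x, lengthscale=1.0):
    """Covariance blocks of a 1D squared-exponential GP and its derivative."""
    d = x[:, None] - x[None, :]                      # pairwise differences x_i - x_j
    k_ff = np.exp(-0.5 * d**2 / lengthscale**2)      # Cov(f(x_i), f(x_j))
    k_fd = k_ff * d / lengthscale**2                 # Cov(f(x_i), f'(x_j))
    k_dd = k_ff * (1.0 / lengthscale**2 - d**2 / lengthscale**4)  # Cov(f'(x_i), f'(x_j))
    return k_ff, k_fd, k_dd


def information_gain(cov, noise_var):
    """0.5 * log det(I + cov / noise_var), computed stably via slogdet."""
    n = cov.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(n) + cov / noise_var)
    return 0.5 * logdet


# Assumed setup: six input points and a shared noise variance (illustrative values).
x = np.linspace(-2.0, 2.0, 6)
noise_var = 0.01

k_ff, k_fd, k_dd = rbf_blocks(x)

# Joint covariance of [f(x_1..n), f'(x_1..n)]; k_ff is its leading principal block.
joint = np.block([[k_ff, k_fd], [k_fd.T, k_dd]])

ig_function_only = information_gain(k_ff, noise_var)
ig_with_derivatives = information_gain(joint, noise_var)

print(f"function values only: {ig_function_only:.3f}")
print(f"with derivatives:     {ig_with_derivatives:.3f}")
```

Under these assumptions the gap between the two values is the log-determinant of a Schur complement of the form $I + \sigma^{-2}\Sigma$, where $\Sigma$ is the posterior covariance of the derivatives given the noisy function values. Since $\Sigma$ is positive semi-definite, the gap is non-negative, which matches the direction of the proposition.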

Figures (7)

  • Figure 1: A typical exploration behaviour under BALGPD (red) and BALGP (blue) for the 2D-cardinal sine function. From left to right, the diagrams show the two GP models after 8, 18 and 28 points have been explored. This sample is taken under A-optimality, but the diagrams are similar for the other optimality criteria.
  • Figure 2: Simulated function: The three diagrams present the RMSE attained under BALGPD (red), BALGP (blue) and no BAL (yellow) as more points are explored. Results are based on D- (left), A- (middle) and E-optimality (right) respectively. The experiment was repeated 30 times.
  • Figure 3: A demonstration of a High Pressure Fuel Injection System. At step $i$, the Rail-Pressure (output) $\psi_i$ is given by the Actuation $u_i$ and Engine Speed $\upsilon_i$ at current time-step $i$ and some steps before.
  • Figure 4: High Pressure Fuel Supply System: The three diagrams plot the RMSE attained under BALGPD (red), BALGP (blue) and random exploration (yellow) as more points are explored, for D- (left), A- (middle) and E-optimality (right). The experiment was repeated 40 times. Further discussion is in the supplementary materials.
  • Figure 5: Ground truth of the sub-data, from https://shop.swisstopo.admin.ch/en/.
  • ...and 2 more figures

Theorems & Definitions (11)

  • Definition
  • Remark
  • Proposition
  • Proof (sketch)
  • Proposition
  • Proof (sketch)
  • Remark
  • Theorem
  • Proof (sketch)
  • Theorem
  • ...and 1 more