
Data-driven Prior Learning for Bayesian Optimisation

Sigrid Passano Hellan, Christopher G. Lucas, Nigel H. Goddard

TL;DR

PLeBO and prior transfer find good inputs in fewer evaluations. Learning priors for the hyperparameters of the Gaussian process surrogate model gives a better approximation of the underlying function, especially when only a few function evaluations are available.

Abstract

Transfer learning for Bayesian optimisation has generally assumed a strong similarity between optimisation tasks, with at least a subset having similar optimal inputs. This assumption can reduce computational costs, but it is violated in a wide range of optimisation problems where transfer learning may nonetheless be useful. We replace this assumption with a weaker one only requiring the shape of the optimisation landscape to be similar, and analyse the recent method Prior Learning for Bayesian Optimisation (PLeBO) in this setting. By learning priors for the hyperparameters of the Gaussian process surrogate model we can better approximate the underlying function, especially for few function evaluations. We validate the learned priors and compare to a breadth of transfer learning approaches, using synthetic data and a recent air pollution optimisation problem as benchmarks. We show that PLeBO and prior transfer find good inputs in fewer evaluations.
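
To make the idea of prior transfer concrete, the following is a minimal sketch and not the authors' implementation. PLeBO itself uses hierarchical inference (see Figure 2), whereas this sketch swaps in simple point estimates. It assumes 1-D inputs, a squared-exponential kernel with unit signal variance, a single lengthscale hyperparameter, and a log-normal prior fitted to per-task estimates. All function names (`fit_lengthscale`, `learn_lognormal_prior`, `sample_task`) and all numeric settings are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(xa, xb, lengthscale):
    """Squared-exponential kernel for 1-D inputs with unit signal variance."""
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def log_marginal_likelihood(x, y, lengthscale, noise=1e-2):
    """GP log marginal likelihood of y at inputs x for a given lengthscale."""
    K = rbf_kernel(x, x, lengthscale) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2 * np.pi)

def fit_lengthscale(x, y, grid, log_prior=None):
    """Grid search: maximum likelihood if log_prior is None, otherwise MAP."""
    scores = np.array([log_marginal_likelihood(x, y, l) for l in grid])
    if log_prior is not None:
        scores = scores + log_prior(grid)
    return grid[np.argmax(scores)]

def learn_lognormal_prior(lengthscales):
    """Fit a log-normal to per-task lengthscale estimates; return its log-density."""
    logs = np.log(lengthscales)
    mu, sigma = logs.mean(), max(logs.std(), 0.1)  # floor avoids a degenerate prior
    def log_prior(l):
        return -np.log(l * sigma * np.sqrt(2 * np.pi)) - 0.5 * ((np.log(l) - mu) / sigma) ** 2
    return log_prior

rng = np.random.default_rng(0)
grid = np.geomspace(0.05, 2.0, 40)  # candidate lengthscales (assumed range)

def sample_task(n_points, lengthscale):
    """Draw noisy observations of a random GP function on [0, 1]."""
    x = rng.uniform(0.0, 1.0, n_points)
    K = rbf_kernel(x, x, lengthscale) + 1e-2 * np.eye(n_points)
    y = np.linalg.cholesky(K) @ rng.standard_normal(n_points)
    return x, y

# Past tasks share a shape (lengthscale) but are independent draws,
# so their optima generally lie in different places.
past_estimates = np.array([
    fit_lengthscale(*sample_task(30, 0.2), grid) for _ in range(8)
])
log_prior = learn_lognormal_prior(past_estimates)

# New task with only four evaluations: compare ML with MAP under the learned prior.
x_new, y_new = sample_task(4, 0.2)
ml_estimate = fit_lengthscale(x_new, y_new, grid)
map_estimate = fit_lengthscale(x_new, y_new, grid, log_prior)
print("ML:", ml_estimate, "MAP with learned prior:", map_estimate)
```

With so few observations the likelihood alone constrains the lengthscale only weakly, so the learned prior pulls the estimate towards values that were plausible on the past tasks. No information about where the past optima were located is transferred.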

Paper Structure

This paper contains 10 sections, 1 equation, 5 figures, 2 tables, 1 algorithm.

Figures (5)

  • Figure 1: Comparison of direct and prior transfer. In the top row the past and new tasks share optima but have different shapes. In the bottom row they instead have the same shape. Direct transfer works well for shared optima, prior transfer for shared shapes.
  • Figure 2: Hierarchical structure and inference in PLeBO.
  • Figure 3: Example optimisation tasks. The left column shows the standard assumption that optima are near each other. Stars indicate optima and blue crosses missing values.
  • Figure 4: Performance compared to PLeBO at same iteration (above zero is improvement), mean $\pm$ one standard error. $J$ is the number of test tasks. The top row compares PLeBO to no transfer, the middle to direct transfer and the bottom row to prior transfer methods.
  • Figure 5: Comparing true and inferred hyperparameter priors for the synthetic benchmark.