Provable Accelerated Bayesian Optimization with Knowledge Transfer

Haitao Lin; Boxin Zhao; Mladen Kolar; Chong Liu

TL;DR

DeltaBO accelerates Bayesian optimization (BO) on a target task by transferring knowledge from related source tasks. It models the target function as $f = g + \delta$, where $g$ and $\delta$ are drawn from independent Gaussian processes, and uses the source posterior to form unbiased, noisy observations of $\delta$, which yields an acquisition function with improved uncertainty quantification. The regret bound is $\tilde{\mathcal{O}}(\sqrt{T(T/N + \gamma_\delta)})$, so the gains appear when source data is plentiful ($N \gg T$) and the difference function is easier to learn than the target ($\gamma_\delta \ll \gamma_f$). Experiments on real AutoML tasks and synthetic benchmarks show DeltaBO outperforming baselines, consistent with the theory.
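The idea of observing $\delta$ through the source posterior can be illustrated with a small NumPy sketch. This is not the authors' implementation: the RBF kernel, the 1-D toy functions `g` and `delta`, the known noise variance, and every function name below are assumptions chosen for illustration. The sketch fits a GP to plentiful source data, then subtracts the source posterior mean from each noisy target evaluation to obtain a pseudo-observation of $\delta$, with the source posterior variance added to the observation noise.

```python
import numpy as np

def rbf(a, b, length_scale=0.2):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_var, length_scale=0.2):
    """Posterior mean and variance of a zero-mean, unit-variance GP at x_query."""
    K = rbf(x_train, x_train, length_scale) + noise_var * np.eye(len(x_train))
    K_s = rbf(x_train, x_query, length_scale)
    solved = np.linalg.solve(K, K_s)            # K^{-1} K_s
    mean = solved.T @ y_train
    var = 1.0 - np.sum(K_s * solved, axis=0)    # prior variance k(x, x) = 1
    return mean, np.maximum(var, 0.0)

def delta_observations(x_src, y_src, x_tgt, y_tgt, noise_var):
    """Pseudo-observations of delta = f - g and their effective noise variance."""
    mu_g, var_g = gp_posterior(x_src, y_src, x_tgt, noise_var)
    # Residual against the source posterior mean; its noise combines the
    # target observation noise with the remaining uncertainty about g.
    return y_tgt - mu_g, noise_var + var_g

# Toy setup (assumed, not from the paper): f = g + delta, with N >> T.
def g(x):
    return np.sin(6.0 * x)

def delta(x):
    return 0.3 * x

rng = np.random.default_rng(0)
noise_std = 0.05
noise_var = noise_std ** 2
x_src = rng.uniform(0.0, 1.0, size=200)                       # N = 200
y_src = g(x_src) + rng.normal(0.0, noise_std, size=200)
x_tgt = rng.uniform(0.0, 1.0, size=5)                         # T = 5
y_tgt = g(x_tgt) + delta(x_tgt) + rng.normal(0.0, noise_std, size=5)

resid, resid_var = delta_observations(x_src, y_src, x_tgt, y_tgt, noise_var)
print(np.round(resid - delta(x_tgt), 3))  # errors of the delta pseudo-observations
```

A second GP could then be fitted to `resid` with per-point noise `resid_var`. When the source data is dense, `resid_var` stays close to the raw observation noise, which is the intuition behind the $T/N$ term in the bound.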

Abstract

We study how Bayesian optimization (BO) can be accelerated on a target task with historical knowledge transferred from related source tasks. Existing works on BO with knowledge transfer either do not have theoretical guarantees or achieve the same regret as BO in the non-transfer setting, $\tilde{\mathcal{O}}(\sqrt{T \gamma_f})$, where $T$ is the number of evaluations of the target function and $\gamma_f$ denotes its information gain. In this paper, we propose the DeltaBO algorithm, in which a novel uncertainty-quantification approach is built on the difference function $\delta$ between the source and target functions, which are allowed to belong to different reproducing kernel Hilbert spaces (RKHSs). Under mild assumptions, we prove that the regret of DeltaBO is of order $\tilde{\mathcal{O}}(\sqrt{T (T/N + \gamma_\delta)})$, where $N$ denotes the number of evaluations from source tasks and typically $N \gg T$. In many applications, source and target tasks are similar, which implies that $\gamma_\delta$ can be much smaller than $\gamma_f$. Empirical studies on both real-world hyperparameter tuning tasks and synthetic functions show that DeltaBO outperforms other baseline methods and support our theoretical claims.
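To make the comparison between the two rates concrete, the short sketch below evaluates both expressions with logarithmic factors and constants dropped. The values of `T`, `N`, `gamma_f`, and `gamma_delta` are hypothetical and chosen only to illustrate the regime $N \gg T$, $\gamma_\delta \ll \gamma_f$; they are not results from the paper.

```python
import math

def standard_bo_rate(T, gamma_f):
    """sqrt(T * gamma_f): the non-transfer rate, ignoring log factors."""
    return math.sqrt(T * gamma_f)

def deltabo_rate(T, N, gamma_delta):
    """sqrt(T * (T/N + gamma_delta)): the DeltaBO rate, ignoring log factors."""
    return math.sqrt(T * (T / N + gamma_delta))

# Hypothetical values for illustration only.
T, N = 100, 10_000
gamma_f, gamma_delta = 50.0, 5.0

baseline = standard_bo_rate(T, gamma_f)
transfer = deltabo_rate(T, N, gamma_delta)
print(f"standard BO rate: {baseline:.2f}, DeltaBO rate: {transfer:.2f}")
```

Comparing the two expressions directly, the DeltaBO rate is the smaller one exactly when $T/N + \gamma_\delta < \gamma_f$, so scarce source data (large $T/N$) can cancel the benefit of a simple difference function.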
