Weighted Wasserstein Barycenter of Gaussian Processes for exotic Bayesian Optimization tasks
Antonio Candelieri, Francesco Archetti
TL;DR
This work introduces the Weighted Wasserstein Barycenter of Gaussian Processes (W2BGP) as a unifying approach to exotic Bayesian Optimization (BO) tasks, namely collaborative/federated BO, batch BO, and multi-fidelity BO (MFBO), in which only a simple weighting scheme over GP posteriors changes from task to task. It leverages the fact that a GP prediction is Gaussian at each input, so the Wasserstein Barycenter (WB) reduces to a tractable barycenter of univariate Gaussians, and standard acquisition functions such as LCB, PI, and EI can be reformulated under the WB framework. The paper examines several weighting strategies for each task: self-confident weighting often yields the best performance for collaboration, uncooperative weighting promotes batch diversity, and using fidelities as weights is not universally superior for MFBO. Empirical results on diverse test problems show that a single mechanism covers all three tasks, and the paper highlights both the re-interpretation of acquisition functions and the resulting computational gains. The work also points to future directions, including transfer-BO, scalable GPs, and adaptive weighting schemes, and provides open-source code and data for reproducibility.
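To make the tractability claim concrete, the standard closed form for the 2-Wasserstein barycenter of univariate Gaussians is sketched here. The notation is an assumption introduced for illustration and is not taken from the paper: K GPs with posterior mean \mu_k(x) and posterior standard deviation \sigma_k(x) at input x, and non-negative weights w_k that sum to 1.

\[
\bar{\mu}(x) = \sum_{k=1}^{K} w_k \, \mu_k(x),
\qquad
\bar{\sigma}(x) = \sum_{k=1}^{K} w_k \, \sigma_k(x),
\qquad
\text{barycenter at } x:\ \mathcal{N}\!\left(\bar{\mu}(x),\, \bar{\sigma}(x)^2\right).
\]

Note that the standard deviations are averaged, not the variances, which is what distinguishes this barycenter from a moment-matched mixture of the same Gaussians.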
Abstract
Exploiting the analogy between Gaussian distributions and the posterior of a Gaussian Process, we show how the weighted Wasserstein Barycenter of Gaussian Processes (W2BGP) can unify different exotic Bayesian Optimization (BO) tasks under a common framework. Specifically, this paper considers collaborative/federated BO, (synchronous) batch BO, and multi-fidelity BO. Our empirical analysis shows that each of these tasks requires only an appropriate weighting scheme for the W2BGP, while the rest of the framework remains unchanged. Moreover, we demonstrate that the most well-known BO acquisition functions can be easily re-interpreted under the proposed framework, and that they also enable a more computationally efficient computation of the Wasserstein Barycenter than state-of-the-art methods from the Machine Learning literature. Finally, we present research perspectives branching from the proposed approach.
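As a rough illustration of how an acquisition function might be evaluated on a barycenter of GP posteriors, the following minimal sketch combines pointwise Gaussian predictions and scores candidates with a Lower Confidence Bound. It assumes the standard closed form for univariate Gaussians (the barycenter mean and standard deviation are the weighted averages of the individual means and standard deviations). The function names `w2_barycenter_gaussians` and `lcb`, the parameter `beta`, and the numbers are hypothetical and do not come from the authors' code.

```python
import numpy as np


def w2_barycenter_gaussians(means, stds, weights):
    """Pointwise weighted 2-Wasserstein barycenter of univariate Gaussians.

    means, stds: array-like of shape (n_gps, n_points), the posterior mean
    and standard deviation of each GP at each candidate input.
    weights: array-like of shape (n_gps,), non-negative; normalized here.
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or w.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")
    w = w / w.sum()
    # Closed form for univariate Gaussians: average the means and the
    # standard deviations (not the variances) with the same weights.
    return w @ means, w @ stds


def lcb(mean, std, beta=2.0):
    """Lower Confidence Bound for minimization (smaller is more promising)."""
    return mean - beta * std


# Hypothetical posteriors of two GPs at three candidate inputs.
means = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
stds = [[1.0, 0.5, 0.2], [0.2, 0.5, 1.0]]

bary_mean, bary_std = w2_barycenter_gaussians(means, stds, weights=[0.75, 0.25])
scores = lcb(bary_mean, bary_std, beta=1.0)
next_idx = int(np.argmin(scores))  # index of the candidate to evaluate next
```

Changing only the `weights` argument is what would distinguish one task from another in this sketch; how the weights are actually chosen for each task is described in the paper and is not reproduced here.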
