Collaborative Bayesian Optimization via Wasserstein Barycenters

Donglin Zhan, Haoting Zhang, Rhonda Righter, Zeyu Zheng, James Anderson

TL;DR

This work addresses black-box optimization under data privacy by enabling collaboration among $N$ agents who share GP surrogates with a central server. The central model is constructed as a Wasserstein barycenter of local GPs, preserving privacy while providing explicit uncertainty and enabling a collaborative acquisition called Collaborative Knowledge Gradient (Co-KG) that blends central and local guidance. The authors prove asymptotic consistency of Co-KG and its MC-based approximation, and demonstrate through experiments that Co-KG outperforms non-collaborative baselines and is competitive with privacy-unrestricted centralized approaches. The framework shows practical promise for privacy-conscious distributed optimization in engineering and ML settings, with guidance on hyperparameters and discretization trade-offs.

Abstract

Motivated by the growing need for black-box optimization and data privacy, we introduce a collaborative Bayesian optimization (BO) framework that addresses both of these challenges. In this framework, agents work collaboratively to optimize a function to which they have only oracle access. To accommodate communication and privacy constraints, agents may not share their data but may share their Gaussian process (GP) surrogate models. To enable collaboration under these constraints, we construct a central model that approximates the objective function by leveraging Wasserstein barycenters of GPs. This central model integrates the shared models without accessing the underlying data. A key aspect of our approach is a collaborative acquisition function that balances exploration and exploitation, so that the decision variables are optimized collaboratively in each iteration. We prove that our proposed algorithm is asymptotically consistent and that its implementation via Monte Carlo methods is numerically accurate. Through numerical experiments, we demonstrate that our approach outperforms other baseline collaborative frameworks and is competitive with centralized approaches that do not consider data privacy.

Paper Structure

This paper contains 14 sections, 8 theorems, 63 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Proposition 1

Let $\left\{\tilde{f}_n\right\}_{n=1}^N$ be a set of GPs with $\tilde{f}_n \sim \mathcal{GP}\left(\tilde{\mu}_n\left(x\right),\tilde{K}_n\left(x,x'\right)\right)$. There exists a unique barycenter $f^c \sim \mathcal{GP} \left(\mu^c(x), K^c\left(x,x'\right)\right)$ defined as in (eq.c1). If $f^c$ is and where $\Phi_{K}$ denotes the operator that is associated with the kernel function $K\left(x,x'\right)$ ...
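The statement above is cut off in this summary, so the following is only a minimal sketch of the finite-dimensional analogue, not the paper's implementation. It assumes each local GP has been evaluated on a shared grid, giving a mean vector and a positive-definite covariance matrix per agent, and it computes the 2-Wasserstein barycenter of the resulting Gaussians. The barycenter mean is the weighted average of the means; the covariance is found with a standard fixed-point iteration from the Gaussian barycenter literature. The names `psd_sqrt` and `gaussian_barycenter`, the equal default weights, and the stopping rule are illustrative assumptions.

```python
import numpy as np


def psd_sqrt(a):
    """Symmetric square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh((a + a.T) / 2.0)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_barycenter(means, covs, weights=None, iters=100, tol=1e-10):
    """2-Wasserstein barycenter of Gaussians N(means[n], covs[n]).

    Assumes every covariance is positive definite (hypothetical helper,
    not taken from the paper). Returns (mean_c, cov_c).
    """
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)
    n = len(means)
    w = np.full(n, 1.0 / n) if weights is None else np.asarray(weights, dtype=float)

    # The barycenter mean is the weighted average of the local means.
    mean_c = np.einsum("n,nd->d", w, means)

    # Start from the weighted arithmetic mean of covariances (positive definite).
    cov_c = np.einsum("n,nij->ij", w, covs)
    for _ in range(iters):
        root = psd_sqrt(cov_c)
        root_inv = np.linalg.inv(root)
        # inner = sum_n w_n (S^{1/2} K_n S^{1/2})^{1/2}
        inner = sum(wn * psd_sqrt(root @ k @ root) for wn, k in zip(w, covs))
        # Fixed-point update: S <- S^{-1/2} inner^2 S^{-1/2}
        new_cov = root_inv @ inner @ inner @ root_inv
        converged = np.linalg.norm(new_cov - cov_c) < tol
        cov_c = new_cov
        if converged:
            break
    return mean_c, cov_c


if __name__ == "__main__":
    # Two agents with diagonal (commuting) covariances on a 2-point grid.
    mean_c, cov_c = gaussian_barycenter(
        [[0.0, 2.0], [2.0, 4.0]],
        [np.diag([1.0, 4.0]), np.diag([9.0, 16.0])],
    )
    print(mean_c)
    print(cov_c)
```

When the covariances commute, as in the diagonal example, the barycenter's standard deviations are the weighted averages of the local standard deviations, which gives a quick sanity check. Only means and covariances enter the computation, which matches the setting in which agents share surrogate models rather than raw observations.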

Figures (8)

  • Figure 1: Optimal value differences with different acquisition functions on the black-box objective function $f_1(x)$.
  • Figure 2: Optimal value differences with different acquisition functions on the black-box objective function $f_2(x)$.
  • Figure 3: Optimal value difference in iterations with different selections of $\beta_t$ on $f_{1}(x)$.
  • Figure 4: Optimal value difference in iterations with different selections of $\beta_t$ on $f_{2}(x)$.
  • Figure 5: Loss function values in iterations with compared BO approaches on Breast Cancer Dataset.
  • ...and 3 more figures

Theorems & Definitions (12)

  • Proposition 1: mallasto2017learning
  • Theorem 2
  • Theorem 3
  • proof
  • Proposition 4: Proposition 1 of ding2022knowledge
  • Lemma 5: Lemma 6 of ding2022knowledge
  • Proposition 6: Proposition 2.9 in 10.3150/18-BEJ1074
  • Lemma 7
  • proof
  • Lemma 8
  • ...and 2 more