Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization

Anthony Bardou, Patrick Thiran, Thomas Begin

TL;DR

This paper tackles high-dimensional Bayesian Optimization (BO) by relaxing the constraint, common in additive decompositions, that the Maximum Factor Size (MFS) be low. It introduces DuMBO, a decentralized, message-passing BO algorithm that models $f$ as a sum of independent factor GPs over a factor graph and maximizes a tighter, decentralized GP-UCB-style acquisition function using ADMM to enforce consistency. Theoretical results establish asymptotic no-regret performance, with a high-probability immediate regret bound and a KL-based convergence argument. Empirically, DuMBO matches or surpasses state-of-the-art methods on synthetic and real-world tasks, especially when the true objective has a large MFS or an unknown additive structure, indicating that it scales to high-dimensional, expensive optimization problems.

Abstract

Bayesian Optimization (BO) is typically used to optimize an unknown function $f$ that is noisy and costly to evaluate, by exploiting an acquisition function that must be maximized at each optimization step. Although provably asymptotically optimal BO algorithms are efficient at optimizing low-dimensional functions, scaling them to high-dimensional spaces remains an open problem, often tackled by assuming an additive structure for $f$. In doing so, BO algorithms typically introduce additional restrictive assumptions on the additive structure that narrow their domain of applicability. This paper makes two main contributions: (i) we relax the restrictive assumptions on the additive structure of $f$ without weakening the maximization guarantees of the acquisition function, and (ii) we address the over-exploration problem for decentralized BO algorithms. To these ends, we propose DuMBO, an asymptotically optimal decentralized BO algorithm whose performance is highly competitive with state-of-the-art BO algorithms, especially when the additive structure of $f$ comprises high-dimensional factors.
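To make the additive-structure assumption concrete, here is a minimal Python sketch of a function written as a sum of factors, each depending on a subset of the variables. The decomposition `FACTORS`, the toy factor `factor_value`, and the helper names are hypothetical illustrations and are not taken from the paper.

```python
import numpy as np

# Hypothetical decomposition: factor i depends on the variable indices in FACTORS[i].
FACTORS = [(0, 1), (1, 2, 3), (3,)]

def factor_value(i, x_sub):
    """Toy factor f^(i); a stand-in for the unknown factor functions."""
    return -float(np.sum((x_sub - 0.1 * (i + 1)) ** 2))

def additive_f(x, factors=FACTORS):
    """Evaluate f(x) as the sum of the factor values on the sub-vectors x_{V_i}."""
    x = np.asarray(x, dtype=float)
    return sum(factor_value(i, x[list(v)]) for i, v in enumerate(factors))

def max_factor_size(factors=FACTORS):
    """Maximum Factor Size (MFS): the largest number of variables in any factor."""
    return max(len(v) for v in factors)
```

In this toy decomposition the MFS is 3, set by the second factor. Methods that require a low MFS would restrict how large such a factor may be, which is the kind of constraint the paper sets out to relax.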

Paper Structure

This paper contains 34 sections, 10 theorems, 57 equations, 7 figures, 2 tables, 1 algorithm.

Key Result

Proposition 3.4

Let $\mu^{(i)}_{t+1}(\bm x_{\mathcal{V}_i})$ and $(\sigma_{t+1}^{(i)}(\bm x_{\mathcal{V}_i}))^2$ be the posterior mean and variance of $f^{(i)}$ at input $\bm x_{\mathcal{V}_i}$, respectively. Then, for the decomposition (eq:dec), with $t \times 1$ vectors $\bm k_{\bm x_{\mathcal{V}_i}}^{(i)} = (k^{(i)}(\bm x_{\mathcal{V}_i}, \bm x^j_{\mathcal{V}_i}))_{j \in \llbracket1, t\rrbracket}$, $t \times$ ...
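The statement above is cut off in this summary, so the following Python sketch does not reproduce the proposition. It only illustrates, for intuition, the standard way to condition a single factor of an additive GP on noisy observations of the sum $f$. The RBF kernel, the noise variance `noise`, and all function names are assumptions made for the example.

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel between row-stacked inputs a (m x p) and b (n x p)."""
    sq = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def factor_posterior(i, x, X, y, factors, noise=1e-2):
    """Posterior mean and variance of factor i at x, given t noisy observations y of the sum."""
    # Gram matrix of the additive kernel: each factor kernel acts on its own variables.
    K = sum(rbf(X[:, list(v)], X[:, list(v)]) for v in factors)
    K_noisy = K + noise * np.eye(len(y))
    v_i = list(factors[i])
    # t-vector of factor-i covariances between x_{V_i} and the observed inputs.
    k_i = rbf(X[:, v_i], x[v_i][None, :])[:, 0]
    mean = k_i @ np.linalg.solve(K_noisy, y)
    # The prior variance k^(i)(x, x) equals 1 for this RBF kernel.
    var = 1.0 - k_i @ np.linalg.solve(K_noisy, k_i)
    return mean, var
```

Because only the sum is observed, the variance of a single factor generally stays above zero even at an input that has already been evaluated.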

Figures (7)

  • Figure 1: Performance achieved by the BO algorithms listed in Section \ref{sec:num_res} for (a) the 24d Powell synthetic function, (b) the optimization of the Shannon capacity in a WLAN and (c) the trajectory planning of a rover. The shaded areas indicate the standard error intervals.
  • Figure 2: The factor graph of the decomposition (\ref{eq:decomposition_example}). The factor nodes and variable nodes are depicted with squares and circles, respectively. In this decomposition, there are $n = 4$ factors and $d = 3$ variables.
  • Figure 3: Message passing in the factor graph. (Left) The factor nodes compute their variance terms and send them to their variable nodes. Thus, each variable node $j$ receives the variance terms of the factor nodes in $\mathcal{F}_j$. (Right) The variable nodes send all the collected variance terms to each of their factor nodes. Thus, each factor node $i$ receives the variance terms of the factor nodes in $\mathcal{N}_i = \cup_{j \in \mathcal{V}_i} \mathcal{F}_j$.
  • Figure 4: Performance achieved by the studied BO algorithms for (a) the 2d Six-Hump Camel function, (b) the 6d Hartmann function and (c) the 100d Rastrigin function. The shaded areas indicate the standard error intervals.
  • Figure 5: Performance of the studied BO algorithms on the cosmological constants fine-tuning problem.
  • ...and 2 more figures
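
The sets named in the Figure 3 caption can be computed directly from the factor-to-variable assignment. The Python sketch below uses a hypothetical factor graph with $n = 4$ factors and $d = 3$ variables; its edges are an assumption for illustration and are not the decomposition shown in Figure 2.

```python
# Hypothetical factor graph with n = 4 factors over d = 3 variables:
# V[i] is the set of variable indices that factor i depends on.
V = {0: {0}, 1: {0, 1}, 2: {1, 2}, 3: {2}}

def factors_of_variables(V):
    """F[j]: the set of factor nodes connected to variable node j."""
    F = {}
    for i, variables in V.items():
        for j in variables:
            F.setdefault(j, set()).add(i)
    return F

def factor_neighbourhoods(V):
    """N[i]: union of F[j] over j in V[i], the factors whose variance terms reach factor i."""
    F = factors_of_variables(V)
    return {i: set().union(*(F[j] for j in variables)) for i, variables in V.items()}

F = factors_of_variables(V)
N = factor_neighbourhoods(V)
```

Each factor exchanges variance terms only with the factors in its own neighbourhood `N[i]`, which is what keeps the message passing local to the factor graph.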

Theorems & Definitions (18)

  • Proposition 3.4
  • Theorem 3.5
  • Theorem 5.1
  • Theorem 5.2
  • Corollary 5.3
  • Lemma D.1
  • proof
  • Lemma D.2
  • proof
  • Definition F.1: Restricted Prox-Regularity [admm_conv]
  • ...and 8 more