Relaxing the Additivity Constraints in Decentralized No-Regret High-Dimensional Bayesian Optimization

Anthony Bardou; Patrick Thiran; Thomas Begin

TL;DR

This paper tackles high-dimensional Bayesian Optimization (BO) by relaxing the common requirement that additive decompositions have a low Maximum Factor Size (MFS). It introduces DuMBO, a decentralized, message-passing BO algorithm that models $f$ as a sum of independent factor GPs over a factor graph. DuMBO uses a tighter, decentralized GP-UCB-style acquisition function and relies on ADMM to enforce consistency. Theoretical results establish asymptotic no-regret performance, supported by a high-probability bound on the immediate regret and a KL-based convergence argument. Empirically, DuMBO matches or surpasses state-of-the-art methods on synthetic and real-world tasks, especially when the true objective has a large MFS or an unknown additive structure, which makes it well suited to high-dimensional, expensive optimization problems.
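For orientation, the additive model described above can be written out as follows. This is a sketch in the notation used by the proposition and figure captions further down ($n$ factors, $d$ variables, factor $i$ depending on the variable subset $\mathcal{V}_i$). The exact form of the paper's own equation, and the reading of MFS as the largest $|\mathcal{V}_i|$, are assumptions made here.

```latex
\begin{align*}
  f(\bm x) &= \sum_{i=1}^{n} f^{(i)}\!\left(\bm x_{\mathcal{V}_i}\right),
  \qquad \mathcal{V}_i \subseteq \{1, \dots, d\}, \\
  \mathrm{MFS} &= \max_{i \in \llbracket 1, n \rrbracket} \left|\mathcal{V}_i\right|.
\end{align*}
```

Under this reading, a low-MFS assumption forces every factor to depend on only a few variables, whereas relaxing it allows some $\mathcal{V}_i$ to be large.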

Abstract

Bayesian Optimization (BO) is typically used to optimize an unknown function $f$ that is noisy and costly to evaluate, by exploiting an acquisition function that must be maximized at each optimization step. Even if provably asymptotically optimal BO algorithms are efficient at optimizing low-dimensional functions, scaling them to high-dimensional spaces remains an open problem, often tackled by assuming an additive structure for $f$. By doing so, BO algorithms typically introduce additional restrictive assumptions on the additive structure that reduce their applicability domain. This paper contains two main contributions: (i) we relax the restrictive assumptions on the additive structure of $f$ without weakening the maximization guarantees of the acquisition function, and (ii) we address the over-exploration problem for decentralized BO algorithms. To these ends, we propose DuMBO, an asymptotically optimal decentralized BO algorithm that achieves very competitive performance against state-of-the-art BO algorithms, especially when the additive structure of $f$ comprises high-dimensional factors.

Paper Structure (34 sections, 10 theorems, 57 equations, 7 figures, 2 tables, 1 algorithm)

Introduction
Background
State of the Art
DuMBO (Decentralized Message-passing Bayesian Optimization algorithm)
Problem Formulation and First Results
Core Assumptions
Inference Formulas
Proposed Acquisition Function
DuMBO
Asymptotic Optimality
Performance Experiments
Optimizing Synthetic Functions
Solving Real-World Problems
Conclusion
Factor Graph
...and 19 more sections

Key Result

Proposition 3.4

Let $\mu^{(i)}_{t+1}(\bm x_{\mathcal{V}_i})$ and $(\sigma_{t+1}^{(i)}(\bm x_{\mathcal{V}_i}))^2$ be the posterior mean and variance of $f^{(i)}$ at input $\bm x_{\mathcal{V}_i}$, respectively. Then, for the decomposition (eq:dec), with $t \times 1$ vectors $\bm k_{\bm x_{\mathcal{V}_i}}^{(i)} = (k^{(i)}(\bm x_{\mathcal{V}_i}, \bm x^j_{\mathcal{V}_i}))_{j \in \llbracket1, t\rrbracket}$, $t \times
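As a rough illustration of the kind of per-factor inference this proposition refers to, here is a minimal NumPy sketch of the standard additive-GP posterior: each factor's mean is $\bm k^{(i)\top}(\bm K + \sigma^2 \bm I)^{-1}\bm y$ and its variance is $k^{(i)}(\bm x, \bm x) - \bm k^{(i)\top}(\bm K + \sigma^2 \bm I)^{-1}\bm k^{(i)}$, where $\bm K$ sums the factor kernel matrices. The RBF kernel, the unit lengthscale, the noise level, and the function names `rbf` and `factor_posteriors` are all assumptions chosen for illustration. The sketch is not claimed to reproduce Proposition 3.4 term for term.

```python
import numpy as np


def rbf(a, b, lengthscale=1.0):
    """Unit-variance RBF kernel matrix between the rows of a and the rows of b."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)


def factor_posteriors(X, y, x_new, factor_vars, noise=1e-2):
    """Posterior mean and variance of every factor f^(i) at x_new.

    X: (t, d) past inputs, y: (t,) noisy observations of the sum f,
    x_new: (d,) query point, factor_vars: list of index lists V_i,
    noise: observation noise variance (hypothetical default).
    """
    t = X.shape[0]
    # The kernel of the sum f is the sum of the factor kernels.
    K = sum(rbf(X[:, v], X[:, v]) for v in factor_vars) + noise * np.eye(t)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))  # (K + noise*I)^{-1} y

    means, variances = [], []
    for v in factor_vars:
        # Cross-covariances between factor i at x_new and the t observations.
        k_i = rbf(X[:, v], x_new[v][None, :])[:, 0]
        w = np.linalg.solve(L, k_i)
        means.append(k_i @ alpha)
        variances.append(1.0 - w @ w)  # prior variance is 1 for this kernel
    return np.array(means), np.array(variances)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.uniform(size=(8, 3))
    y = rng.normal(size=8)
    mu, var = factor_posteriors(X, y, rng.uniform(size=3), [[0, 1], [1, 2]])
    print(mu, var)
```

Only the observations of the sum $f$ are used: the factors are never observed individually, which is why every factor posterior shares the same matrix $\bm K + \sigma^2 \bm I$.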

Figures (7)

Figure 1: Performance achieved by the BO algorithms listed in Section \ref{sec:num_res} for (a) the 24d Powell synthetic function, (b) the optimization of the Shannon capacity in a WLAN and (c) the trajectory planning of a rover. The shaded areas indicate the standard error intervals.
Figure 2: The factor graph of the decomposition (\ref{eq:decomposition_example}). The factor nodes and variable nodes are depicted with squares and circles, respectively. In this decomposition, there are $n = 4$ factors and $d = 3$ variables.
Figure 3: Message passing in the factor graph. (Left) The factor nodes compute their variance terms and send them to their variable nodes. Thus, each variable node $j$ receives the variance terms of the factor nodes in $\mathcal{F}_j$. (Right) The variable nodes send all the collected variance terms to each of their factor nodes. Thus, each factor node $i$ receives the variance terms of the factor nodes in $\mathcal{N}_i = \cup_{j \in \mathcal{V}_i} \mathcal{F}_j$.
Figure 4: Performance achieved by the studied BO algorithms for (a) the 2d Six-Hump Camel function, (b) the 6d Hartmann function and (c) the 100d Rastrigin function. The shaded areas indicate the standard error intervals.
Figure 5: Performance of the studied BO algorithms on the cosmological constants fine-tuning problem.
...and 2 more figures
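
The sets used in the captions of Figures 2 and 3 can be computed directly from a decomposition. The sketch below builds $\mathcal{F}_j$ (the factors attached to variable $j$) and $\mathcal{N}_i = \cup_{j \in \mathcal{V}_i} \mathcal{F}_j$ (the factors whose variance terms reach factor $i$ after the two message-passing rounds). The example decomposition with $n = 4$ factors and $d = 3$ variables is hypothetical, since the actual decomposition of Figure 2 is not reproduced here. The function names are likewise illustrative.

```python
from typing import Dict, Set


def variable_to_factors(factor_vars: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """Map each variable j to F_j, the set of factors that depend on it."""
    f_of: Dict[int, Set[int]] = {}
    for i, vars_i in factor_vars.items():
        for j in vars_i:
            f_of.setdefault(j, set()).add(i)
    return f_of


def factor_neighborhoods(factor_vars: Dict[int, Set[int]]) -> Dict[int, Set[int]]:
    """Map each factor i to N_i, the union of F_j over the variables j in V_i."""
    f_of = variable_to_factors(factor_vars)
    return {
        i: set().union(*(f_of[j] for j in vars_i))
        for i, vars_i in factor_vars.items()
    }


def max_factor_size(factor_vars: Dict[int, Set[int]]) -> int:
    """Largest number of variables any single factor depends on."""
    return max(len(vars_i) for vars_i in factor_vars.values())


if __name__ == "__main__":
    # Hypothetical decomposition: factor index -> variable indices V_i.
    V = {1: {1}, 2: {1, 2}, 3: {2, 3}, 4: {3}}
    print(variable_to_factors(V))
    print(factor_neighborhoods(V))
    print(max_factor_size(V))
```

Note that $\mathcal{N}_i$ always contains $i$ itself, because factor $i$ belongs to $\mathcal{F}_j$ for each of its own variables.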

Theorems & Definitions (18)

Proposition 3.4
Theorem 3.5
Theorem 5.1
Theorem 5.2
Corollary 5.3
Lemma D.1
proof
Lemma D.2
proof
Definition F.1: Restricted Prox-Regularity [admm_conv]
...and 8 more
