Information Theoretic Bayesian Optimization over the Probability Simplex

Federico Pavesi, Antonio Candelieri, Noémie Jaquier

TL;DR

This paper introduces $\alpha$-GaBO, a novel family of Bayesian optimization algorithms over the probability simplex, grounded in information geometry, a branch of Riemannian geometry which endows the simplex with a Riemannian metric and a class of connections.

Abstract

Bayesian optimization is a data-efficient technique that has been shown to be extremely powerful for optimizing expensive, black-box, and possibly noisy objective functions. Many applications involve optimizing probabilities and mixtures, which naturally belong to the probability simplex, a constrained non-Euclidean domain defined by non-negative entries summing to one. This paper introduces $\alpha$-GaBO, a novel family of Bayesian optimization algorithms over the probability simplex. Our approach is grounded in information geometry, a branch of Riemannian geometry which endows the simplex with a Riemannian metric and a class of connections. Based on information geometry theory, we construct Matérn kernels that reflect the geometry of the probability simplex, as well as a one-parameter family of geometric optimizers for the acquisition function. We validate our method on benchmark functions and on a variety of real-world applications, including mixtures of components, mixtures of classifiers, and a robotic control task, showing its increased performance compared to constrained Euclidean approaches.

Paper Structure (28 sections, 1 theorem, 32 equations, 5 figures, 2 algorithms)

Key Result

Proposition 0

Let a weight matrix $\Pi \in \mathbb{R}^{N \times K}$ with columns $\Pi_{k} \in \Delta^{N-1}$ for each $k$ and define the task weights as for all $i$. Then $[\alpha_1, \dots, \alpha_N] \in \Delta^{N-1}$.
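
The statement above does not show the formula for the task weights. As a minimal sketch only, assume they are the uniform average of the columns, $\alpha_i = \frac{1}{K}\sum_{k} \Pi_{ik}$; this definition and the helper name `task_weights` are assumptions for illustration, not taken from the paper. Under that assumption the claim follows because each entry is non-negative and each column sums to one, which the code checks numerically:

```python
import numpy as np


def task_weights(Pi):
    """Hypothetical definition: alpha_i = (1/K) * sum_k Pi[i, k]."""
    Pi = np.asarray(Pi, dtype=float)
    return Pi.mean(axis=1)


rng = np.random.default_rng(0)
N, K = 5, 3

# Each Dirichlet sample lies in the simplex; transpose so that samples are columns.
Pi = rng.dirichlet(np.ones(N), size=K).T  # shape (N, K)

alpha = task_weights(Pi)

# Membership in the simplex: non-negative entries that sum to one.
in_simplex = bool(np.all(alpha >= 0.0) and np.isclose(alpha.sum(), 1.0))
print(alpha, in_simplex)
```

The same argument goes through for any convex combination of the columns, so the uniform average is only one choice consistent with the stated conclusion.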

Figures (5)

  • Figure 1: $\alpha$-GaBO leverages the sphere map $\varphi$, which establishes an isometry between the probability simplex $\Delta^{d}$ and the positive orthant $\mathbb{S}^{d}_{\geq0}$ of the sphere. BO on the simplex is performed via equivalent representations on the sphere.
  • Figure 2: Logarithm of the regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{\text{-}1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS on benchmark functions.
  • Figure 3: Regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{\text{-}1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS on optimal mixture datasets.
  • Figure 7: Regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{\text{-}1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS on a mixture of classifiers for the wall-following robot navigation dataset on $\Delta^{7}$. All models outperform the best standalone classifier.
  • Figure 8: Left: Regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{\text{-}1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS for robotic multi-task control. Right: Snapshots of the optimal trajectory with target left and right hand positions.
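
The sphere map mentioned in the Figure 1 caption can be illustrated with the standard square-root embedding from information geometry. The sketch below assumes $\varphi(p) = \sqrt{p}$ onto the unit sphere; the scaling convention used in the paper is not given here (a radius-2 sphere is also common), and the function names are hypothetical. With this convention, the Fisher-Rao distance on the simplex is twice the great-circle distance between the images.

```python
import numpy as np


def sphere_map(p):
    """Square-root embedding: simplex -> positive orthant of the unit sphere (assumed convention)."""
    p = np.asarray(p, dtype=float)
    return np.sqrt(p)


def fisher_rao_distance(p, q):
    """Fisher-Rao distance, computed as 2 * arccos(<sqrt(p), sqrt(q)>)."""
    inner = np.dot(sphere_map(p), sphere_map(q))
    # Clip to guard against round-off pushing the inner product outside [-1, 1].
    return 2.0 * np.arccos(np.clip(inner, -1.0, 1.0))


p = np.array([0.2, 0.3, 0.5])
q = np.array([0.6, 0.3, 0.1])

x = sphere_map(p)
print(np.linalg.norm(x))          # the image has unit norm because sum(p) = 1
print(fisher_rao_distance(p, q))  # geodesic distance under the Fisher metric
```

Working on the sphere replaces the sum-to-one constraint with a unit-norm constraint, for which geodesics and distances have closed forms; a point on the positive orthant is mapped back to the simplex by squaring its entries.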

Theorems & Definitions (2)

  • Proposition 0
  • proof