Information Theoretic Bayesian Optimization over the Probability Simplex

Federico Pavesi; Antonio Candelieri; Noémie Jaquier

TL;DR

This paper introduces $\alpha$-GaBO, a novel family of Bayesian optimization algorithms over the probability simplex, grounded in information geometry, a branch of Riemannian geometry which endows the simplex with a Riemannian metric and a class of connections.

Abstract

Bayesian optimization is a data-efficient technique that has proven highly effective for optimizing expensive, black-box, and possibly noisy objective functions. Many applications involve optimizing probabilities and mixtures, which naturally belong to the probability simplex, a constrained non-Euclidean domain defined by non-negative entries summing to one. This paper introduces $\alpha$-GaBO, a novel family of Bayesian optimization algorithms over the probability simplex. Our approach is grounded in information geometry, a branch of Riemannian geometry which endows the simplex with a Riemannian metric and a class of connections. Based on information geometry theory, we construct Matérn kernels that reflect the geometry of the probability simplex, as well as a one-parameter family of geometric optimizers for the acquisition function. We validate our method on benchmark functions and on a variety of real-world applications including mixtures of components, mixtures of classifiers, and a robotic control task, showing improved performance compared to constrained Euclidean approaches.

Paper Structure (28 sections, 1 theorem, 32 equations, 5 figures, 2 algorithms)

Introduction
Related work
Optimization of convex functions.
Optimization of black-box functions.
Background
Riemannian manifolds
Information geometry of the probability simplex
Bayesian optimization
Geometry-aware BO
Kernels on manifolds.
Acquisition function optimization on manifolds.
$\alpha$-simplex Geometry-aware Bayesian Optimization
Kernels on the probability simplex
Acquisition function optimization
Experiments
...and 13 more sections

Key Result

Proposition 0

Let $\Pi \in \mathbb{R}^{N \times K}$ be a weight matrix with columns $\Pi_{k} \in \Delta^{N-1}$ for each $k$, and define the task weights $\alpha_i$ as […] for all $i$. Then $[\alpha_1, \dots, \alpha_N] \in \Delta^{N-1}$.
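
The formula defining the task weights does not appear in the statement above, so the following is only an illustrative sketch under an assumed definition, not the paper's construction. It assumes $\alpha = \Pi w$ for hypothetical mixing weights $w \in \Delta^{K-1}$ (uniform $w$ gives the row-wise average of $\Pi$). Under that assumption the claim follows because $\sum_i \alpha_i = \sum_k w_k \sum_i \Pi_{ik} = \sum_k w_k = 1$ and every term is non-negative. The names `Pi`, `w`, and `alpha` are chosen for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 4, 3

# Each column of Pi is a point of the simplex Delta^{N-1}:
# dirichlet(...) returns K samples of length N, so transpose to shape (N, K).
Pi = rng.dirichlet(np.ones(N), size=K).T

# ASSUMPTION: task weights are a convex combination of the columns of Pi.
# Uniform mixing weights w reduce this to a row-wise average.
w = np.full(K, 1.0 / K)
alpha = Pi @ w

# alpha has non-negative entries that sum to one, so it lies in Delta^{N-1}.
print(alpha, alpha.sum())
```

Any other $w$ with non-negative entries summing to one passes the same check, since a convex combination of simplex points stays in the simplex.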

Figures (5)

Figure 1: $\alpha$-GaBO leverages the sphere map $\varphi$, which establishes an isometry between the probability simplex $\Delta^{d}$ and the positive orthant $\mathbb{S}^{d}_{\geq0}$ of the sphere. BO on the simplex is performed via equivalent representations on the sphere.
Figure 2: Logarithm of the regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{-1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS on benchmark functions.
Figure 3: Regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{-1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS on optimal mixture datasets.
Figure 7: Regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{-1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS on a mixture of classifiers for the wall-following robot navigation dataset on $\Delta^{7}$. All models outperform the best standalone classifier.
Figure 8: Left: Regret (median and quartiles) and distribution over the final recommendation for $\alpha_0$-GaBO, $\alpha_{-1}$-GaBO, $\mathbb{S}^{d}$-Eucl. BO, and BORIS for robotic multi-task control. Right: Snapshots of the optimal trajectory with target left-hand and right-hand positions.
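
The sphere map mentioned in the caption of Figure 1 can be sketched numerically. This is a minimal illustration using the standard square-root map from information geometry, $\varphi(p) = \sqrt{p}$ taken elementwise, which sends the simplex onto the non-negative orthant of the unit sphere. The scaling convention used here (unit sphere, with a factor of 2 in the Fisher-Rao distance) is an assumption, and the paper may normalize differently. The function names are chosen for this example.

```python
import numpy as np

def sphere_map(p):
    """Map a simplex point to the non-negative orthant of the unit sphere."""
    return np.sqrt(np.asarray(p, dtype=float))

def inverse_sphere_map(x):
    """Map a point of the non-negative orthant of the sphere back to the simplex."""
    return np.asarray(x, dtype=float) ** 2

def fisher_rao_distance(p, q):
    """Fisher-Rao distance: twice the great-circle angle between the sphere images."""
    cos_angle = np.clip(np.dot(sphere_map(p), sphere_map(q)), -1.0, 1.0)
    return 2.0 * np.arccos(cos_angle)

p = np.array([0.2, 0.3, 0.5])
q = np.array([0.6, 0.1, 0.3])

x = sphere_map(p)
print(np.linalg.norm(x))          # unit norm, since the entries of p sum to one
print(inverse_sphere_map(x))      # recovers p
print(fisher_rao_distance(p, q))  # geodesic distance measured on the sphere
```

Working with the sphere representation lets distances, and hence distance-based kernels, be evaluated with spherical geometry and then mapped back to the simplex through the inverse map.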

Theorems & Definitions (2)

Proposition 0
proof
