Robust Entropy Search for Safe Efficient Bayesian Optimization

Dorina Weichert; Alexander Kister; Sebastian Houben; Patrick Link; Gunar Ernis

TL;DR

This paper tackles robust Bayesian Optimization under adversarial perturbations by introducing Robust Entropy Search (RES), an information-based acquisition function that targets the robustness characteristics $\big(\bm{h}, g, f^\star\big)_f$ via mutual information. RES uses function samples based on random Fourier features to efficiently generate candidate robust functions, computes a conditioned posterior that enforces worst-case constraints, and assembles an acquisition function that reduces uncertainty about the robust optimum. Empirical results on synthetic benchmarks and real-world problems (e.g., FEM parameter calibration and robot pushing) show that RES consistently outperforms non-robust baselines and StableOpt in finding robust optima with fewer evaluations. The approach offers a practical, hyperparameter-free, sample-efficient pathway to safe, robust optimization in engineering and robotics, with potential extensions to model selection, multi-fidelity, and multi-objective settings.
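
To illustrate the function-sampling step mentioned above, the following is a minimal sketch of drawing one approximate Gaussian process sample with random Fourier features. It is not the authors' implementation: it assumes an RBF kernel, draws from the prior rather than the posterior, and the name `sample_rff_function` and all of its parameters are hypothetical.

```python
import numpy as np

def sample_rff_function(dim, num_features=500, lengthscale=1.0,
                        signal_var=1.0, rng=None):
    """Return a callable f(Z) that approximates one GP prior sample.

    Assumption: RBF kernel with the given lengthscale and signal variance.
    The input z would stack controllable and uncontrollable parameters.
    """
    rng = np.random.default_rng(rng)
    # The spectral density of the RBF kernel is Gaussian with
    # standard deviation 1 / lengthscale.
    W = rng.normal(scale=1.0 / lengthscale, size=(num_features, dim))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    # Standard normal feature weights yield a sample from the prior.
    weights = rng.normal(size=num_features)

    def f(Z):
        Z = np.atleast_2d(Z)  # shape (n, dim)
        phi = np.sqrt(2.0 * signal_var / num_features) * np.cos(Z @ W.T + b)
        return phi @ weights  # shape (n,)

    return f

# Usage: evaluate one sampled function at three 2-D locations.
f_sample = sample_rff_function(dim=2, rng=0)
values = f_sample(np.array([[0.0, 0.0], [0.5, -0.5], [1.0, 1.0]]))
```

Because the sample is an explicit function of its input, it can be evaluated and optimized at arbitrary locations at low cost, which is what makes such samples attractive for generating candidate robust optima.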

Abstract

The practical use of Bayesian Optimization (BO) in engineering applications imposes special requirements: high sampling efficiency on the one hand and finding a robust solution on the other hand. We address the case of adversarial robustness, where all parameters are controllable during the optimization process, but a subset of them is uncontrollable or even adversely perturbed at the time of application. To this end, we develop an efficient information-based acquisition function that we call Robust Entropy Search (RES). We empirically demonstrate its benefits in experiments on synthetic and real-life data. The results show that RES reliably finds robust optima, outperforming state-of-the-art algorithms.

Paper Structure (41 sections, 21 equations, 11 figures, 3 tables, 2 algorithms)

Introduction
Motivation
Contributions
Related Work
Background
Gaussian Process Regression
Properties of the Robust Optimum
Robust Entropy Search
Methodical Idea
Implementation
Efficient Treatment of Function Samples
Calculating the Conditioned Posterior Probability Distribution
Step 1: Conditioning the GP at the training data points.
Step 2: Creating a predictive distribution for a new location $\bm{z}$.
Step 3: Conditioning the predictions.
...and 26 more sections

Figures (11)

Figure 1: Two-dimensional objective function $f(\bm{x}, \bm{\theta})$ and derived maximizing function $g(\bm{x}) = \max_{\bm{\theta}} f(\bm{x}, \bm{\theta})$. In the given example, the location of the global robust optimum ($\blacklozenge$) is ambiguous. The optima are neither the global maximum ($\blacktriangleright$), the global minimum ($\lhd$) nor the smallest local min-max point ($\bullet$). The values of the argmax function $\bm{h}(\bm{x})$ are rendered as a white line in Figure 1a. The function values at these points define the maximizing function $g(\bm{x})$, given in Figure 1b.
Figure 2: Predictive distributions (mean and one standard deviation) before and after conditioning for a single uncontrollable parameter with two values $\theta_1$ (blue) and $\theta_2$ (orange). In this case, $\bm{h}_c(\bm{x}) = \theta_1~\forall~\bm{x}$ (blue) with the max function $f_c(\bm{x}, \theta_1) = g_c(\bm{x})$ (black). While the predictive distribution $f(\bm{x}, \theta_2)$ is only upper bounded by the sample $g_c(\bm{x})$, $f(\bm{x}, \theta_1)$ is upper bounded by the sample $g_c(\bm{x})$ and lower bounded by the optimum $f_c^\star$ ($\blacklozenge$).
Figure 3: Regrets for the two-dimensional, continuous within-model comparison. We present the median and the upper and lower quartiles for 50 mean functions. The number after the algorithm indicates the value of the hyperparameter ($C$ for RES and $\sqrt{\beta}$ for StableOpt). The results show that the non-robust methods fail and that the RES acquisition function performs slightly better than StableOpt, with the advantage of being hyperparameter-free.
Figure 4: Results of the experiments with synthetic functions. The marker after the name of the problem indicates the dimensionality, the type of input space (continuous or discrete) and, if discrete, the number of discrete parameters. For StableOpt, we give the value of the exploration constant $\sqrt{\beta}$ after the algorithm name. Our approach, RES, with a number of samples $C = 1$, shows superior performance on nearly all problems.
Figure 5: Deep drawing: schematic illustration, force-time diagrams and regret curves for simulations with different parameters.
...and 6 more figures
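
The quantities named in the captions of Figures 1 and 2 can be made concrete on a grid. The sketch below computes the argmax function $\bm{h}(\bm{x})$, the maximizing function $g(\bm{x})$, and the robust optimum as the minimum of $g$. The helper `robust_characteristics` and the toy objective $f(x, \theta) = (x - \theta)^2$ are assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def robust_characteristics(F, xs, thetas):
    """Compute (h, g, x_star, f_star) from a grid of function values.

    F[i, j] holds f(xs[i], thetas[j]).
    """
    # Worst-case (maximizing) uncontrollable parameter for every x.
    # np.argmax returns the first index on ties.
    worst_idx = np.argmax(F, axis=1)
    h = thetas[worst_idx]
    # Maximizing function g(x) = max_theta f(x, theta).
    g = F[np.arange(len(xs)), worst_idx]
    # Robust optimum: the x with the smallest worst-case value.
    best = np.argmin(g)
    return h, g, xs[best], g[best]

# Toy objective (hypothetical): f(x, theta) = (x - theta)^2.
xs = np.linspace(-1.0, 1.0, 5)       # controllable parameter grid
thetas = np.array([-0.5, 0.5])       # two adversarial settings
F = (xs[:, None] - thetas[None, :]) ** 2

h, g, x_star, f_star = robust_characteristics(F, xs, thetas)
print(x_star, f_star)
```

In this toy case the non-robust minimum of $f$ lies at $x = \theta$ with value zero, whereas the robust optimum sits between the two adversarial settings, where the worst case is smallest.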
