Information-Theoretic Safe Bayesian Optimization

Alessandro G. Bottero, Carlos E. Luis, Julia Vinogradska, Felix Berkenkamp, Jan Peters

TL;DR

This work tackles safe Bayesian optimization when the safety constraint is unknown, formulating the problem to maximize an unknown objective while safely exploring the domain. It introduces Information-Theoretic Safe Exploration (ISE), which directly maximizes information gain about parameter safety, and combines it with Max-Value Entropy Search (MES) to yield ISE-BO, a method that naturally handles continuous domains without extra hyperparameters. Theoretical results show that the approach expands the largest reachable safe set and converges to the safe optimum within that set with arbitrary precision, while empirical evaluations demonstrate improved data-efficiency and scalability across synthetic, high-noise, and control tasks. The proposed framework offers a principled, information-driven mechanism for safe exploration and optimization with practical impact for robotics and safety-critical systems.

Abstract

We consider a sequential decision-making task, where the goal is to optimize an unknown function without evaluating parameters that violate an a priori unknown (safety) constraint. A common approach is to place a Gaussian process prior on the unknown functions and allow evaluations only in regions that are safe with high probability. Most current methods rely on a discretization of the domain and cannot be directly extended to the continuous case. Moreover, the way in which they exploit regularity assumptions about the constraint introduces an additional critical hyperparameter. In this paper, we propose an information-theoretic safe exploration criterion that directly exploits the GP posterior to identify the most informative safe parameters to evaluate. The combination of this exploration criterion with a well-known Bayesian optimization acquisition function yields a novel safe Bayesian optimization selection criterion. Our approach is naturally applicable to continuous domains and does not require additional explicit hyperparameters. We theoretically analyze the method and show that we do not violate the safety constraint with high probability and that we learn about the value of the safe optimum up to arbitrary precision. Empirical evaluations demonstrate improved data-efficiency and scalability.
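To make "safe with high probability" concrete, the following is a minimal sketch of one common way to build a safe set from a GP posterior: keep the parameters whose lower confidence bound on the constraint $s$ lies above the threshold $0$. The RBF kernel, its lengthscale, the noise level, the value `beta = 2.0`, the seed observation, and all function names are assumptions made for illustration and are not taken from the paper. The grid is used only to visualize the set; the method itself targets continuous domains.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5, variance=1.0):
    """Squared-exponential kernel between two 1D arrays of inputs (assumed kernel)."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise_std=0.05):
    """Zero-mean GP posterior mean and standard deviation at x_query."""
    k_tt = rbf_kernel(x_train, x_train) + noise_std**2 * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    chol = np.linalg.cholesky(k_tt)
    # Solve K^{-1} y via the Cholesky factor for numerical stability.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_qt @ alpha
    v = np.linalg.solve(chol, k_qt.T)
    var = rbf_kernel(x_query, x_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def safe_mask(mean, std, beta=2.0):
    """True where the lower confidence bound on s is at or above the threshold 0."""
    return mean - beta * std >= 0.0

# Hypothetical safe seed: one observation s(0) = 1, well above the threshold.
x_seed = np.array([0.0])
y_seed = np.array([1.0])
grid = np.linspace(-2.0, 2.0, 81)

mean, std = gp_posterior(x_seed, y_seed, grid)
mask = safe_mask(mean, std)
print("safe interval on the grid:", grid[mask].min(), "to", grid[mask].max())
```

Far from the seed the posterior reverts to the prior, so the lower bound drops below the threshold and those parameters are excluded until nearby evaluations shrink the uncertainty.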

Paper Structure (30 sections, 17 theorems, 44 equations, 11 figures, 1 table, 2 algorithms)

Key Result

Lemma 1

Choose $\delta \in (0, 1)$ and let $h$, as defined in eq:h_definition, be an element of the RKHS $\mathcal{H}_k$ associated with some kernel $k$, with $\lVert h \rVert_{\mathcal{H}_k} = B < \infty$. Moreover, let the sequence of positive numbers $\{\beta_n\}$ be as in eq:beta_def. Then if we define the safe s

Figures (11)

  • Figure 1: In (\ref{fig:problem_statement_example}) we illustrate the safe optimization task. Based on the unknown safety constraint $s$, we are only allowed to evaluate parameters $\bm x$ with values $s(\bm x)$ above the safety threshold (dashed line). Starting from a safe seed $\bm x_0$, a safe optimization strategy needs to find the optimum of the unknown objective $f$ within the largest reachable safe region of the parameter space containing $\bm x_0$. In (\ref{fig:mutual_information}) we show the mutual information $I_n(\{\bm x, y\}; \mathbb{I}_{s(\cdot) \geq 0}(\bm z))$ in green for different parameters $\bm x$ inside the safe set and for a fixed $\bm z$ outside (red dashed line). The exploration part of our algorithm maximizes this quantity jointly over $\bm x$ and $\bm z$.
  • Figure 2: 1D example that illustrates why an explicit exploration component that promotes the expansion of the safe set is crucial in safe BO. Here the function in the top half, $f$, is the unknown objective to optimize, while the lower half shows the unknown safety constraint, $s$. The algorithm that generated the plots in the figure chooses the next parameter to evaluate as the one that maximizes the pure MES acquisition function for $f$, constrained within the current safe set $S_n$. We can see that the safe optimum (pink square) is outside the initial safe set (left plots), but the acquisition function has no interest in evaluating points that could expand the safe set further on the right side, since the right boundary of the safe set clearly has a negligible probability of containing the optimum. Instead, as the plots on the right show, the MES acquisition function focuses on the left boundary of the safe set, since that is the likely location of the optimum within the current safe set.
  • Figure 3: Example of the two components of the acquisition function. For simplicity, here the objective $f$ and the constraint $s$ are the same function. The two plots show the values of the safe set expansion component ($\alpha^{\text{ISE}}$) and the optimization component ($\alpha^{\text{MES}}$) of the acquisition function inside the safe set ($S_n$), against the GP posterior mean and confidence interval (the blue curve and shaded area, respectively), with the safety threshold $s(\bm x) = f(\bm x) = 0$ indicated by the orange dashed line. At early stages (\ref{fig:acquisition_example_1}) we see that both components achieve their maximum on the boundary of the safe set, since, according to the GP posterior, it is promising both as a region that contains the optimum and as a region that can give us much information about the safety of parameters outside $S_n$. The plot in (\ref{fig:acquisition_example_2}), however, shows that, as we continue sampling on the boundary, $\alpha^{\text{MES}}$ vanishes there, since it is unlikely that the optimum lies in these neighborhoods. In contrast, $\alpha^{\text{ISE}}$ remains non-negligible on the boundary, as these parameters can still give us information about the safety constraint outside of the current $S_n$, possibly leading to its expansion.
  • Figure 4: 1D example that illustrates \ref{alg:algorithm}. The top row shows the objective function $f$ with the corresponding posterior GP confidence interval at various iterations, while the bottom row does the same for the safety constraint $s$. The dashed orange line represents the safety threshold, while the black crosses mark the observed values $y_n^{f/s}$ at the evaluated parameters $\bm x_n$. We see how the acquisition function \ref{eq:combined_acqusisition_1} selects the next parameter $\bm x_{n+1}$ (green dashed line), alternating between parameters that are likely to expand the safe set ($n=0$, $n=8$) and parameters that are informative about the current safe optimum ($n=1$, $n=9$), until it is able to identify the reachable safe optimum (pink square).
  • Figure 5: Performance of ISE-BO compared to SafeOpt and both constrained and unconstrained versions of MES, when the objective and constraint functions are samples drawn from a GP. In both (\ref{fig:2d_gp_samples_separate}) and (\ref{fig:2d_gp_samples_same}) we can see the average simple regret \ref{eq:simple_regret_def} and its standard error for the tested methods over 50 random seeds. In (\ref{fig:2d_gp_samples_same}), the same GP sample served as both the objective $f$ and the constraint $s$, while in (\ref{fig:2d_gp_samples_separate}), for each random seed, two different GP samples were drawn to serve, respectively, as the objective and the constraint.
  • ...and 6 more figures
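
The mutual information in the Figure 1 caption, $I_n(\{\bm x, y\}; \mathbb{I}_{s(\cdot) \geq 0}(\bm z))$, can be illustrated with a brute-force Monte Carlo estimate. The sketch below assumes that $s(\bm x)$ and $s(\bm z)$ are jointly Gaussian under the current posterior and that $y$ is $s(\bm x)$ plus Gaussian noise. It computes the entropy of the safety indicator at $\bm z$ before observing $y$, minus its expected entropy afterwards. This is an illustration of the quantity only, not necessarily how the authors evaluate it; the function names, the noise variance, and the example numbers are hypothetical.

```python
import numpy as np
from math import erf, sqrt

def normal_cdf(t):
    """Standard normal CDF, applied elementwise to an array."""
    return 0.5 * (1.0 + np.vectorize(erf)(np.asarray(t) / sqrt(2.0)))

def binary_entropy(p):
    """Entropy in nats of a Bernoulli variable with success probability p."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))

def safety_mutual_information(mu_x, var_x, mu_z, var_z, cov_xz,
                              noise_var=0.01, n_samples=20000, seed=0):
    """Monte Carlo estimate of I({x, y}; 1[s(z) >= 0]) for jointly Gaussian s(x), s(z)."""
    rng = np.random.default_rng(seed)
    var_y = var_x + noise_var
    # Sample hypothetical noisy observations y at x from the predictive distribution.
    y = mu_x + np.sqrt(var_y) * rng.standard_normal(n_samples)
    # Gaussian conditioning: posterior of s(z) after observing y.
    post_mean = mu_z + cov_xz / var_y * (y - mu_x)
    post_var = var_z - cov_xz**2 / var_y
    prior_entropy = float(binary_entropy(normal_cdf(mu_z / np.sqrt(var_z))))
    p_safe = normal_cdf(post_mean / np.sqrt(post_var))
    return prior_entropy - float(binary_entropy(p_safe).mean())

# Hypothetical posterior moments: x is inside the safe set, z is just outside.
mi_strong = safety_mutual_information(0.5, 0.2, -0.2, 1.0, cov_xz=0.4)
mi_weak = safety_mutual_information(0.5, 0.2, -0.2, 1.0, cov_xz=0.1)
mi_none = safety_mutual_information(0.5, 0.2, -0.2, 1.0, cov_xz=0.0)
print(mi_strong, mi_weak, mi_none)
```

An evaluation at $\bm x$ is informative about the safety of $\bm z$ only through the posterior covariance between the two points, which is why the estimate shrinks as `cov_xz` decreases and vanishes when the points are uncorrelated.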

Theorems & Definitions (22)

  • Lemma 1
  • Definition 1: Safe optimum $f^*_\varepsilon$
  • Lemma 2
  • Definition 2: $\bm \beta$-GP$_s^\varepsilon(\Omega)$
  • Definition 3: Expansion operator $R_\varepsilon$
  • Definition 4: Largest reachable safe set $S_\varepsilon(\bm x_0)$
  • Lemma 3
  • Theorem 1
  • Lemma 3
  • Lemma 4
  • ...and 12 more