PACSBO: Probably approximately correct safe Bayesian optimization

Abdullah Tokmak, Thomas B. Schön, Dominik Baumann

TL;DR

This work addresses safety-constrained optimization with unknown system dynamics by removing the need for a known RKHS-norm bound and replacing it with a data-driven estimator. The authors introduce PACSBO, which combines RKHS-norm estimation via a learned predictor with a local interpretation of the RKHS norm to reduce conservatism and improve exploration. They provide a PAC-style theoretical guarantee for the norm over-estimation and demonstrate improvements over SafeOpt in both simulations and a Furuta pendulum hardware experiment. The approach offers a practical path toward safer, more data-efficient BO in real-world control settings, with potential for extension to more scalable designs and predictor refinements.

Abstract

Safe Bayesian optimization (BO) algorithms promise to find optimal control policies without knowing the system dynamics while at the same time guaranteeing safety with high probability. In exchange for those guarantees, popular algorithms require a smoothness assumption: a known upper bound on a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it is unclear how to, in practice, obtain an upper bound of an unknown function in its corresponding RKHS. In response, we propose an algorithm that estimates an upper bound on the RKHS norm of an unknown function from data and investigate its theoretical properties. Moreover, akin to Lipschitz-based methods, we treat the RKHS norm as a local rather than a global object, and thus reduce conservatism. Integrating the RKHS norm estimation and the local interpretation of the RKHS norm into a safe BO algorithm yields PACSBO, an algorithm for probably approximately correct safe Bayesian optimization, for which we provide numerical and hardware experiments that demonstrate its applicability and benefits over popular safe BO algorithms.

Paper Structure

This paper contains 13 sections, 2 theorems, 6 equations, 6 figures, and 3 algorithms.

Key Result

Theorem

Consider any $i \in \mathcal{I}$ and let Assumption asm:expectation hold. If [...], then $B_i \geq \|h(\cdot, i)\|_k$ with a probability of at least $1-\delta$, where $\delta \in (0,1)$.

Figures (6)

  • Figure 1: Different RKHS norms.
  • Figure 2: RKHS norm over $r_{\mathcal{A},A}$.
  • Figure 3: Random RKHS functions and their RKHS norms. The upper sub-figures show the ground truth $h(\cdot,i)$ (red line) and the random RKHS functions $\rho_{A,j}(\cdot,i)$ (blue lines) given samples $A$ (black dots), while the lower sub-figures illustrate the frequency of the RKHS norms $\|\rho_{A,j}(\cdot,i)\|_k$. For an increasing sample set, $\|\rho_{A,j}(\cdot,i)\|_k$ approaches $\|h(\cdot,i)\|_k=1$, whereas $\|\rho_{A,j}(\cdot,i)\|_k$ tends to conservatively over-estimate $\|h(\cdot,i)\|_k$ for fewer samples. We used the Matérn-32 kernel with length scale $\ell=0.1$, $\hat{N}=100$, and $\overline\alpha=1$. We sampled the parameters $A$ randomly from a uniform distribution. The upper sub-figures show a random subset of 30 out of the $q=5\cdot 10^3$ random RKHS functions.
  • Figure 4: PACSBO vs. SafeOpt. Under a conservative smoothness assumption, PACSBO (upper figures) explores faster than SafeOpt (lower figures).
  • Figure 5: PACSBO vs. SafeOpt. An optimistic smoothness assumption yields unsafe experiments (red cross) with SafeOpt (bottom), whereas PACSBO (top) stays safe.
  • ...and 1 more figure
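
The Figure 3 caption refers to RKHS norms of functions built from a Matérn-32 kernel. As a minimal sketch of how such a norm can be evaluated, the snippet below uses the standard identity that a finite kernel expansion $f=\sum_s \alpha_s k(\cdot, x_s)$ has RKHS norm $\|f\|_k=\sqrt{\alpha^\top K \alpha}$, where $K$ is the kernel matrix of the centers $x_s$. The one-dimensional domain $[0,1]$, the uniformly drawn centers and coefficients, and the names `matern32` and `rkhs_norm` are assumptions made for illustration. This is not the paper's construction of the random RKHS functions $\rho_{A,j}$, which additionally depend on the samples $A$.

```python
import numpy as np


def matern32(x, y, length_scale=0.1):
    """Matérn-3/2 kernel matrix between two 1-D point sets x and y."""
    r = np.abs(x[:, None] - y[None, :]) / length_scale
    return (1.0 + np.sqrt(3.0) * r) * np.exp(-np.sqrt(3.0) * r)


def rkhs_norm(alpha, centers, length_scale=0.1):
    """RKHS norm of f = sum_s alpha_s k(., centers_s), i.e. sqrt(alpha^T K alpha)."""
    K = matern32(centers, centers, length_scale)
    # Clamp at zero to guard against tiny negative values from round-off.
    return float(np.sqrt(max(alpha @ K @ alpha, 0.0)))


rng = np.random.default_rng(0)
n_hat, alpha_bar = 100, 1.0  # values quoted in the Figure 3 caption

# Illustrative assumption: centers uniform on [0, 1], coefficients
# uniform on [-alpha_bar, alpha_bar].
centers = rng.uniform(0.0, 1.0, size=n_hat)
alpha = rng.uniform(-alpha_bar, alpha_bar, size=n_hat)
norm = rkhs_norm(alpha, centers)

# Sanity check: one center with unit coefficient gives sqrt(k(x, x)) = 1.
unit_norm = rkhs_norm(np.array([1.0]), np.array([0.3]))
```

Because the norm is a quadratic form in the coefficients, scaling `alpha` by a constant scales the norm by the absolute value of that constant, which gives a quick check of any implementation.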

Theorems & Definitions (7)

  • Remark
  • Theorem
  • Proof (idea)
  • Remark
  • Remark
  • Proposition
  • Proof