PACSBO: Probably approximately correct safe Bayesian optimization
Abdullah Tokmak, Thomas B. Schön, Dominik Baumann
TL;DR
This work addresses safety-constrained optimization under unknown system dynamics. It removes the need for a known RKHS-norm bound and replaces that bound with an estimate computed from data. The authors introduce PACSBO, which combines RKHS-norm estimation via a learned predictor with a local interpretation of the RKHS norm, reducing conservatism and improving exploration. They provide a PAC-style theoretical guarantee for the over-estimation of the norm and demonstrate improvements over SafeOpt both in simulation and in a hardware experiment on a Furuta pendulum. The approach offers a practical path toward safer, more data-efficient BO in real-world control settings, with room for more scalable designs and refinements of the predictor.
Abstract
Safe Bayesian optimization (BO) algorithms promise to find optimal control policies without knowing the system dynamics while at the same time guaranteeing safety with high probability. In exchange for those guarantees, popular algorithms require a smoothness assumption: a known upper bound on a norm in a reproducing kernel Hilbert space (RKHS). The RKHS is a potentially infinite-dimensional space, and it is unclear how to obtain, in practice, an upper bound on the norm of an unknown function in its corresponding RKHS. In response, we propose an algorithm that estimates an upper bound on the RKHS norm of an unknown function from data and investigate its theoretical properties. Moreover, akin to Lipschitz-based methods, we treat the RKHS norm as a local rather than a global object, and thus reduce conservatism. Integrating the RKHS norm estimation and the local interpretation of the RKHS norm into a safe BO algorithm yields PACSBO, an algorithm for probably approximately correct safe Bayesian optimization, for which we provide numerical and hardware experiments that demonstrate its applicability and benefits over popular safe BO algorithms.
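To make the role of the RKHS norm concrete, the following minimal sketch computes the RKHS norm of the minimum-norm kernel interpolant of a data set, sqrt(y^T K^{-1} y). This quantity is a standard lower bound on the RKHS norm of any function in the RKHS that matches the data, and it can only shrink when it is computed from a subset of the samples. The sketch is not the PACSBO estimator, which targets an upper bound. The squared-exponential kernel, the lengthscale, the jitter, the synthetic ground-truth function, and the helper names `se_kernel` and `interpolant_rkhs_norm` are all assumptions made for illustration.

```python
import numpy as np


def se_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel matrix between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)


def interpolant_rkhs_norm(x, y, lengthscale=0.2, jitter=1e-10):
    """RKHS norm of the minimum-norm interpolant of (x, y): sqrt(y^T K^{-1} y)."""
    K = se_kernel(x, x, lengthscale) + jitter * np.eye(len(x))
    alpha = np.linalg.solve(K, y)
    return float(np.sqrt(y @ alpha))


# Synthetic ground truth (an assumption for this sketch): a finite kernel
# expansion f = sum_i c_i k(., z_i), whose RKHS norm is known in closed form,
# ||f||^2 = c^T K_zz c.
rng = np.random.default_rng(0)
z = np.linspace(0.0, 1.0, 8)
c = rng.normal(size=8)
true_norm = float(np.sqrt(c @ se_kernel(z, z) @ c))

# Noise-free samples of f on the whole domain [0, 1].
x_all = np.linspace(0.0, 1.0, 6)
y_all = se_kernel(x_all, z) @ c

# Interpolant norm from all samples, and from the samples in [0, 0.5] only.
norm_all = interpolant_rkhs_norm(x_all, y_all)
mask = x_all <= 0.5
norm_local = interpolant_rkhs_norm(x_all[mask], y_all[mask])

print(f"true RKHS norm of f:           {true_norm:.4f}")
print(f"interpolant norm, all samples: {norm_all:.4f}")
print(f"interpolant norm, x <= 0.5:    {norm_local:.4f}")
```

The interpolant norm never exceeds the true norm, so a safe BO algorithm that needs an upper bound cannot use it directly; this gap is what a data-driven over-estimate with a probabilistic guarantee is meant to close. The smaller value obtained from the restricted sample set gives a rough intuition for why a local view of the RKHS norm can be less conservative than a global one.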
