Early Stopping Bayesian Optimization for Controller Tuning
David Stenger, Dominik Scheurenberg, Heike Vallery, Sebastian Trimpe
TL;DR
The paper addresses the inefficiency of fixed-length episode evaluations in Bayesian Optimization (BO) for controller tuning by introducing Early Stopping BO (ESBO), which terminates an episode as soon as suboptimality is detectable and uses three heuristics to handle the resulting partial observations. The three variants, ESBO-C, ESBO-TR, and ESBO-GP, offer different strategies for incorporating partial information into the Gaussian process (GP) surrogate, enabling data-efficient optimization of time-integrated costs $J(\boldsymbol{\theta}) = \sum_{t=1}^{T_{\max}} j(u_t,y_t)$. Empirical results across five simulations and one hardware experiment show that ESBO variants can substantially reduce optimization time (up to ~48% in simulation and ~35% on hardware) while maintaining final performance comparable to vanilla BO, with ESBO-GP generally delivering the best trade-off. The work provides a practical, data-efficient approach to controller tuning where evaluating full episodes is costly, and introduces a new problem class in BO for control involving early termination and partial data.
Abstract
Manual tuning of performance-critical controller parameters can be tedious and sub-optimal. Bayesian Optimization (BO) is an increasingly popular practical alternative for automatically optimizing controller parameters from few experiments. Standard BO practice is to evaluate the closed-loop performance of each proposed parameter set on an episode of fixed length. However, fixed-length episodes can be wasteful. For example, continuing an episode whose start already shows undesirable behavior, such as strong oscillations, seems pointless. Therefore, we propose a BO method that stops an episode early if suboptimality becomes apparent before the episode is completed. Such early stopping results in partial observations of the controller's performance, which cannot directly be included in standard BO. We propose three heuristics to incorporate partially observed episodes into BO. Through five numerical experiments and one hardware experiment, we demonstrate that early stopping BO can substantially reduce the time needed for optimization.
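To make the idea of early stopping concrete, the following minimal sketch accumulates the stage costs $j(u_t,y_t)$ during an episode and aborts once the running sum exceeds the best full-episode cost seen so far. This stopping rule is an assumption made for illustration, not necessarily the criterion used in the paper; it relies on non-negative stage costs, so that the partial sum is a lower bound on $J(\boldsymbol{\theta})$. The scalar plant, the gains, and the names `make_closed_loop` and `run_episode` are hypothetical, and the three heuristics for feeding partial observations into the GP are not reproduced here.

```python
from typing import Callable, Optional, Tuple


def make_closed_loop(gain: float, a: float = 1.1, b: float = 0.5,
                     r: float = 0.1, x0: float = 1.0) -> Callable[[], float]:
    """Hypothetical scalar plant x_{t+1} = a*x_t + b*u_t with u_t = -gain*x_t.

    Each call to the returned function advances the loop by one step and
    returns the non-negative stage cost x_t^2 + r*u_t^2.
    """
    state = {"x": x0}

    def step() -> float:
        x = state["x"]
        u = -gain * x
        state["x"] = a * x + b * u
        return x * x + r * u * u

    return step


def run_episode(step: Callable[[], float], t_max: int,
                stop_threshold: Optional[float] = None) -> Tuple[float, int, bool]:
    """Sum stage costs for up to t_max steps.

    If stop_threshold is given, abort as soon as the running cost exceeds it
    (assumed rule). Returns (accumulated cost, steps run, episode completed).
    """
    cost = 0.0
    for t in range(1, t_max + 1):
        cost += step()
        if stop_threshold is not None and cost > stop_threshold:
            return cost, t, t == t_max
    return cost, t_max, True


T_MAX = 50

# Full-length episode for a well-damped gain: this sets the incumbent cost.
best_cost, _, _ = run_episode(make_closed_loop(gain=2.0), T_MAX)

# A poor gain is stopped once its partial cost already exceeds the incumbent.
partial_cost, steps_run, completed = run_episode(
    make_closed_loop(gain=0.2), T_MAX, stop_threshold=best_cost
)
print(f"incumbent cost {best_cost:.3f}; "
      f"poor gain stopped after {steps_run} of {T_MAX} steps")
```

Because the poor controller is abandoned after a few steps, most of its episode time is saved. The price is that only a lower bound on its cost is known, which is exactly the partial observation that standard BO cannot use directly.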
