Safe Bayesian Optimization for the Control of High-Dimensional Embodied Systems
Yunyue Wei, Zeji Yi, Hongda Li, Saraswati Soedarmadji, Yanan Sui
TL;DR
HdSafeBO tackles safe optimization of high-dimensional embodied-control policies by combining isometric dimension reduction with a local, optimistic safety strategy and a trust-region search. It provides a probabilistic safety guarantee and a bound on cumulative safety violations while enabling efficient learning even when the input space spans hundreds to thousands of dimensions. The approach is validated on synthetic benchmarks, a high-dimensional musculoskeletal control task, and neural-stimulation-induced human motion, where it outperforms strong baselines in both objective quality and safety metrics. The framework advances practical online safe optimization for complex, high-dimensional robotic systems and human-robot interaction scenarios.
Abstract
Learning to move is a primary goal for animals and robots, and ensuring safety is often important when optimizing control policies on embodied systems. For complex tasks such as the control of humans or humanoids, the high-dimensional parameter space adds complexity to the safe optimization effort. Current safe exploration algorithms are inefficient and may even become infeasible in high-dimensional input spaces. Furthermore, existing high-dimensional constrained optimization methods neglect safety during the search process. In this paper, we propose High-dimensional Safe Bayesian Optimization with local optimistic exploration (HdSafeBO), a novel approach designed to handle high-dimensional sampling problems under probabilistic safety constraints. We introduce a local optimistic strategy to efficiently and safely optimize the objective function, providing a probabilistic safety guarantee and a cumulative safety violation bound. Through the use of isometric embedding, HdSafeBO addresses problems ranging from a few hundred to several thousand dimensions while maintaining safety guarantees. To our knowledge, HdSafeBO is the first algorithm capable of optimizing the control of high-dimensional musculoskeletal systems with high safety probability. We also demonstrate the real-world applicability of HdSafeBO through its use in the safe online optimization of neural-stimulation-induced human motion control.
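To make the ingredients named in the abstract concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a linear map with orthonormal columns as a stand-in for the isometric embedding, a small zero-mean Gaussian process with an RBF kernel as the surrogate, and an upper-confidence-bound test as the "optimistic" safety check inside a box-shaped trust region. All function names and the parameters `beta`, `threshold`, `radius`, and `n_candidates` are hypothetical.

```python
import numpy as np


def orthonormal_embedding(high_dim, low_dim, rng):
    """Return A (high_dim x low_dim) with orthonormal columns.

    The map z -> A @ z preserves Euclidean distances, which makes it a
    simple linear stand-in (an assumption) for an isometric embedding.
    """
    q, _ = np.linalg.qr(rng.standard_normal((high_dim, low_dim)))
    return q


def gp_posterior(z_train, y_train, z_query, lengthscale=1.0, noise=1e-4):
    """Posterior mean and std of a zero-mean GP with a unit-variance RBF kernel."""

    def rbf(a, b):
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / lengthscale**2)

    k_tt = rbf(z_train, z_train) + noise * np.eye(len(z_train))
    k_qt = rbf(z_query, z_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.einsum("ij,ji->i", k_qt, np.linalg.solve(k_tt, k_qt.T))
    return mean, np.sqrt(np.clip(var, 0.0, None))


def optimistic_safe_step(z_train, f_train, g_train, center, radius,
                         threshold, rng, beta=2.0, n_candidates=512):
    """Pick the next latent point inside a box trust region around `center`.

    A candidate counts as optimistically safe when the upper confidence
    bound of the safety function g reaches `threshold`; among those, the
    candidate with the highest objective UCB is returned.
    """
    cand = center + rng.uniform(-radius, radius, size=(n_candidates, center.size))
    g_mean, g_std = gp_posterior(z_train, g_train, cand)
    safe = g_mean + beta * g_std >= threshold
    if not safe.any():
        return center  # fall back to the current trust-region center
    f_mean, f_std = gp_posterior(z_train, f_train, cand)
    ucb = np.where(safe, f_mean + beta * f_std, -np.inf)
    return cand[np.argmax(ucb)]
```

In this sketch the search runs entirely in the low-dimensional latent space; a selected point `z_next` would be mapped back with `A @ z_next` before being evaluated on the actual system. How the trust region is resized and how the safety threshold is chosen are left out here and would follow the full paper.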
