Safe MPC Alignment with Human Directional Feedback
Zhixian Xie, Wenlong Zhang, Yi Ren, Zhaoran Wang, George J. Pappas, Wanxin Jin
TL;DR
The paper addresses learning safety constraints for model predictive control (MPC) in robotics from online human directional feedback. It introduces Safe MPC Alignment, a certifiable method that cuts the hypothesis space of a learnable constraint after each directional correction, with an upper bound on the number of corrections needed and a certificate of hypothesis misspecification. The approach combines a penalty-based safe MPC, a linear-in-parameters safety function, and centers of maximum volume ellipsoids to make each update efficient. Numerical examples, user studies on two simulation games, and a real-world Franka arm experiment show that the constraint can be learned from only tens of corrections.
Abstract
In safety-critical robot planning or control, manually specifying safety constraints or learning them from demonstrations can be challenging. In this article, we propose a certifiable alignment method for a robot to learn a safety constraint in its model predictive control (MPC) policy from online human directional feedback. To our knowledge, it is the first method to learn safety constraints from human feedback. The proposed method is based on an empirical observation: human directional feedback, when available, tends to guide the robot toward safer regions. The method requires only the direction of the human feedback to update the learning hypothesis space. It is certifiable: it either provides an upper bound on the total number of human corrections needed for successful learning, or declares hypothesis misspecification, i.e., that the true safety constraint cannot be found within the specified hypothesis space. We evaluated the proposed method in numerical examples and in user studies with two simulation games. Additionally, we tested it on a real-world Franka robot arm performing mobile water-pouring tasks. The results demonstrate the efficacy and efficiency of the method, showing that it enables a robot to learn safety constraints from only tens of human directional corrections.
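The hypothesis-space update described above can be illustrated with a small sketch. It is not the authors' implementation: it swaps in the classical central-cut ellipsoid update in place of maximum volume ellipsoid centers, and it assumes that each correction can be reduced to a half-space through the current parameter estimate. The mapping from a human correction to that half-space is not reproduced here, so `simulated_cut_direction` is a hypothetical stand-in. The names `central_cut` and `run_demo`, the hidden parameter `theta_true`, and all numeric values are likewise illustrative assumptions.

```python
import math
import random


def central_cut(center, shape, g):
    """Classical central-cut ellipsoid update (a stand-in, not the paper's method).

    The ellipsoid is {x : (x - c)^T P^-1 (x - c) <= 1} with c = center and
    P = shape. Returns the smallest ellipsoid containing the half of it that
    satisfies g . (x - c) <= 0. Requires dimension n >= 2.
    """
    n = len(center)
    Pg = [sum(shape[i][j] * g[j] for j in range(n)) for i in range(n)]
    norm = math.sqrt(sum(g[i] * Pg[i] for i in range(n)))
    step = [v / norm for v in Pg]
    new_center = [center[i] - step[i] / (n + 1) for i in range(n)]
    scale = n * n / (n * n - 1.0)
    new_shape = [
        [scale * (shape[i][j] - 2.0 / (n + 1) * step[i] * step[j]) for j in range(n)]
        for i in range(n)
    ]
    return new_center, new_shape


def simulated_cut_direction(theta, theta_true, rng):
    """Hypothetical stand-in for one human directional correction.

    Draws a random direction g and flips its sign if needed so that the hidden
    parameter satisfies g . (theta_true - theta) <= 0, i.e. the cut through the
    current estimate never removes theta_true.
    """
    g = [rng.gauss(0.0, 1.0) for _ in theta]
    side = sum(gi * (t - c) for gi, t, c in zip(g, theta_true, theta))
    return g if side <= 0 else [-gi for gi in g]


def run_demo(theta_true=(0.6, -0.3), radius=2.0, corrections=20, seed=0):
    """Apply a sequence of simulated corrections; return (center, shape) history."""
    rng = random.Random(seed)
    n = len(theta_true)
    theta = [0.0] * n  # initial estimate: center of the initial ball
    shape = [[radius ** 2 if i == j else 0.0 for j in range(n)] for i in range(n)]
    history = [(theta, shape)]
    for _ in range(corrections):
        g = simulated_cut_direction(theta, theta_true, rng)
        theta, shape = central_cut(theta, shape, g)
        history.append((theta, shape))
    return history


if __name__ == "__main__":
    final_theta, _ = run_demo()[-1]
    print("estimate after 20 simulated corrections:", final_theta)
```

In this sketch the hidden parameter is never discarded, because every cut keeps the half-space that contains it, and the ellipsoid's volume shrinks with every correction. That is the same mechanism that makes a bound on the number of corrections possible, although the bound proved in the paper applies to its own update rule, not to this stand-in.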
