SM$^2$ITH: Safe Mobile Manipulation with Interactive Human Prediction via Task-Hierarchical Bilevel Model Predictive Control
Francesco D'Orazio, Sepehr Samavi, Xintong Du, Siqi Zhou, Giuseppe Oriolo, Angela P. Schoellig
TL;DR
This work tackles safe, prioritized mobile manipulation in human-centered environments by integrating Hierarchical Task Model Predictive Control (HTMPC) with interactive human motion prediction in a bilevel optimization. The approach, SM$^2$ITH, couples HTMPC with predictions based on Optimal Reciprocal Collision Avoidance (ORCA) and with discrete-time control barrier function (DT-CBF) safety constraints, producing joint robot plans and human trajectories that respect task priorities while remaining collision-free. Experiments on two mobile manipulators across three experimental settings show that interactive predictions improve safety and efficiency, particularly under crowding and adversarial interactions, and outperform weighted-sum and open-loop baselines. The results demonstrate the practical value of tightly coupling human-aware prediction with multitask predictive control for real-time, safe, and efficient robot behavior in shared spaces.
Abstract
Mobile manipulators are designed to perform complex sequences of navigation and manipulation tasks in human-centered environments. While recent optimization-based methods such as Hierarchical Task Model Predictive Control (HTMPC) enable efficient multitask execution with strict task priorities, they have so far been applied mainly to static or structured scenarios. Extending these approaches to dynamic human-centered environments requires predictive models that capture how humans react to the actions of the robot. This work introduces Safe Mobile Manipulation with Interactive Human Prediction via Task-Hierarchical Bilevel Model Predictive Control (SM$^2$ITH), a unified framework that combines HTMPC with interactive human motion prediction through bilevel optimization that jointly accounts for robot and human dynamics. The framework is validated on two different mobile manipulators, the Stretch 3 and the Ridgeback-UR10, across three experimental settings: (i) delivery tasks with different navigation and manipulation priorities, (ii) sequential pick-and-place tasks with different human motion prediction models, and (iii) interactions involving adversarial human behavior. Our results highlight how interactive prediction enables safe and efficient coordination, outperforming baselines that rely on weighted objectives or open-loop human models.
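As a rough illustration of the kind of safety constraint named in the TL;DR, the sketch below checks the standard discrete-time control barrier function (DT-CBF) condition $h(x_{k+1}) \geq (1 - \gamma)\, h(x_k)$ with $0 < \gamma \leq 1$ for one robot-human pair. The squared-distance barrier, the function names, and the values of `d_safe` and `gamma` are assumptions made for this example; they are not taken from the paper, whose exact formulation may differ.

```python
import numpy as np


def barrier(p_robot, p_human, d_safe):
    """Hypothetical barrier h: non-negative when the pair is at least d_safe apart."""
    diff = np.asarray(p_robot, dtype=float) - np.asarray(p_human, dtype=float)
    return float(diff @ diff - d_safe ** 2)


def dtcbf_satisfied(p_robot, p_human, p_robot_next, p_human_next,
                    d_safe=0.5, gamma=0.3):
    """Check h(x_{k+1}) >= (1 - gamma) * h(x_k) for one predicted step.

    gamma in (0, 1] limits how fast the safety margin may shrink;
    d_safe and gamma are illustrative values, not the paper's.
    """
    h_now = barrier(p_robot, p_human, d_safe)
    h_next = barrier(p_robot_next, p_human_next, d_safe)
    return h_next >= (1.0 - gamma) * h_now


# A cautious step keeps most of the margin; an aggressive step does not.
cautious = dtcbf_satisfied([0.0, 0.0], [2.0, 0.0], [0.2, 0.0], [1.9, 0.0])
aggressive = dtcbf_satisfied([0.0, 0.0], [2.0, 0.0], [1.0, 0.0], [1.5, 0.0])
print(cautious, aggressive)
```

In a predictive controller, a condition of this form would be imposed at every step of the horizon on the predicted robot and human states, so that the planned motion can approach a person only gradually rather than merely avoiding contact at the final instant.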
