Hey Robot! Personalizing Robot Navigation through Model Predictive Control with a Large Language Model
Diego Martinez-Baselga, Oscar de Groot, Luzia Knoedler, Javier Alonso-Mora, Luis Riazuelo, Luis Montano
TL;DR
This work tackles the problem of end-user customization of robot navigation in dynamic, human-centered environments. It introduces Hey Robot!, a zero-shot, LLM-enabled architecture that interprets natural language queries (and optionally camera input) to generate and continuously reconfigure the MPC cost function that governs robot motion, while preserving safety via collision constraints. The approach comprises four specialized assistants (Capability, Cost Generation, Camera, Weight Retrieval) that collaborate to produce a suitable $J_{\boldsymbol{q}_j}$ and reparameterize the controller in real time, with CasADi/Acados powering the MPC and topology-inspired global search to avoid local minima. Extensive simulations and real-robot experiments demonstrate that user-specified tasks (e.g., following a path, staying distant from humans) are realized with appropriate trade-offs between speed, smoothness, and safety, indicating practical potential for adaptable, user-centric robotic navigation.
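To make the idea of a reconfigurable cost concrete, the following minimal Python sketch shows a weighted sum of cost terms whose weights can be updated at run time. It is an illustration under stated assumptions, not the authors' implementation: the class `WeightedCost`, the term factories `speed_term` and `path_term`, the state layout `(x, y, v)`, and all numeric values are hypothetical, and the collision constraints and the CasADi/Acados solver mentioned above are deliberately left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Sequence, Tuple

# Hypothetical state layout for illustration only: (x position, y position, speed).
State = Tuple[float, float, float]
CostTerm = Callable[[State], float]


def speed_term(v_ref: float) -> CostTerm:
    """Quadratic penalty on deviation from a reference speed (assumed form)."""
    def term(state: State) -> float:
        return (state[2] - v_ref) ** 2
    return term


def path_term(y_ref: float) -> CostTerm:
    """Quadratic penalty on lateral offset from a straight reference path (assumed form)."""
    def term(state: State) -> float:
        return (state[1] - y_ref) ** 2
    return term


@dataclass
class WeightedCost:
    """Weighted sum of named cost terms; weights can be changed while running."""
    terms: Dict[str, CostTerm]
    weights: Dict[str, float]

    def reconfigure(self, new_weights: Dict[str, float]) -> None:
        # Reject weights that do not correspond to any known cost term.
        unknown = set(new_weights) - set(self.terms)
        if unknown:
            raise KeyError(f"unknown cost terms: {sorted(unknown)}")
        self.weights.update(new_weights)

    def stage_cost(self, state: State) -> float:
        # Terms without an explicit weight contribute nothing.
        return sum(self.weights.get(name, 0.0) * term(state)
                   for name, term in self.terms.items())

    def trajectory_cost(self, trajectory: Sequence[State]) -> float:
        # Sum of stage costs over a candidate trajectory.
        return sum(self.stage_cost(s) for s in trajectory)


if __name__ == "__main__":
    cost = WeightedCost(
        terms={"speed": speed_term(1.0), "path": path_term(0.0)},
        weights={"speed": 1.0, "path": 1.0},
    )
    state = (0.0, 2.0, 3.0)
    print(cost.stage_cost(state))    # both terms weighted equally
    cost.reconfigure({"path": 0.5})  # e.g. a user asks to care less about the path
    print(cost.stage_cost(state))
```

In this sketch a user instruction would map to a call to `reconfigure`; in the architecture described above, that mapping is the job of the LLM assistants, and safety is enforced separately through constraints rather than through the weights.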
Abstract
Robot navigation methods allow mobile robots to operate in applications such as warehouses or hospitals. While the environment in which the robot operates imposes requirements on its navigation behavior, most existing methods do not allow the end-user to configure the robot's behavior and priorities, possibly leading to undesirable behavior (e.g., fast driving in a hospital). We propose a novel approach to adapt robot motion behavior based on natural language instructions provided by the end-user. Our zero-shot method uses an existing Visual Language Model to interpret a user text query or an image of the environment. This information is used to generate the cost function and reconfigure the parameters of a Model Predictive Controller, translating the user's instruction into the robot's motion behavior. This allows the robot to navigate safely and effectively in dynamic and challenging environments. We extensively evaluate our method's individual components and demonstrate its effectiveness on a ground robot, in simulation and real-world experiments, across a variety of environments and user specifications.
