NARRATE: Versatile Language Architecture for Optimal Control in Robotics
Seif Ismail, Antonio Arbues, Ryan Cotterell, René Zurbrügg, Carmen Amo Alonso
TL;DR
NARRATE presents a modular framework that uses a pre-trained LLM to translate natural language goals into hard-constraint MPC formulations, enabling safe and flexible robotic control. By dividing the Language Module into a Task Planner and an Optimization Designer and pairing it with an MPC-based Trajectory Generator and impedance-based Trajectory Tracker, the method handles long-horizon, contact-rich tasks and transfers from simulation to real robots. Empirical results show superior performance and efficiency compared to state-of-the-art language-to-action baselines, with notable gains when constraints and human feedback are incorporated. The work demonstrates practical viability for natural-language interfaces in manipulation, while highlighting avenues for safety guarantees and improved perception-driven feedback.
Abstract
The impressive capabilities of Large Language Models (LLMs) have led to various efforts to enable robots to be controlled through natural language instructions, opening exciting possibilities for human-robot interaction. The goal is for the motor-control task to be performed accurately, efficiently, and safely while also enjoying the flexibility imparted by LLMs to specify and adjust the task through natural language. In this work, we demonstrate how a careful layering of an LLM in combination with a Model Predictive Control (MPC) formulation allows for accurate and flexible robotic control via natural language while accounting for safety constraints. In particular, we rely on the LLM to frame constraints and objective functions as mathematical expressions, which are then used in the motor-control module via MPC. The transparency of the optimization formulation makes the task interpretable and enables adjustments through human feedback. We demonstrate the validity of our method through extensive experiments on long-horizon reasoning, contact-rich, and multi-object interaction tasks. Our evaluations show that NARRATE outperforms existing methods on these benchmarks and transfers effectively to the real world on two different embodiments. Videos, code, and prompts are available at narrate-mpc.github.io
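To make the idea of "constraints and objective functions as mathematical expressions" concrete, here is a minimal sketch, not the authors' implementation. It assumes the language module emits an objective and inequality constraints g(x) <= 0 as expression strings over a 2D end-effector position `x` and named scene objects. The names `DESIGNER_OUTPUT`, `SCENE`, `compile_expr`, and `mpc_step` are hypothetical, the single-integrator point model is an assumption, and a brute-force search over short move sequences stands in for a real MPC solver.

```python
import itertools
import math

# Hypothetical Optimization Designer output (not the paper's prompt format):
# an objective to minimise and constraints g(x) <= 0, written over the
# end-effector position x and named scene objects.
DESIGNER_OUTPUT = {
    "objective": "(x[0] - cube[0])**2 + (x[1] - cube[1])**2",
    "constraints": ["0.22 - math.hypot(x[0] - obstacle[0], x[1] - obstacle[1])"],
}
SCENE = {"cube": (1.0, 0.0), "obstacle": (0.5, 0.1)}  # assumed object positions


def compile_expr(expr, scene):
    """Turn an expression string into a function of the position x."""
    code = compile(expr, "<designer>", "eval")
    env = {"__builtins__": {}, "math": math, **scene}
    # eval keeps the sketch short; real LLM output would need a safe parser.
    return lambda x: eval(code, env, {"x": x})


def mpc_step(x0, objective, constraints, horizon=3, step=0.05):
    """Return the next state: first move of the cheapest feasible sequence."""
    moves = [(dx * step, dy * step) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    best_cost, best_move = math.inf, (0.0, 0.0)
    for seq in itertools.product(moves, repeat=horizon):
        x, cost, feasible = x0, 0.0, True
        for dx, dy in seq:
            x = (x[0] + dx, x[1] + dy)  # single-integrator model (assumption)
            if any(g(x) > 0 for g in constraints):
                feasible = False  # hard constraint: reject the whole sequence
                break
            cost += objective(x)
        if feasible and cost < best_cost:
            best_cost, best_move = cost, seq[0]
    return (x0[0] + best_move[0], x0[1] + best_move[1])


objective = compile_expr(DESIGNER_OUTPUT["objective"], SCENE)
constraints = [compile_expr(c, SCENE) for c in DESIGNER_OUTPUT["constraints"]]

# Receding-horizon loop: re-plan at every step, apply only the first move.
trajectory = [(0.0, 0.0)]
for _ in range(40):
    trajectory.append(mpc_step(trajectory[-1], objective, constraints))
```

Because the objective and constraints stay as readable expressions, a user could in principle adjust the task by editing a string (for example, enlarging the `0.22` keep-out radius) without touching the controller, which is the kind of interpretability and feedback loop the abstract describes.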
