RoboMorph: In-Context Meta-Learning for Robot Dynamics Modeling
Manuel Bianchi Bazzi, Asad Ali Shahid, Christopher Agia, John Alora, Marco Forgione, Dario Piga, Francesco Braghin, Marco Pavone, Loris Roveda
TL;DR
RoboMorph presents an encoder–decoder Transformer that learns a meta-dynamical model of a high-dimensional robot arm without explicit physical parameters. Trained on massively parallel Isaac Gym simulations with domain randomization, it uses in-context learning to predict end-effector pose and joint angles from torques over long horizons. The work shows zero-shot and few-shot dynamics adaptation for control actions within the training distribution and finds that fine-tuning improves performance on out-of-distribution tasks, signaling potential for integration with Deep Model Predictive Control. While promising, generalization to unfamiliar control actions remains challenging, motivating future transfer learning and pre-compensation approaches for varied morphologies.
Abstract
The landscape of Deep Learning has experienced a major shift with the pervasive adoption of Transformer-based architectures, particularly in Natural Language Processing (NLP). Novel avenues beyond NLP, such as solving Partial Differential Equations and computer vision tasks, have also been explored. However, in challenging domains like robotics, where high non-linearity poses significant challenges, Transformer-based applications remain scarce. While Transformers have been used to provide robots with knowledge about high-level tasks, few efforts have been made to perform system identification. This paper proposes a novel methodology to learn a meta-dynamical model of a high-dimensional physical system, such as the Franka robotic arm, using a Transformer-based architecture without prior knowledge of the system's physical parameters. The objective is to predict quantities of interest (end-effector pose and joint positions) given the torque signals for each joint. This prediction can serve as a component of Deep Model Predictive Control frameworks in robotics. The meta-model establishes the correlation between torques and positions and predicts the output for the complete trajectory. This work provides empirical evidence of the efficacy of the in-context learning paradigm, suggesting future improvements in learning the dynamics of robotic systems without explicit knowledge of physical parameters. Code, videos, and supplementary materials are available on the project website: https://sites.google.com/view/robomorph/
