RoboMorph: In-Context Meta-Learning for Robot Dynamics Modeling

Manuel Bianchi Bazzi, Asad Ali Shahid, Christopher Agia, John Alora, Marco Forgione, Dario Piga, Francesco Braghin, Marco Pavone, Loris Roveda

TL;DR

RoboMorph presents an encoder–decoder Transformer that learns a meta-dynamical model for a high-dimensional robot arm without explicit physical parameters. Trained on massively parallel Isaac Gym simulations with domain randomization, it demonstrates in-context learning capabilities to predict end-effector pose and joint angles from torques over long horizons. The work shows zero-shot and few-shot dynamics adaptation within control-action distributions and finds that fine-tuning improves performance on out-of-distribution tasks, signaling potential for integration with Deep Model Predictive Control. While promising, generalization across unfamiliar control actions remains challenging, motivating future transfer learning and pre-compensation approaches for varied morphologies.

Abstract

The landscape of Deep Learning has experienced a major shift with the pervasive adoption of Transformer-based architectures, particularly in Natural Language Processing (NLP). Novel avenues for physical applications, such as solving Partial Differential Equations and Image Vision, have been explored. However, in challenging domains like robotics, where high non-linearity poses significant challenges, Transformer-based applications are scarce. While Transformers have been used to provide robots with knowledge about high-level tasks, few efforts have been made to perform system identification. This paper proposes a novel methodology to learn a meta-dynamical model of a high-dimensional physical system, such as the Franka robotic arm, using a Transformer-based architecture without prior knowledge of the system's physical parameters. The objective is to predict quantities of interest (end-effector pose and joint positions) given the torque signals for each joint. This prediction can be useful as a component for Deep Model Predictive Control frameworks in robotics. The meta-model establishes the correlation between torques and positions and predicts the output for the complete trajectory. This work provides empirical evidence of the efficacy of the in-context learning paradigm, suggesting future improvements in learning the dynamics of robotic systems without explicit knowledge of physical parameters. Code, videos, and supplementary materials can be found at the project website: https://sites.google.com/view/robomorph/
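
To make the in-context setup described in the abstract concrete, the following is a minimal sketch of how one trajectory might be split into a context segment (paired torques and measured outputs) and a query segment (torques only, with the outputs as prediction targets). It is not the authors' data pipeline. The function name `split_context_query`, the 14-dimensional output (7 joint positions plus a 7-value pose), the trajectory length, and the context length are all assumptions chosen for illustration; only the 7 joints of the Franka arm come from the robot itself.

```python
import random


def split_context_query(torques, outputs, context_len):
    """Split one trajectory into an in-context part and a query part.

    torques: list of T rows, each a list of joint torques.
    outputs: list of T rows, each a list of measured outputs.
    Returns (context, query_torques, targets), where each context row is
    the torque row concatenated with the matching output row.
    """
    if len(torques) != len(outputs):
        raise ValueError("torques and outputs must have the same length")
    if not 0 < context_len < len(torques):
        raise ValueError("context_len must lie strictly between 0 and T")

    # Context: the model sees both the inputs and the resulting outputs.
    context = [u + y for u, y in zip(torques[:context_len], outputs[:context_len])]
    # Query: the model sees only the torques and must predict the outputs.
    query_torques = torques[context_len:]
    targets = outputs[context_len:]
    return context, query_torques, targets


# Hypothetical sizes: 1000 time steps, 7 joints, 14 output channels.
# Random numbers stand in for simulated data.
rng = random.Random(0)
T, N_JOINTS, N_OUT = 1000, 7, 14
torques = [[rng.gauss(0.0, 1.0) for _ in range(N_JOINTS)] for _ in range(T)]
outputs = [[rng.gauss(0.0, 1.0) for _ in range(N_OUT)] for _ in range(T)]

context, query_torques, targets = split_context_query(torques, outputs, context_len=200)
print(len(context), len(context[0]), len(query_torques), len(targets))
```

Plain lists keep the sketch dependency-free; in practice the same split would normally be done on arrays or tensors, with the context fed to the encoder and the query torques to the decoder.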

Paper Structure

This paper contains 20 sections, 2 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: Encoder-decoder architecture (forgione2023from).
  • Figure 2: The proposed meta-model uses context and input to perform the prediction. The context length is highlighted in green.
  • Figure 3: 3D trajectories in Cartesian coordinates [m].
  • Figure 4: Multi-sinusoidal torque profiles for joint 0 and joint 1, each shown for 20 robots.
  • Figure 5: Collision detection visualization.
  • ...and 6 more figures