A Generalist Dynamics Model for Control
Ingmar Schubert, Jingwei Zhang, Jake Bruce, Sarah Bechtle, Emilio Parisotto, Martin Riedmiller, Jost Tobias Springenberg, Arunkumar Byravan, Leonard Hasenclever, Nicolas Heess
TL;DR
This work investigates transformer sequence models as dynamics models (TDMs) for control. TDMs generalize well across environments in both few-shot and zero-shot settings, and they also give accurate single-environment predictions for model predictive control (MPC). The authors tokenize trajectories and use TDMs inside MPC with random shooting (and proposal) planners, and show that TDMs can outperform baselines, and even specialist models, when data from the target environment is scarce. The key contributions are establishing few-shot and zero-shot cross-environment generalization, comparing generalist pre-training strategies, and showing that planning with a generalized dynamics model can work better than generalizing a policy directly. The findings suggest that TDMs are a promising ingredient for a foundation model of control, able to draw on broad prior experience to accelerate learning and adaptation across diverse tasks and morphologies.
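To make the planning idea concrete, the following is a minimal sketch of random-shooting MPC around a generic one-step dynamics model. It is not the authors' implementation. The function names, the `model(state, action) -> (next_state, reward)` interface, the uniform action sampling in [-1, 1], and the toy point-mass model standing in for a learned TDM are all assumptions made for illustration.

```python
import numpy as np


def random_shooting_mpc(model, state, action_dim, horizon=10, num_samples=256, rng=None):
    """Return the first action of the best sampled action sequence, and its predicted return."""
    rng = np.random.default_rng() if rng is None else rng
    # Sample candidate action sequences uniformly in [-1, 1] (an assumed action range).
    candidates = rng.uniform(-1.0, 1.0, size=(num_samples, horizon, action_dim))
    returns = np.zeros(num_samples)
    for i in range(num_samples):
        s = np.array(state, dtype=float)
        # Roll the candidate sequence out through the model and sum predicted rewards.
        for t in range(horizon):
            s, r = model(s, candidates[i, t])
            returns[i] += r
    best = int(np.argmax(returns))
    return candidates[best, 0], returns[best]


def toy_model(state, action):
    """Hypothetical stand-in for a learned dynamics model: a point pushed by the action."""
    next_state = state + 0.1 * action
    reward = -float(np.sum(next_state ** 2))  # reward is highest at the origin
    return next_state, reward


# Receding-horizon control loop: replan at every step, execute only the first action.
rng = np.random.default_rng(0)
state = np.array([1.0])
for _ in range(50):
    action, predicted_return = random_shooting_mpc(toy_model, state, action_dim=1, rng=rng)
    state, _ = toy_model(state, action)
```

In this sketch the same toy function serves as both the planner's model and the environment. With a learned model, the planner would query the model's predictions while the real environment supplies the next state after each executed action.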
Abstract
We investigate the use of transformer sequence models as dynamics models (TDMs) for control. We find that TDMs exhibit strong generalization capabilities to unseen environments, both in a few-shot setting, where a generalist TDM is fine-tuned with small amounts of data from the target environment, and in a zero-shot setting, where a generalist TDM is applied to an unseen environment without any further training. Here, we demonstrate that generalizing system dynamics can work much better than generalizing optimal behavior directly as a policy. Additional results show that TDMs also perform well in a single-environment learning setting when compared to a number of baseline models. These properties make TDMs a promising ingredient for a foundation model of control.
