Transfer Learning Study of Motion Transformer-based Trajectory Predictions
Lars Ullrich, Alex McMaster, Knut Graichen
TL;DR
The paper investigates transfer learning for Motion Transformer (MTR) based trajectory prediction by transferring knowledge from the Waymo Open Motion Dataset (WOMD) to a CarMaker-generated target dataset (CMD). It compares three transfer paradigms—Multi-task Learning, Feature Reuse, and Fine-Tuning (including encoder-only and decoder-only variants)—and evaluates both predictive performance and training-time implications. Results show that fine-tuning, particularly encoder fine-tuning (FTE), yields the best target-domain performance with substantial reductions in training time, while multi-task learning provides limited gains and can suffer from forgetting. The study highlights encoder-focused adaptation as a practical path for real-world deployment and emphasizes the need for larger, diverse datasets to further validate transferability across environments and traffic regulations.
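The fine-tuning variants named above differ mainly in which part of the pre-trained model is updated on the target data. The following is a minimal sketch of that selection, not the authors' implementation: the "encoder."/"decoder." name prefixes, the labels FT and FTD, and the helper `trainable_parameters` are illustrative assumptions.

```python
# Hypothetical sketch: which parameter groups each fine-tuning variant updates.
# The "encoder."/"decoder." prefixes and the labels FT and FTD are assumptions
# made for illustration; they are not taken from the MTR code base.

FINE_TUNING_PLANS = {
    "FT": {"encoder": True, "decoder": True},    # update the whole model
    "FTE": {"encoder": True, "decoder": False},  # update the encoder only
    "FTD": {"encoder": False, "decoder": True},  # update the decoder only
}


def trainable_parameters(param_names, variant):
    """Return the parameter names that the given variant would update."""
    if variant not in FINE_TUNING_PLANS:
        raise ValueError(f"unknown fine-tuning variant: {variant!r}")
    plan = FINE_TUNING_PLANS[variant]
    selected = []
    for name in param_names:
        group = name.split(".", 1)[0]  # "encoder.layer0.weight" -> "encoder"
        if plan.get(group, False):     # groups not in the plan stay frozen
            selected.append(name)
    return selected


names = ["encoder.layer0.weight", "encoder.layer0.bias", "decoder.head.weight"]
for variant in FINE_TUNING_PLANS:
    print(variant, trainable_parameters(names, variant))
```

In a deep learning framework, the parameters not returned here would typically have their gradient updates disabled before training on the target dataset, which is one plausible source of the training-time savings the study reports.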
Abstract
Trajectory planning in autonomous driving depends heavily on predicting the emergent behavior of other road users. Learning-based methods currently show impressive results in simulation-based challenges, with transformer-based architectures leading the way. Ultimately, however, predictions are needed in the real world. Beyond the shift from simulation to the real world, many vehicle- and country-specific shifts must be handled, i.e., differences in sensor systems, fusion and perception algorithms, as well as traffic rules and laws. Since models that cover all system setups and design domains at once are not yet foreseeable, model adaptation plays a central role. Therefore, a simulation-based study of transfer learning techniques is conducted on the basis of a transformer-based model. Furthermore, the study aims to provide insights into possible trade-offs between computational time and performance to support effective transfers into the real world.
