Annealed Winner-Takes-All for Motion Forecasting
Yihong Xu, Victor Letzelter, Mickaël Chen, Éloi Zablocki, Matthieu Cord
TL;DR
This work tackles the instability and mode-collapse issues of Winner-Takes-All (WTA) training in multi-hypothesis motion forecasting. The authors replace the hard winner selection of WTA with an annealed Winner-Takes-All (aWTA) loss, which weights hypotheses through a temperature-controlled softmin. This yields diverse future predictions from a small, fixed set of hypotheses and removes the need for post-selection. Across two large real-world datasets and two modern trajectory predictors, aWTA yields consistent improvements in key metrics and exhibits phase-transition dynamics as training progresses. The approach is straightforward to plug into existing transformer-based forecasting models and reduces training/inference complexity, with code released for community use.
Abstract
In autonomous driving, motion prediction aims to forecast the future trajectories of nearby agents, helping the ego vehicle to anticipate behaviors and drive safely. A key challenge is generating a diverse set of future predictions, commonly addressed using data-driven models with Multiple Choice Learning (MCL) architectures and Winner-Takes-All (WTA) training objectives. However, these methods suffer from initialization sensitivity and training instabilities. Additionally, to compensate for limited performance, some approaches rely on training with a large set of hypotheses, requiring a post-selection step during inference to significantly reduce the number of predictions. To tackle these issues, we take inspiration from annealed MCL, a recently introduced technique that improves the convergence properties of MCL methods through an annealed Winner-Takes-All loss (aWTA). In this paper, we demonstrate how the aWTA loss can be integrated with state-of-the-art motion forecasting models to enhance their performance using only a minimal set of hypotheses, eliminating the need for the cumbersome post-selection step. Our approach can be easily incorporated into any trajectory prediction model normally trained using WTA and yields significant improvements. To facilitate the application of our approach to future motion forecasting models, the code is made publicly available: https://github.com/valeoai/MF_aWTA.
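
As a rough illustration of the contrast between WTA and aWTA, the sketch below computes both losses from a list of per-hypothesis errors. It is not taken from the released code: the function names, the exponential temperature schedule, and its default values (t0, decay) are assumptions made for illustration, and the softmin form of the weights follows the general description of annealed MCL rather than the exact implementation in the repository.

```python
import math

def wta_loss(losses):
    """Standard WTA: only the best (lowest-error) hypothesis is penalized."""
    return min(losses)

def softmin_weights(losses, temperature):
    """Softmin over per-hypothesis errors; lower error gets a larger weight.

    Subtracting the minimum keeps every exponent <= 0, which avoids overflow.
    """
    best = min(losses)
    exps = [math.exp(-(l - best) / temperature) for l in losses]
    total = sum(exps)
    return [e / total for e in exps]

def awta_loss(losses, temperature):
    """Annealed WTA (sketch): softmin-weighted sum of per-hypothesis errors.

    In an autodiff framework the weights would typically be treated as
    constants (no gradient flows through them); that is an assumption here.
    """
    weights = softmin_weights(losses, temperature)
    return sum(w * l for w, l in zip(weights, losses))

def exponential_temperature(step, t0=10.0, decay=0.9):
    """Hypothetical schedule: temperature shrinks geometrically with the step."""
    return t0 * decay ** step

if __name__ == "__main__":
    # Hypothetical errors of three predicted trajectories against the ground truth.
    errors = [4.0, 1.0, 9.0]
    for step in (0, 20, 100):
        t = exponential_temperature(step)
        print(step, round(t, 4), round(awta_loss(errors, t), 4))
    print("WTA:", wta_loss(errors))
```

At high temperature the weights are close to uniform, so every hypothesis receives a training signal; as the temperature approaches zero the weights concentrate on the best hypothesis and the loss approaches the WTA loss.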
