LMFormer: Lane based Motion Prediction Transformer
Harsh Yadav, Maximilian Schaefer, Kun Zhao, Tobias Meisen
TL;DR
LMFormer tackles lane-aware trajectory prediction by embedding lane connectivity into a transformer through lane-aware cross-attention and a GNN-like map encoder. It introduces iterative refinement across stacked transformer layers and learnable mode queries to produce diverse, lane-consistent trajectories while maintaining computational efficiency. Evaluations on nuScenes show state-of-the-art results, and experiments with Deep Scenario show cross-dataset generalization and further gains when the model is trained on combined data, highlighting the value of explicit lane topology and scalable training. The work emphasizes explainability via attention maps and identifies velocity profiling and maneuver diversity as areas for future improvement.
Abstract
Motion prediction plays an important role in autonomous driving. This study presents LMFormer, a lane-aware transformer network for trajectory prediction. In contrast to previous studies, our work provides a simple mechanism to dynamically prioritize lanes and shows that this mechanism introduces explainability into the learning behavior of the network. Additionally, LMFormer uses lane connection information at intersections, lane merges, and lane splits to learn long-range dependencies in the lane structure. We also address the refinement of predicted trajectories and propose an efficient method for iterative refinement through stacked transformer layers. For benchmarking, we evaluate LMFormer on the nuScenes dataset and demonstrate that it achieves state-of-the-art (SOTA) performance across multiple metrics. Furthermore, the Deep Scenario dataset is used to illustrate not only cross-dataset network performance but also the ability of LMFormer to train on multiple datasets in a unified way and achieve better performance.
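To make the two central ideas concrete, the following is a minimal NumPy sketch of lane cross-attention combined with layer-wise trajectory refinement. It is an illustration under stated assumptions, not the LMFormer architecture: the function names (`cross_attention`, `predict_with_refinement`, `make_layers`), the single-head attention, the random weights, and the choice to have each layer add an offset to the previous trajectory estimate are all hypothetical. Lane connectivity and the map encoder are omitted entirely.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(queries, lane_tokens, w_q, w_k, w_v):
    # Single-head scaled dot-product attention from mode queries to lane tokens.
    q = queries @ w_q
    k = lane_tokens @ w_k
    v = lane_tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # shape: (num_modes, num_lanes)
    return weights @ v, weights


def make_layers(num_layers, dim, horizon, seed=0):
    # Hypothetical random weights standing in for trained parameters.
    rng = np.random.default_rng(seed)
    return [
        {
            "w_q": rng.normal(scale=0.1, size=(dim, dim)),
            "w_k": rng.normal(scale=0.1, size=(dim, dim)),
            "w_v": rng.normal(scale=0.1, size=(dim, dim)),
            "w_out": rng.normal(scale=0.1, size=(dim, horizon * 2)),
        }
        for _ in range(num_layers)
    ]


def predict_with_refinement(mode_queries, lane_tokens, layers, horizon):
    # Assumption: every stacked layer updates the queries and adds an (x, y)
    # offset to the running trajectory estimate for each mode.
    queries = mode_queries
    num_modes = queries.shape[0]
    trajectories = np.zeros((num_modes, horizon, 2))
    attention_maps = []
    for layer in layers:
        context, weights = cross_attention(
            queries, lane_tokens, layer["w_q"], layer["w_k"], layer["w_v"]
        )
        queries = queries + context  # residual update of the mode queries
        offsets = (queries @ layer["w_out"]).reshape(num_modes, horizon, 2)
        trajectories = trajectories + offsets
        attention_maps.append(weights)
    return trajectories, attention_maps


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    dim, horizon = 16, 6
    mode_queries = rng.normal(size=(3, dim))  # 3 hypothetical prediction modes
    lane_tokens = rng.normal(size=(5, dim))   # 5 hypothetical lane segments
    layers = make_layers(num_layers=2, dim=dim, horizon=horizon)
    trajectories, attention_maps = predict_with_refinement(
        mode_queries, lane_tokens, layers, horizon
    )
    print(trajectories.shape)       # one trajectory per mode
    print(attention_maps[-1].shape)  # per-mode weights over lanes
```

In this sketch, each row of an attention map is a probability distribution over lane segments for one mode, which is the kind of quantity that could be inspected to see which lanes a prediction prioritizes.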
