Learning Lane Graph Representations for Motion Forecasting
Ming Liang, Bin Yang, Rui Hu, Yun Chen, Renjie Liao, Song Feng, Raquel Urtasun
TL;DR
This work addresses motion forecasting for autonomous driving by replacing rasterized map inputs with a structured lane graph built from vectorized HD-map data. It introduces LaneGCN, a graph-convolutional architecture that uses multiple adjacency types and along-lane dilation to capture lane topology, and couples it with ActorNet and FusionNet to model actor-map interactions. The resulting model keeps the map topology explicit and fuses information through four interaction types: actor-to-lane (A2L), lane-to-lane (L2L), lane-to-actor (L2A), and actor-to-actor (A2A). It significantly outperforms the prior state of the art on the Argoverse motion forecasting benchmark, showing that lane graphs combined with fusion-based interactions yield accurate multi-modal trajectory predictions in real-world driving scenarios.
Abstract
We propose a motion forecasting model that exploits a novel structured map representation as well as actor-map interactions. Instead of encoding vectorized maps as raster images, we construct a lane graph from raw map data to explicitly preserve the map structure. To capture the complex topology and long-range dependencies of the lane graph, we propose LaneGCN, which extends graph convolutions with multiple adjacency matrices and along-lane dilation. To capture the complex interactions between actors and maps, we exploit a fusion network consisting of four types of interactions: actor-to-lane, lane-to-lane, lane-to-actor, and actor-to-actor. Powered by LaneGCN and actor-map interactions, our model is able to predict accurate and realistic multi-modal trajectories. Our approach significantly outperforms the state of the art on the large-scale Argoverse motion forecasting benchmark.
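To make "multiple adjacency matrices and along-lane dilation" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a layer of the form Y = X W_self + sum_i A_i X W_i, with one weight matrix per adjacency type, and assumes that a k-dilated adjacency is the boolean k-th power of the predecessor or successor adjacency. The function names, the dictionary keys ("pre", "suc", "pre2", "suc2"), and the toy four-node lane are all hypothetical; left and right neighbor adjacencies would be added to the dictionary in the same way.

```python
import numpy as np


def dilated_adjacency(adj, k):
    """Boolean k-th power of adj: connects each node to the node k hops along the lane."""
    out = np.eye(adj.shape[0], dtype=np.int64)
    for _ in range(k):
        out = (out @ adj > 0).astype(np.int64)
    return out


def lane_conv(x, adjs, w_self, w_rel):
    """Assumed layer form: X W_self + sum over adjacency types of A_i X W_i."""
    out = x @ w_self
    for name, a in adjs.items():
        out = out + a @ x @ w_rel[name]
    return out


# Toy lane graph (hypothetical): 4 nodes chained 0 -> 1 -> 2 -> 3 along one lane.
n, d = 4, 2
suc = np.zeros((n, n), dtype=np.int64)
for i in range(n - 1):
    suc[i, i + 1] = 1  # row i aggregates features from its successor i + 1
pre = suc.T  # row i aggregates features from its predecessor i - 1

# One adjacency matrix per connection type, plus 2-hop dilated versions.
adjs = {
    "pre": pre,
    "suc": suc,
    "pre2": dilated_adjacency(pre, 2),
    "suc2": dilated_adjacency(suc, 2),
}

rng = np.random.default_rng(0)
x = rng.normal(size=(n, d))  # per-node lane features
w_self = rng.normal(size=(d, d))
w_rel = {name: rng.normal(size=(d, d)) for name in adjs}

y = lane_conv(x, adjs, w_self, w_rel)  # updated node features, shape (n, d)
```

Under these assumptions, dilation lets a single layer pass information between nodes that are several hops apart along a lane, which is one way to read the long-range dependency claim in the abstract.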
