FutureNet-LOF: Joint Trajectory Prediction and Lane Occupancy Field Prediction with Future Context Encoding

Mingkun Wang, Xiaoguang Ren, Ruochun Jin, Minglong Li, Xiaochuan Zhang, Changqian Yu, Mingxu Wang, Wenjing Yang

TL;DR

This paper tackles the challenge of predicting future driving behavior by explicitly encoding future scenarios within a unified framework. It introduces FutureNet, which injects initially predicted trajectories into the future context, and Lane Occupancy Field (LOF), a lane-semantic occupancy representation that captures the joint spatial-temporal occupancy of all agents. The authors propose FutureNet-LOF, a joint network that combines trajectory prediction with LOF prediction. Its future-context encoding is implemented through multi-parallel local worlds and recurrent/refinement decoding. The network achieves state-of-the-art results on Argoverse 1 and Argoverse 2 (including the multi-world setting on Argoverse 2). Ablations show that encoding future context, map-aware future interactions, and LOF supervision together improve long-horizon accuracy and joint scene understanding, which matters for safe, coordinated autonomous driving.

Abstract

Most prior motion prediction endeavors in autonomous driving have inadequately encoded future scenarios, leading to predictions that may fail to accurately capture the diverse movements of agents (e.g., vehicles or pedestrians). To address this, we propose FutureNet, which explicitly integrates initially predicted trajectories into the future scenario and further encodes these future contexts to enhance subsequent forecasting. Additionally, most previous motion forecasting works have focused on predicting independent futures for each agent. However, safe and smooth autonomous driving requires accurately predicting the diverse future behaviors of numerous surrounding agents jointly in complex dynamic environments. Given that all agents occupy certain potential travel spaces and possess lane driving priority, we propose Lane Occupancy Field (LOF), a new representation with lane semantics for motion forecasting in autonomous driving. LOF can simultaneously capture the joint probability distribution of all road participants' future spatial-temporal positions. Due to the high compatibility between lane occupancy field prediction and trajectory prediction, we propose a novel network with future context encoding for the joint prediction of these two tasks. Our approach ranks 1st on two large-scale motion forecasting benchmarks: Argoverse 1 and Argoverse 2.

Paper Structure (31 sections, 35 equations, 46 figures, 10 tables)

Figures (46)

  • Figure 1: The architecture of our proposed FutureNet-LOF network.
  • Figure 2: Trajectory prediction qualitative results on the Argoverse 2 validation set. The self-driving car is depicted by a green bounding box, while the focal agent's box and ground-truth trajectories are displayed in orange. Predicted trajectories are shown in green.
  • Figure 3: Lane occupancy field prediction qualitative results on the Argoverse 2 validation set. The ground-truth lane occupancy fields are displayed in red. Our predicted fields are shown in blue. The green stars represent the lane occupancy field rendered from our predicted trajectories, using a distance threshold of 2 meters.
  • Figure 4: Scene context.
  • Figure 5: (1) Map point elements.
  • ...and 41 more figures
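
The Figure 3 caption mentions rendering a lane occupancy field from predicted trajectories with a distance threshold of 2 meters. The sketch below shows one way such a rendering could work; it is not the authors' implementation. It assumes that lanes are given as sampled 2D points, that all trajectories share the same timesteps, and that occupancy is binary (a lane point is occupied at a timestep if any agent lies within the threshold). The function name `render_lane_occupancy` and its parameters are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def render_lane_occupancy(
    lane_points: List[Point],
    trajectories: List[List[Point]],
    threshold_m: float = 2.0,
) -> List[List[int]]:
    """Hypothetical helper: occupancy[t][k] is 1 if any agent is within
    threshold_m of lane point k at timestep t, else 0.

    Assumption: every trajectory has the same number of timesteps.
    """
    if not trajectories:
        return []
    num_steps = len(trajectories[0])
    occupancy: List[List[int]] = []
    for t in range(num_steps):
        row: List[int] = []
        for lx, ly in lane_points:
            # A lane point is occupied if at least one agent is close enough.
            hit = any(
                math.hypot(traj[t][0] - lx, traj[t][1] - ly) <= threshold_m
                for traj in trajectories
            )
            row.append(1 if hit else 0)
        occupancy.append(row)
    return occupancy


# Toy example: three lane points along the x-axis, two agents, three timesteps.
lane_points = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
trajectories = [
    [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)],    # agent moving along the lane
    [(10.0, 0.5), (10.0, 0.5), (20.0, 0.0)],  # agent waiting, then leaving
]
occ = render_lane_occupancy(lane_points, trajectories)
print(occ)  # [[1, 0, 1], [0, 1, 1], [0, 0, 1]]
```

Because occupancy is taken over all agents at once, a single field describes the whole scene at each timestep. This matches the abstract's description of LOF as a joint representation, though the paper's field is a probability distribution rather than the hard 0/1 mask used here.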