Dynamic Intent Queries for Motion Transformer-based Trajectory Prediction

Tobias Demmler, Lennart Hartung, Andreas Tamke, Thao Dang, Alexander Hegai, Karsten Haug, Lars Mikelsons

TL;DR

Dynamic Intent Queries replace MTR's scene-agnostic static endpoints with scene-aware dynamic intention points derived from lane association and a road graph, so that predicted goals align with map constraints. The authors implement lane association, road-graph generation, and K-Means sampling of 64 intention points, plus a hybrid dynamic-static approach. Evaluated on the Waymo Open Motion Dataset, the method shows notable gains in long-horizon accuracy and map-compliant behavior. The authors also analyze cross-class effects, showing improvements for pedestrians and cyclists when vehicle intents are made dynamic, and they acknowledge limitations for illegal maneuvers that the road graph does not capture. The work suggests future heterogeneous architectures to mitigate cross-effects across object classes and improve overall robustness in real-world autonomous driving stacks.
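
The K-Means sampling step mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy road graph (a straight lane with a side branch, in agent-centric meters), the helper names `toy_road_graph` and `sample_intention_points`, the iteration count, and the seed are all assumptions made for illustration. Only the idea of clustering road-graph points into 64 intention points comes from the summary.

```python
import random


def toy_road_graph(step=0.5):
    """Hypothetical road graph: 80 m straight lane plus a 40 m side branch.

    Points are (x, y) in meters, relative to the agent at the origin.
    """
    straight = [(i * step, 0.0) for i in range(int(80 / step) + 1)]
    branch = [(40.0, -i * step) for i in range(1, int(40 / step) + 1)]
    return straight + branch


def sample_intention_points(points, k=64, iters=20, seed=0):
    """Plain Lloyd's K-Means on 2D points; returns k cluster centers.

    Assumes len(points) >= k. Empty clusters keep their previous center.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from distinct road points
    for _ in range(iters):
        # Accumulate coordinate sums and counts per cluster.
        sums = [[0.0, 0.0, 0] for _ in range(k)]
        for x, y in points:
            nearest = min(
                range(k),
                key=lambda c: (x - centers[c][0]) ** 2 + (y - centers[c][1]) ** 2,
            )
            sums[nearest][0] += x
            sums[nearest][1] += y
            sums[nearest][2] += 1
        # Move each center to the mean of its assigned points.
        centers = [
            (sx / n, sy / n) if n else centers[i]
            for i, (sx, sy, n) in enumerate(sums)
        ]
    return centers


road_points = toy_road_graph()
intention_points = sample_intention_points(road_points, k=64)
print(len(intention_points), "intention points sampled from", len(road_points), "road points")
```

Because every center is a mean of road-graph points, the sampled intention points stay on or close to the drivable lanes, in contrast to static endpoints that ignore the scene. How the paper builds the road graph from lane association is not shown here.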

Abstract

In autonomous driving, accurately predicting the movements of other traffic participants is crucial, as it significantly influences a vehicle's planning processes. Modern trajectory prediction models strive to interpret complex patterns and dependencies from agent and map data. The Motion Transformer (MTR) architecture and subsequent work define the most accurate methods in common benchmarks such as the Waymo Open Motion Benchmark. The MTR model employs pre-generated static intention points as initial goal points for trajectory prediction. However, the static nature of these points frequently leads to misalignment with map data in specific traffic scenarios, resulting in infeasible or unrealistic goal points. Our research addresses this limitation by integrating scene-specific dynamic intention points into the MTR model. This adaptation of the MTR model was trained and evaluated on the Waymo Open Motion Dataset. Our findings demonstrate that incorporating dynamic intention points has a significant positive impact on trajectory prediction accuracy, especially for predictions over long time horizons. Furthermore, we analyze the impact on ground-truth trajectories that do not comply with the map data or that constitute illegal maneuvers.

Paper Structure

This paper contains 16 sections, 8 figures, and 5 tables.

Figures (8)

  • Figure 1: Comparison of static (green) and dynamic (blue) intention points. The agent (red box) is shown with its past 1 s and future 8 s trajectory (red line).
  • Figure 2: Static intention point distributions for vehicles, pedestrians, and cyclists, showing the agent's current position (green dot), the intention points (orange stars), and, for clearer visualization, only 10% of the historical ground-truth trajectories (gray dotted lines). Taken from shi2022motion.
  • Figure 3: Edge cases in scenarios without validity checks: Heading Alignment (a), Proximity Limit (b), and 'Backwards Look' (c). The target agent is highlighted in red, with its actual past and future trajectory shown by a red line and the incorrectly assigned lane indicated by blue marked lane nodes.
  • Figure 4: A visual depiction of a generated road graph, marked in blue. The agent and its corresponding ground-truth trajectory are illustrated in red.
  • Figure 5: Intention points (blue dots) on the road graph (blue line) generated via K-Means Clustering, with the agent and its ground-truth trajectory shown in red.
  • ...and 3 more figures