Trajectron++: Dynamically-Feasible Trajectory Forecasting With Heterogeneous Data

Tim Salzmann, Boris Ivanovic, Punarjay Chakravarty, Marco Pavone

TL;DR

Trajectron++ tackles the challenge of predicting safe, multimodal human trajectories in environments with rich context by leveraging a directed spatiotemporal graph and a CVAE with discrete latent variables to model multiple plausible futures. It integrates agent dynamics (including non-holonomic constraints) and heterogeneous data such as semantic maps, and can condition predictions on an ego-vehicle's planned motions. The approach delivers state-of-the-art results on standard pedestrian benchmarks and the nuScenes autonomous-driving dataset, with ablations demonstrating the value of dynamics constraints, map information, and ego-plan conditioning. This open, modular framework enables tighter integration with planning and control in real-time robotic systems.

Abstract

Reasoning about human motion is an important prerequisite to safe and socially-aware robotic navigation. As a result, multi-agent behavior prediction has become a core component of modern human-robot interactive systems, such as self-driving cars. While there exist many methods for trajectory forecasting, most do not enforce dynamic constraints and do not account for environmental information (e.g., maps). Towards this end, we present Trajectron++, a modular, graph-structured recurrent model that forecasts the trajectories of a general number of diverse agents while incorporating agent dynamics and heterogeneous data (e.g., semantic maps). Trajectron++ is designed to be tightly integrated with robotic planning and control frameworks; for example, it can produce predictions that are optionally conditioned on ego-agent motion plans. We demonstrate its performance on several challenging real-world trajectory forecasting datasets, outperforming a wide array of state-of-the-art deterministic and generative methods.

Paper Structure

This paper contains 21 sections, 14 equations, 6 figures, 5 tables.

Figures (6)

  • Figure 1: Exemplary road scene depicting pedestrians crossing a road in front of a vehicle which may continue straight or turn right. The graph representation of the scene is shown on the ground, where each agent and their interactions are represented as nodes and edges, visualized as white circles and dashed black lines, respectively. Arrows depict potential future agent velocities, with colors representing different high-level future behavior modes.
  • Figure 2: Left: Our approach represents a scene as a directed spatiotemporal graph. Nodes and edges represent agents and their interactions, respectively. Right: The corresponding network architecture for Node 1.
  • Figure 3: [nuScenes] The same scene as forecast by three versions of Trajectron++. (a) The base model tends to under-shoot turns, and makes overly-confident predictions. (b) Our approach better captures position uncertainty with dynamics integration, producing well-calibrated probabilities. (c) The model is able to leverage the additional information that a map provides, yielding accurate predictions.
  • Figure 4: Left: ADE results of all methods per dataset, as well as their average performance. Boxplots are shown for all generative models since they produce distributions of trajectories. 2000 trajectories were sampled per model at each prediction timestep, with each sample’s ADE included in the boxplots. Our approach with dynamics integration is compared here, specifically its $z_\text{mode}$ output configuration. X markers indicate the mean ADE. Mean ADEs from deterministic baselines are visualized as horizontal lines. Right: The same analysis for FDE.
  • Figure 5: Left: When only using trajectory data, Trajectron++ does not know of obstacles and makes predictions into walls (in red). Right: Encoding a local map of the agent's surroundings significantly reduces the frequency of obstacle-violating predictions.
  • ...and 1 more figure
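
The Figure 4 caption refers to per-sample ADE and FDE values. As a rough illustration, the sketch below computes both metrics for a handful of sampled trajectories using their standard definitions: ADE is the mean Euclidean distance between predicted and ground-truth positions over the prediction horizon, and FDE is that distance at the final timestep. This is not the authors' evaluation code. The function names and the toy trajectories are hypothetical, and trajectories are assumed to be equal-length lists of (x, y) points.

```python
import math

def displacement_errors(pred, truth):
    """Per-timestep Euclidean distance between predicted and true (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and of equal length")
    return [math.hypot(px - tx, py - ty)
            for (px, py), (tx, ty) in zip(pred, truth)]

def ade(pred, truth):
    """Average Displacement Error: mean distance over all timesteps."""
    errs = displacement_errors(pred, truth)
    return sum(errs) / len(errs)

def fde(pred, truth):
    """Final Displacement Error: distance at the last timestep."""
    return displacement_errors(pred, truth)[-1]

def per_sample_metrics(samples, truth):
    """(ADE, FDE) for every sampled trajectory, e.g. as input to a boxplot."""
    return [(ade(s, truth), fde(s, truth)) for s in samples]

# Toy data (illustrative only): one ground-truth path and three samples.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
samples = [
    [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],  # exact match
    [(0.0, 3.0), (1.0, 3.0), (2.0, 3.0)],  # constant 3-unit offset
    [(0.0, 0.0), (1.0, 0.0), (2.0, 4.0)],  # diverges only at the end
]

metrics = per_sample_metrics(samples, truth)
mean_ade = sum(a for a, _ in metrics) / len(metrics)
mean_fde = sum(f for _, f in metrics) / len(metrics)
```

The third sample shows why the two metrics are reported separately: a trajectory that tracks the ground truth until the last step has a modest ADE but a large FDE.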