Trajectory Forecasts in Unknown Environments Conditioned on Grid-Based Plans
Nachiket Deo, Mohan M. Trivedi
TL;DR
This work tackles trajectory forecasting in unknown environments by conditioning forecasts on plans sampled from a grid-based policy learned with maximum entropy inverse reinforcement learning (MaxEnt IRL). It introduces P2T, a three-component pipeline: a convolutional reward model that outputs path (transient) and goal (terminal) rewards on a coarse 2-D grid; a reformulated MaxEnt IRL policy that jointly infers goals and paths from these rewards; and an attention-based trajectory generator that maps sampled plans and motion history to continuous future trajectories, which are then clustered into K representative predictions. Jointly inferring goals and paths, and conditioning trajectory generation on the sampled plans, yields multimodal forecasts that are constrained by the scene and improve on both precision and diversity. Results on the Stanford Drone Dataset (SDD) and NuScenes show state-of-the-art or competitive performance across key metrics, with notably lower off-road and off-yaw rates, and inference fast enough for real-time on-board deployment.
Abstract
We address the problem of forecasting pedestrian and vehicle trajectories in unknown environments, conditioned on their past motion and scene structure. Trajectory forecasting is a challenging problem due to the large variation in scene structure and the multimodal distribution of future trajectories. Unlike prior approaches that directly learn one-to-many mappings from observed context to multiple future trajectories, we propose to condition trajectory forecasts on plans sampled from a grid-based policy learned using maximum entropy inverse reinforcement learning (MaxEnt IRL). We reformulate MaxEnt IRL to allow the policy to jointly infer plausible agent goals and paths to those goals on a coarse 2-D grid defined over the scene. We propose an attention-based trajectory generator that generates continuous-valued future trajectories conditioned on state sequences sampled from the MaxEnt policy. Quantitative and qualitative evaluation on the publicly available Stanford Drone and NuScenes datasets shows that our model generates trajectories that are diverse, representing the multimodal predictive distribution, and precise, conforming to the underlying scene structure over long prediction horizons.
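To make the idea of sampling plans from a grid-based MaxEnt policy concrete, the following is a minimal sketch and not the authors' implementation. It assumes a 4-connected grid with an extra "end" action that terminates the plan at the current cell, hand-specified reward maps in place of the learned convolutional reward model, and a fixed number of soft value iteration sweeps. The names `soft_value_iteration`, `sample_plan`, `r_path`, and `r_goal` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Assumed action set: four grid moves plus a terminating "end" action.
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
END = len(MOVES)                            # index of the "end" action
NEG = -1e9                                  # stands in for -infinity


def logsumexp(x, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = np.max(x, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(x - m), axis=axis))


def soft_value_iteration(r_path, r_goal, n_iters=30):
    """Return a stochastic policy of shape (H, W, 5) from two reward maps.

    r_path[s] is the reward for passing through cell s, and r_goal[s] is
    the reward for ending the plan at cell s (both are assumptions here).
    """
    H, W = r_path.shape
    V = np.full((H, W), NEG)
    Q = np.empty((H, W, len(MOVES) + 1))
    for _ in range(n_iters):
        # Pad so that moves leaving the grid see a value of NEG.
        Vp = np.pad(V, 1, constant_values=NEG)
        for a, (dr, dc) in enumerate(MOVES):
            Q[:, :, a] = r_path + Vp[1 + dr:1 + dr + H, 1 + dc:1 + dc + W]
        Q[:, :, END] = r_path + r_goal
        V = logsumexp(Q, axis=2)  # soft maximum over actions
    # MaxEnt policy: pi(a | s) = exp(Q(s, a) - V(s)).
    return np.exp(Q - V[:, :, None])


def sample_plan(policy, start, rng, max_len=100):
    """Sample a sequence of grid cells until "end" is drawn or max_len is hit."""
    r, c = start
    plan = [(r, c)]
    for _ in range(max_len):
        a = rng.choice(len(MOVES) + 1, p=policy[r, c])
        if a == END:
            break
        dr, dc = MOVES[a]
        r, c = r + dr, c + dc
        plan.append((r, c))
    return plan
```

Calling `sample_plan` repeatedly with the same policy and start cell gives different state sequences, which is what lets a downstream trajectory generator produce multimodal forecasts. Because the "end" action is scored by the goal reward, the goal cell and the path to it come from the same policy rather than from two separate models.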
