HiMAP: History-aware Map-occupancy Prediction with Fallback

Yiming Xu, Yi Yang, Hao Cheng, Monika Sester

TL;DR

HiMAP is a tracking-free trajectory prediction framework that remains reliable under MOT failures. Operating without IDs, it achieves performance comparable to tracking-based methods and substantially outperforms strong baselines in the no-tracking setting.

Abstract

Accurate motion forecasting is critical for autonomous driving, yet most predictors rely on multi-object tracking (MOT) with identity association, assuming that objects are correctly and continuously tracked. When tracking fails due to, e.g., occlusion, identity switches, or missed detections, prediction quality degrades and safety risks increase. We present HiMAP, a tracking-free trajectory prediction framework that remains reliable under MOT failures. HiMAP converts past detections into spatiotemporally invariant historical occupancy maps and introduces a historical query module that conditions on the current agent state to iteratively retrieve agent-specific history from unlabeled occupancy representations. The retrieved history is summarized by a temporal map embedding and, together with the final query and map context, drives a DETR-style decoder to produce multi-modal future trajectories. This design removes the reliance on identity association, supports streaming inference via reusable encodings, and serves as a robust fallback when tracking is unavailable. On Argoverse 2, HiMAP achieves performance comparable to tracking-based methods while operating without IDs, and it substantially outperforms strong baselines in the no-tracking setting, yielding relative gains of 11% in FDE, 12% in ADE, and a 4% reduction in MR over a fine-tuned QCNet. Beyond aggregate metrics, HiMAP delivers stable forecasts for all agents simultaneously without waiting for tracking to recover, highlighting its practical value for safety-critical autonomy. The code is available at: https://github.com/XuYiMing83/HiMAP.
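For readers unfamiliar with the metrics quoted above, the following is a minimal sketch of how ADE, FDE, and MR are commonly computed for multi-modal forecasts. It is not taken from the HiMAP code base: the function name `min_ade_fde_miss`, the array shapes, and the 2.0 m miss threshold are assumptions for illustration, and the official Argoverse 2 evaluation may define the best mode differently.

```python
import numpy as np


def min_ade_fde_miss(pred, gt, miss_threshold=2.0):
    """Best-of-K displacement metrics for one agent (illustrative only).

    pred: (K, T, 2) array of K candidate future trajectories.
    gt:   (T, 2) array with the ground-truth future trajectory.
    miss_threshold: endpoint distance in metres above which the
        prediction counts as a miss (assumed value, not from the paper).
    """
    # Per-step Euclidean error for each of the K modes: shape (K, T).
    err = np.linalg.norm(pred - gt[None, :, :], axis=-1)
    ade = err.mean(axis=1)  # average displacement error per mode
    fde = err[:, -1]        # final displacement error per mode
    min_ade = float(ade.min())
    min_fde = float(fde.min())
    # The agent is "missed" if even the best endpoint is too far away.
    missed = bool(min_fde > miss_threshold)
    return min_ade, min_fde, missed


def relative_gain(baseline, ours):
    """Relative improvement of a lower-is-better metric over a baseline."""
    return (baseline - ours) / baseline
```

Averaging `missed` over all evaluated agents gives the miss rate (MR). `relative_gain` shows how a statement such as "a relative gain of 11% in FDE" would be derived from two lower-is-better values.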

Paper Structure

This paper contains 26 sections, 8 equations, 4 figures, and 5 tables.

Figures (4)

  • Figure 1: Comparison between tracking-based prediction and our tracking-free fallback. Tracking-based methods rely on stable identity association, but fail when tracking breaks. Our tracking-free module provides a complementary safety mechanism by reconstructing history from historical occupancy maps, ensuring reliable prediction even under tracking failures. Our method matches the current agent state to historical occupancy maps, implicitly recovering its past states without explicit IDs for robust trajectory prediction.
  • Figure 2: Overview of our HiMAP pipeline. The framework consists of four main components: Agent and Map Encoding, which embeds agent states and HD map elements in spatiotemporally invariant local frames; Historical Occupancy Map Encoder, which aggregates per-frame agent--lane interactions into occupancy representations without relying on tracking IDs; Historical Query Module, which initializes a history-aware query from the current agent state and iteratively attends to past occupancy maps to reconstruct agent-specific trajectories; and Future Trajectory Decoder, a DETR-style query decoder that generates multi-modal predictions from the reconstructed history, final query, and map context. This design provides a robust fallback when tracking fails, enabling reliable forecasting directly from historical detections.
  • Figure 3: Comparison between HiMAP and QCNet on the Argoverse 2 validation set under different numbers of available tracking steps. HiMAP maintains fixed performance without requiring tracking, shown as horizontal dashed lines. QCNet gradually improves as more tracked history is available, surpassing HiMAP after 13--14 steps. The green curve indicates the average distance traveled per timestep.
  • Figure 4: Qualitative comparison of trajectory prediction results on the Argoverse 2 validation set. The blue box denotes the target agent. Blue lines show the agent’s past trajectory (not used during prediction), green lines represent the ground-truth future trajectory, and orange lines indicate predicted trajectories, with darker colors corresponding to higher predicted probabilities (see the colorbar on the right).
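
The Historical Query Module described in the Figure 2 caption can be pictured with a small sketch: a query initialized from the current agent state attends to each past occupancy map in turn and is updated by what it retrieves. This is a rough illustration under stated assumptions, not the authors' implementation. The single-head attention, the residual update, the backward-in-time ordering, and all names (`retrieve_history`, `w_q`, `w_k`, `w_v`) are hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def retrieve_history(agent_state, occupancy_maps, w_q, w_k, w_v):
    """Iteratively query unlabeled occupancy tokens (illustrative only).

    agent_state:    (D,) embedding of the agent's current state.
    occupancy_maps: (T, N, D) tokens for T past frames, oldest first,
                    with N occupancy tokens per frame and no track IDs.
    w_q, w_k, w_v:  (D, D) projection matrices (hypothetical parameters).
    Returns the retrieved per-frame history (T, D) and the final query (D,).
    """
    query = agent_state
    retrieved = []
    # Assumption: walk from the most recent frame back to the oldest one.
    for frame in occupancy_maps[::-1]:
        q = query @ w_q                      # (D,)
        k = frame @ w_k                      # (N, D)
        v = frame @ w_v                      # (N, D)
        # Scaled dot-product attention of one query over N tokens.
        weights = softmax(k @ q / np.sqrt(q.shape[0]))  # (N,)
        context = weights @ v                # (D,)
        retrieved.append(context)
        # Residual update so the next step is conditioned on this retrieval.
        query = query + context
    history = np.stack(retrieved[::-1])      # back to oldest-first order
    return history, query
```

In this sketch the returned `history` plays the role of the reconstructed agent-specific history and `query` the role of the final query that, per the caption, is passed to the decoder together with the map context.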