TaPD: Temporal-adaptive Progressive Distillation for Observation-Adaptive Trajectory Forecasting in Autonomous Driving

Mingyu Fan; Yi Liu; Hao Zhou; Deheng Qian; Mohammad Haziq Khan; Matthias Raetsch

TL;DR

This work proposes TaPD (Temporal-adaptive Progressive Distillation), a unified plug-and-play framework for observation-adaptive trajectory forecasting under variable history lengths, and employs a decoupled pretrain-reconstruct-finetune protocol to preserve real-motion priors while adapting to backfilled inputs.

Abstract

Trajectory prediction is essential for autonomous driving, enabling vehicles to anticipate the motion of surrounding agents to support safe planning. However, most existing predictors assume fixed-length histories and suffer substantial performance degradation when observations are variable or extremely short in real-world settings (e.g., due to occlusion or a limited sensing range). We propose TaPD (Temporal-adaptive Progressive Distillation), a unified plug-and-play framework for observation-adaptive trajectory forecasting under variable history lengths. TaPD comprises two cooperative modules: an Observation-Adaptive Forecaster (OAF) for future prediction and a Temporal Backfilling Module (TBM) for explicit reconstruction of the past. OAF is built on progressive knowledge distillation (PKD), which transfers motion pattern knowledge from long-horizon "teachers" to short-horizon "students" via hierarchical feature regression, enabling short observations to recover richer motion context. We further introduce a cosine-annealed distillation weighting scheme to balance forecasting supervision and feature alignment, improving optimization stability and cross-length consistency. For extremely short histories where implicit alignment is insufficient, TBM backfills missing historical segments conditioned on scene evolution, producing context-rich trajectories that strengthen PKD and thereby improve OAF. We employ a decoupled pretrain-reconstruct-finetune protocol to preserve real-motion priors while adapting to backfilled inputs. Extensive experiments on Argoverse 1 and Argoverse 2 show that TaPD consistently outperforms strong baselines across all observation lengths, delivers especially large gains under very short inputs, and improves other predictors (e.g., HiVT) in a plug-and-play manner. Code will be available at https://github.com/zhouhao94/TaPD.
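The abstract names two mechanisms without giving their form: feature regression from longer-history "teachers" to shorter-history "students", and a cosine-annealed weight on the distillation term. The following is a minimal sketch of how they could fit together, not the authors' implementation. Everything not stated in the abstract is an assumption made for illustration: all function names are hypothetical, mean squared error stands in for the regression loss, each observation length is distilled from the next longer one, and the weight decays from `w_max` to `w_min` over training.

```python
import math


def cosine_annealed_weight(step, total_steps, w_max=1.0, w_min=0.0):
    """Cosine schedule from w_max at step 0 to w_min at total_steps.

    Assumption: the decay direction and the endpoints are illustrative;
    the abstract only says the weighting is cosine-annealed.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return w_min + 0.5 * (w_max - w_min) * (1.0 + math.cos(math.pi * progress))


def feature_regression_loss(student, teacher):
    """Mean squared error between two equal-length feature vectors.

    Assumption: MSE is a stand-in for the paper's feature regression loss.
    """
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def progressive_distillation_loss(features_by_length):
    """Sum regression losses along a chain of observation lengths.

    features_by_length maps an observation length to a feature vector.
    Assumption: each length is regressed toward the next longer one.
    """
    lengths = sorted(features_by_length)
    total = 0.0
    for short, longer in zip(lengths, lengths[1:]):
        total += feature_regression_loss(
            features_by_length[short], features_by_length[longer]
        )
    return total


def total_loss(forecast_loss, distill_loss, step, total_steps):
    """Forecasting loss plus the annealed distillation term."""
    return forecast_loss + cosine_annealed_weight(step, total_steps) * distill_loss


# Toy features for observation lengths of 10, 20, and 30 time steps.
features = {10: [0.0, 0.0], 20: [1.0, 1.0], 30: [1.0, 3.0]}
distill = progressive_distillation_loss(features)
loss_at_start = total_loss(2.0, distill, step=0, total_steps=100)
loss_at_end = total_loss(2.0, distill, step=100, total_steps=100)
```

Under these assumptions, feature alignment dominates early in training and the forecasting loss takes over as the weight decays. A real implementation would operate on batched tensors and would typically detach the teacher features so that gradients flow only into the student branch.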

Paper Structure (35 sections, 21 equations, 4 figures, 8 tables)

Introduction
Related Works
Representations and Architectures for Motion Forecasting
Learning Paradigms: Pretraining, Diffusion, and Robustness
Forecasting with Partial Observability and Variable Observation Lengths
Summary
TaPD: a plug-and-play framework
Problem Formulation
Observation-Adaptive Forecaster (OAF)
Encoder-decoder forecasting with cross-length parameter sharing.
Progressive knowledge distillation (PKD).
Temporal Backfilling Module (TBM)
Encoder with cross-length parameter sharing.
Decoder and temporal backfilling.
Role of TBM in the overall framework.
...and 20 more sections

Figures (4)

Figure 1: The OAF pipeline with parameter sharing and progressive knowledge distillation.
Figure 2: The Temporal Backfilling Module (TBM) for reconstructing unobserved past trajectories.
Figure 3: Overview of the TaPD training strategy.
Figure 4: Qualitative comparison on the Argoverse 2 single-agent validation set under short observations (10 time steps). Each row corresponds to one scene, and the three columns show predictions from (a) DeMo_IT, (b) DeMo_OAF, and (c) our DeMo_TaPD, respectively, given the same input. The observed history is shown as a solid black segment. The ground-truth future trajectory is shown in solid red with an arrow. Green dashed arrows denote multiple predicted trajectories. For methods with backfilling, the reconstructed (unobserved) history is shown as a blue dashed segment, and the corresponding ground-truth unobserved history segment is shown in solid orange.
