SSL-Interactions: Pretext Tasks for Interactive Trajectory Prediction
Prarthana Bhattacharyya, Chengjie Huang, Krzysztof Czarnecki
TL;DR
This work tackles interactive trajectory forecasting for multi-agent scenes by introducing SSL-Interactions, a self-supervised framework that decomposes joint dynamics into a scalable marginal predictor plus interaction-focused pretext tasks. Four interaction-aware tasks (range gap, closest distance, direction of movement, and type of interaction) are trained alongside the main forecast, using pseudo-labeled interacting pairs curated from the data. The approach yields consistent improvements over a state-of-the-art baseline, particularly in interactive scenarios, with up to 8% gains on proposed metrics like i-minFDE_6 and CAM_6, while maintaining competitive performance on non-interactive data. The study also contributes a practical data-curation method and new evaluation metrics tailored to interaction-rich scenes, advancing the practical deployment of motion forecasting in safety-critical autonomous driving applications.
Abstract
This paper addresses motion forecasting in multi-agent environments, which is pivotal for ensuring the safety of autonomous vehicles. Traditional as well as recent data-driven marginal trajectory prediction methods struggle to properly learn non-linear agent-to-agent interactions. We present SSL-Interactions, which proposes pretext tasks to enhance interaction modeling for trajectory prediction. We introduce four interaction-aware pretext tasks to capture various aspects of agent interactions: range gap prediction, closest distance prediction, direction of movement prediction, and type of interaction prediction. We further propose an approach to curate interaction-heavy scenarios from datasets. This curated data has two advantages: it provides a stronger learning signal to the interaction model, and it facilitates the generation of pseudo-labels for interaction-centric pretext tasks. We also propose three new metrics specifically designed to evaluate predictions in interactive scenes. Our empirical evaluations indicate that SSL-Interactions outperforms state-of-the-art motion forecasting methods on interaction-heavy scenarios, both quantitatively, with up to 8% improvement, and qualitatively.
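To make the pseudo-labeling idea concrete, the following is a minimal sketch of how a closest-distance label and a distance-based curation filter could be computed from two time-aligned trajectories, along with the standard minFDE over K predicted modes. It is an illustration only: the function names, the 5-metre threshold, and the use of a simple distance cutoff to decide whether a pair interacts are assumptions, not the curation rule or metric definitions used by SSL-Interactions.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def closest_distance(traj_a: Sequence[Point], traj_b: Sequence[Point]) -> float:
    """Minimum Euclidean distance between two agents over time-aligned steps.

    Hypothetical pseudo-label for a closest-distance pretext task.
    """
    if not traj_a or len(traj_a) != len(traj_b):
        raise ValueError("trajectories must be non-empty and of equal length")
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


def is_interacting_pair(
    traj_a: Sequence[Point],
    traj_b: Sequence[Point],
    threshold_m: float = 5.0,  # assumed cutoff, chosen only for illustration
) -> bool:
    """Flag a pair as interacting if the agents ever come within threshold_m."""
    return closest_distance(traj_a, traj_b) <= threshold_m


def min_fde(predictions: List[Sequence[Point]], ground_truth: Sequence[Point]) -> float:
    """Standard minFDE_K: smallest final-position error over K predicted modes."""
    if not predictions:
        raise ValueError("at least one predicted trajectory is required")
    return min(math.dist(pred[-1], ground_truth[-1]) for pred in predictions)


# Two agents whose paths converge over three time steps.
agent_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
agent_b = [(0.0, 4.0), (1.0, 3.0), (2.0, 3.0)]
label = closest_distance(agent_a, agent_b)
interacting = is_interacting_pair(agent_a, agent_b)

# Two predicted modes for agent_a, scored against its true future.
modes = [[(0.0, 0.0), (3.0, 0.0)], [(0.0, 0.0), (2.0, 1.0)]]
fde = min_fde(modes, [(0.0, 0.0), (2.0, 0.0)])
```

Under these assumptions, an interaction-specific variant such as i-minFDE_6 would amount to averaging `min_fde` with six modes over only the agents that the curation step flags as interacting.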
