Perception Without Vision for Trajectory Prediction: Ego Vehicle Dynamics as Scene Representation for Efficient Active Learning in Autonomous Driving
Ross Greer, Mohan Trivedi
TL;DR
The paper addresses data efficiency for autonomous driving trajectory prediction by leveraging ego-vehicle trajectory and dynamic state information as a low-cost proxy for scene awareness. It introduces a novelty-sensitive active learning framework that clusters trajectory-states and uses budget-aware sampling parameters ($\alpha$ and $\beta$) to selectively annotate data without relying on model uncertainty. By defining a trajectory-state similarity metric and employing hierarchical clustering, the approach identifies novel and familiar data clusters to guide data acquisition, achieving consistent gains over random sampling on the nuScenes dataset and evidencing a data-typicality phase transition. The work demonstrates that beginning with typical data and gradually increasing novelty yields efficient learning, with potential extensions to object detection and path planning, thereby enabling safer, more data-efficient autonomous driving systems.
Abstract
This study investigates the use of trajectory and dynamic state information for efficient data curation in autonomous driving machine learning tasks. We propose methods for clustering trajectory-states and sampling strategies in an active learning framework, aiming to reduce annotation and data costs while maintaining model performance. Our approach leverages trajectory information to guide data selection, promoting diversity in the training data. We demonstrate the effectiveness of our methods on the trajectory prediction task using the nuScenes dataset, showing consistent performance gains over random sampling across different data pool sizes, and even reaching sub-baseline displacement errors at just 50% of the data cost. Our results suggest that sampling typical data initially helps overcome the "cold start problem," while introducing novelty becomes more beneficial as the training pool size increases. By integrating trajectory-state-informed active learning, we demonstrate that more efficient and robust autonomous driving systems are possible and practical using low-cost data curation strategies.
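
The cluster-then-sample idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several pieces are assumptions made only for illustration: each trajectory-state is reduced to a hypothetical (speed, yaw rate) pair, plain Euclidean distance stands in for the paper's trajectory-state similarity metric, a naive single-linkage merge stands in for its hierarchical clustering, and the hypothetical `novelty_fraction` and `rare_size` parameters replace the paper's budget-aware parameters, whose definitions are not given in this summary.

```python
import math
import random


def state_distance(a, b):
    """Euclidean distance between two feature vectors (assumed stand-in metric)."""
    return math.dist(a, b)


def agglomerative_clusters(points, threshold):
    """Naive single-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the smallest
    inter-cluster distance exceeds `threshold`. Returns lists of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > 1:
        best = None  # (distance, index_a, index_b) of the closest pair so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(
                    state_distance(points[p], points[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        if best[0] > threshold:
            break
        _, i, j = best
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


def select_batch(clusters, budget, novelty_fraction, rare_size, rng):
    """Choose up to `budget` sample indices to annotate.

    Clusters with at most `rare_size` members are treated as novel, all others
    as typical. The floor of `budget * novelty_fraction` is spent on novel
    samples and the remainder on typical ones (both hypothetical parameters).
    """
    novel_pool = [i for c in clusters if len(c) <= rare_size for i in c]
    typical_pool = [i for c in clusters if len(c) > rare_size for i in c]
    n_novel = min(int(budget * novelty_fraction), len(novel_pool))
    n_typical = min(budget - n_novel, len(typical_pool))
    chosen = rng.sample(novel_pool, n_novel) + rng.sample(typical_pool, n_typical)
    return sorted(chosen)


# Toy data: six similar cruising states plus two unusual maneuvers.
states = [
    (10.0, 0.0), (10.2, 0.01), (9.9, -0.01),
    (10.1, 0.0), (10.0, 0.02), (9.8, 0.0),
    (2.0, 0.6),    # slow, sharp turn
    (25.0, -0.5),  # fast, sharp turn
]
clusters = agglomerative_clusters(states, threshold=1.0)
rng = random.Random(0)

# Early round: spend the whole budget on typical data.
early = select_batch(clusters, budget=4, novelty_fraction=0.0, rare_size=1, rng=rng)
# Later round: spend half the budget on novel data.
later = select_batch(clusters, budget=4, novelty_fraction=0.5, rare_size=1, rng=rng)
```

Raising `novelty_fraction` across rounds mirrors the abstract's observation that typical data helps early and novelty helps more as the training pool grows. The quadratic pairwise search is only suitable for toy inputs; a dataset-scale version would need an optimized clustering library.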
