LoRD: Adapting Differentiable Driving Policies to Distribution Shifts
Christopher Diehl, Peter Karkus, Sushant Veer, Marco Pavone, Torsten Bertram
TL;DR
This paper tackles distribution shifts in self-driving vehicles by adapting differentiable, multi-component driving stacks (prediction, planning, control) using a Low-Rank Residual Decoder (LoRD) and multi-task fine-tuning. LoRD adds a rank-constrained residual to cost or action outputs, enabling parameter-efficient adaptation, and is framed within an energy-based perspective that links residuals to composed energies. Experiments on nuPlan and exiD show improved open-loop out-of-distribution (OOD) performance and reduced forgetting in closed-loop evaluation, with LoRD delivering clear advantages over baselines for structured policies. The work also reveals a notable gap between open-loop and closed-loop evaluation and emphasizes evaluating jointly on in-distribution (ID) and OOD data so that forgetting is not overlooked. Future directions include tighter integration of LoRD with the base network and extending adaptation to more distribution shifts (e.g., weather or geography) while addressing ethical considerations.
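To make the idea of a rank-constrained residual concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a frozen linear base decoder and a residual factored into two small matrices. The class name, the `down`/`up` factor names, the zero initialization of the up-projection, and all dimensions are hypothetical choices made for illustration.

```python
import numpy as np


class LowRankResidualDecoder:
    """Illustrative sketch: frozen linear base decoder plus a rank-r residual."""

    def __init__(self, w_base, b_base, rank, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w_base.shape
        self.w_base = w_base  # frozen base weights, shape (d_out, d_in)
        self.b_base = b_base  # frozen base bias, shape (d_out,)
        # Trainable low-rank factors. Assumption: the up-projection starts at
        # zero, so the adapted decoder initially reproduces the base decoder.
        self.down = rng.normal(scale=0.01, size=(rank, d_in))
        self.up = np.zeros((d_out, rank))

    def base(self, h):
        # Output of the unmodified base decoder for features h, shape (batch, d_in).
        return h @ self.w_base.T + self.b_base

    def residual(self, h):
        # Residual of rank at most r: project down to r dimensions, then back up.
        return (h @ self.down.T) @ self.up.T

    def __call__(self, h):
        # Adapted output (e.g. cost terms or actions) = base output + residual.
        return self.base(h) + self.residual(h)

    def num_adapter_params(self):
        # Only the two factors would be trained during adaptation.
        return self.down.size + self.up.size


rng = np.random.default_rng(1)
d_in, d_out, rank = 64, 16, 4  # hypothetical sizes
decoder = LowRankResidualDecoder(rng.normal(size=(d_out, d_in)), np.zeros(d_out), rank)
features = rng.normal(size=(8, d_in))
adapted = decoder(features)  # equals the base output until `up` is trained
```

With these hypothetical sizes, the residual has 4 · (64 + 16) = 320 trainable parameters, compared with 64 · 16 = 1024 for a full-rank residual matrix of the same shape, which is the sense in which such an adapter is parameter-efficient.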
Abstract
Distribution shifts between operational domains can severely affect the performance of learned models in self-driving vehicles (SDVs). While this is a well-established problem, prior work has mostly explored naive solutions such as fine-tuning, focusing on the motion prediction task. In this work, we explore novel adaptation strategies for differentiable autonomy stacks consisting of prediction, planning, and control, evaluate them in closed loop, and investigate the often-overlooked issue of catastrophic forgetting. Specifically, we introduce two simple yet effective techniques: a low-rank residual decoder (LoRD) and multi-task fine-tuning. Through experiments across three models conducted on two real-world autonomous driving datasets (nuPlan, exiD), we demonstrate the effectiveness of our methods and highlight a significant performance gap between open-loop and closed-loop evaluation in prior approaches. Compared to standard fine-tuning, our approach reduces forgetting by up to 23.33% and improves the closed-loop out-of-distribution (OOD) driving score by 9.93%.
