LoRD: Adapting Differentiable Driving Policies to Distribution Shifts

Christopher Diehl, Peter Karkus, Sushant Veer, Marco Pavone, Torsten Bertram

TL;DR

This paper tackles distribution shifts in self-driving vehicles by adapting differentiable, multi-component driving stacks (prediction, planning, control) using a Low-Rank Residual Decoder (LoRD) and multi-task fine-tuning. LoRD adds a rank-constrained residual to cost or action outputs, enabling parameter-efficient adaptation, and is framed within an energy-based perspective that links residuals to composed energies. Experiments on nuPlan and exiD show improved open-loop OOD performance and reduced forgetting in closed-loop, with LoRD delivering clear advantages for structured policies over baselines. The work also reveals a notable gap between open-loop and closed-loop evaluation and emphasizes cross-domain ID+OOD evaluation to avoid forgetting bias. Future directions include tighter integration of LoRD with the base network and extending adaptation to more distribution shifts (e.g., weather or geography) while addressing ethical considerations.

Abstract

Distribution shifts between operational domains can severely affect the performance of learned models in self-driving vehicles (SDVs). While this is a well-established problem, prior work has mostly explored naive solutions such as fine-tuning, focusing on the motion prediction task. In this work, we explore novel adaptation strategies for differentiable autonomy stacks consisting of prediction, planning, and control, perform evaluation in closed-loop, and investigate the often-overlooked issue of catastrophic forgetting. Specifically, we introduce two simple yet effective techniques: a low-rank residual decoder (LoRD) and multi-task fine-tuning. Through experiments across three models conducted on two real-world autonomous driving datasets (nuPlan, exiD), we demonstrate the effectiveness of our methods and highlight a significant performance gap between open-loop and closed-loop evaluation in prior approaches. Our approach improves forgetting by up to 23.33% and the closed-loop OOD driving score by 9.93% in comparison to standard fine-tuning.
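The low-rank residual idea can be illustrated with a minimal sketch. It is not the authors' implementation: the layer sizes, the linear stand-in for the base decoder, the names `W_base`, `A`, `B`, and `decode`, and the zero initialization of `B` (a convention borrowed from LoRA-style adapters) are all assumptions made for illustration. The only property taken from the text is that a rank-constrained residual is added to the base output (an action or a cost parameter vector) while the base stays fixed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: feature dimension, output dimension, residual rank.
feat_dim, out_dim, rank = 64, 16, 4

# Frozen base decoder, a linear stand-in for the pretrained action or cost head.
W_base = rng.normal(size=(out_dim, feat_dim))

# Low-rank residual factors; only these would be trained during adaptation.
A = rng.normal(scale=0.01, size=(rank, feat_dim))
B = np.zeros((out_dim, rank))  # assumed zero init: the residual starts at zero


def decode(h, A, B):
    """Return the base output plus a residual whose weight B @ A has rank <= `rank`."""
    base = W_base @ h
    residual = B @ (A @ h)
    return base + residual


h = rng.normal(size=feat_dim)
out_before = decode(h, A, B)

# Stand-in for the result of fine-tuning: B moves away from zero.
B_adapted = rng.normal(scale=0.1, size=(out_dim, rank))
out_after = decode(h, A, B_adapted)

full_params = W_base.size
lord_params = A.size + B.size
print("trainable residual parameters:", lord_params, "of", full_params, "in the base head")
```

With the assumed zero initialization, the adapted model reproduces the base model exactly before any fine-tuning step, and the number of trainable parameters grows with the rank rather than with the full weight matrix.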

Paper Structure

This paper contains 11 sections, 3 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: SDVs need to adapt to various distribution shifts, such as traffic regulations, social norms, traffic density, and weather conditions.
  • Figure 2: Overview of this work's contributions. Fine-tuning data: Multi-task fine-tuning with OOD and ID data. Architecture: LoRD predicts action or cost residuals. Evaluation: Open-loop and closed-loop evaluation in both domains (ID, OOD). $\mathbf{a}$: action, $\mathbf{o}$: sequence of observations, $\mathbf{E}_\mathbf{w}$: cost function, $\mathbf{w}$: cost parameters.
  • Figure 3: Dataset statistics of the nuPlan ID (Boston, Pittsburgh) and OOD (Singapore) domain. Spatial distribution of distances to other agents using kernel density estimates (a,b). Distributions of the Euclidean distances to other agents (c), velocities (d), and number of agents (e).
  • Figure 4: Qualitative results (closed-loop control). The SDV (white) using the FT baseline (left) causes a collision with another vehicle (green). The FT + LoRD (closed-loop regularization, right) model brakes and avoids the collision, as evident from the length of the cyan trajectory.
  • Figure 5: Ablation study on the amount of added ID data during multi-task fine-tuning. 0% denotes standard fine-tuning. 100% adds the same amount of ID and OOD data.
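
The data mixing varied in the Figure 5 ablation can be sketched as follows. This is an illustration only: the helper `build_finetuning_set` and its parameters are hypothetical, and it assumes the percentage is measured relative to the size of the OOD set, which matches the caption's statement that 100% adds the same amount of ID and OOD data.

```python
import random


def build_finetuning_set(ood_samples, id_samples, id_fraction, seed=0):
    """Mix all OOD samples with a random subset of ID samples.

    id_fraction = 0.0 keeps only OOD data (standard fine-tuning);
    id_fraction = 1.0 adds as many ID samples as there are OOD samples.
    """
    if not 0.0 <= id_fraction <= 1.0:
        raise ValueError("id_fraction must lie in [0, 1]")
    n_id = round(id_fraction * len(ood_samples))
    if n_id > len(id_samples):
        raise ValueError("not enough ID samples for the requested fraction")
    rng = random.Random(seed)
    mixed = list(ood_samples) + rng.sample(list(id_samples), n_id)
    rng.shuffle(mixed)  # interleave domains so batches contain both
    return mixed


# Placeholder samples tagged by domain; real samples would be driving scenarios.
ood = [("ood", i) for i in range(10)]
ind = [("id", i) for i in range(50)]

standard_ft = build_finetuning_set(ood, ind, id_fraction=0.0)
half_mix = build_finetuning_set(ood, ind, id_fraction=0.5)
full_mix = build_finetuning_set(ood, ind, id_fraction=1.0)
print(len(standard_ft), len(half_mix), len(full_mix))
```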