PlanT 2.0: Exposing Biases and Structural Flaws in Closed-Loop Driving

Simon Gerstenecker, Andreas Geiger, Katrin Renz

TL;DR

The paper tackles robustness and generalization gaps in autonomous driving by systematically analyzing model failures in CARLA, arguing that the current emphasis on benchmark performance masks biases and shortcut learning. PlanT 2.0 is a lightweight, object-centric planning transformer that extends PlanT with richer object inputs, an SD map BEV representation, expanded sensing range, and a decoupled output design; because its object-level inputs are easy to perturb, it supports controlled failure analysis. The authors report state-of-the-art results on the CARLA validation routes and benchmarks (e.g., $NDS=28.6$ on the CARLA validation routes; strong performance on Bench2Drive and Longest6 v2) but identify systematic failures: a lack of scene understanding caused by low obstacle diversity, overfitting to a fixed set of expert trajectories, and exploitable shortcuts stemming from rigid expert behaviors. These findings underscore how strongly the model depends on its training data. The authors advocate data-centric development with richer, more robust, and less biased datasets, and release open-source code to facilitate further analysis of biases and flaws.

Abstract

Most recent work in autonomous driving has prioritized benchmark performance and methodological innovation over in-depth analysis of model failures, biases, and shortcut learning. This has led to incremental improvements without a deep understanding of the current failures. While it is straightforward to look at situations where the model fails, it is hard to understand the underlying reason. This motivates us to conduct a systematic study, where inputs to the model are perturbed and the predictions observed. We introduce PlanT 2.0, a lightweight, object-centric planning transformer designed for autonomous driving research in CARLA. The object-level representation enables controlled analysis, as the input can be easily perturbed (e.g., by changing the location or adding or removing certain objects), in contrast to sensor-based models. To tackle the scenarios newly introduced by the challenging CARLA Leaderboard 2.0, we introduce multiple upgrades to PlanT, achieving state-of-the-art performance on Longest6 v2, Bench2Drive, and the CARLA validation routes. Our analysis exposes insightful failures, such as a lack of scene understanding caused by low obstacle diversity, rigid expert behaviors leading to exploitable shortcuts, and overfitting to a fixed set of expert trajectories. Based on these findings, we argue for a shift toward data-centric development, with a focus on richer, more robust, and less biased datasets. We open-source our code and model at https://github.com/autonomousvision/plant2.
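The perturbation study described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the `SceneObject` fields, the ego-frame convention, the helper names, and `toy_planner` are all hypothetical stand-ins, and the toy planner only exists so the example runs without a trained model. The idea shown is the general one: edit the object list (move, remove, or add objects), re-run the planner, and compare the predicted waypoints against the unperturbed baseline.

```python
import math
from dataclasses import dataclass, replace
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class SceneObject:
    """Hypothetical object-level input; fields are assumptions, not PlanT 2.0's schema."""
    kind: str    # e.g. "vehicle", "cone", "construction"
    x: float     # longitudinal offset from ego in metres (assumed convention)
    y: float     # lateral offset from ego in metres (assumed convention)
    yaw: float   # heading relative to ego in radians


Scene = Tuple[SceneObject, ...]
Waypoints = List[Tuple[float, float]]
Planner = Callable[[Scene], Waypoints]


def shift_object(scene: Scene, index: int, dx: float = 0.0, dy: float = 0.0) -> Scene:
    """Return a copy of the scene with one object translated."""
    obj = scene[index]
    moved = replace(obj, x=obj.x + dx, y=obj.y + dy)
    return scene[:index] + (moved,) + scene[index + 1:]


def remove_kind(scene: Scene, kind: str) -> Scene:
    """Return a copy of the scene without any object of the given kind."""
    return tuple(o for o in scene if o.kind != kind)


def add_object(scene: Scene, obj: SceneObject) -> Scene:
    """Return a copy of the scene with one extra object appended."""
    return scene + (obj,)


def max_waypoint_deviation(a: Waypoints, b: Waypoints) -> float:
    """Largest Euclidean distance between corresponding waypoints."""
    return max(math.hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b))


def probe(planner: Planner, scene: Scene, perturbed: Dict[str, Scene]) -> Dict[str, float]:
    """Compare the plan for each perturbed scene against the baseline plan."""
    baseline = planner(scene)
    return {name: max_waypoint_deviation(baseline, planner(s)) for name, s in perturbed.items()}


def toy_planner(scene: Scene) -> Waypoints:
    """Placeholder planner (NOT PlanT 2.0): swerve 3.5 m if anything blocks the ego lane."""
    blocked = any(0.0 < o.x < 30.0 and abs(o.y) < 1.5 for o in scene)
    lateral = 3.5 if blocked else 0.0
    return [(5.0 * i, lateral) for i in range(1, 5)]


# Example scene: a construction obstacle with one cone in front of it.
scene: Scene = (
    SceneObject("construction", x=20.0, y=0.0, yaw=0.0),
    SceneObject("cone", x=15.0, y=0.5, yaw=0.0),
)
report = probe(
    toy_planner,
    scene,
    {
        "obstacle_moved_left": shift_object(scene, 0, dy=3.5),
        "cones_removed": remove_kind(scene, "cone"),
        "scene_emptied": (),
    },
)
```

With a real model, `toy_planner` would be replaced by a wrapper around the trained planner's forward pass. A large deviation for a perturbation that should not matter, or no deviation for one that should, flags a candidate bias or shortcut worth inspecting.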

Paper Structure

This paper contains 23 sections, 11 figures, 6 tables.

Figures (11)

  • Figure 1: Lack of environmental understanding. Incorrect reactions to different permutations of the construction obstacle. The first image shows the path avoiding an obstacle as if it were in the ego lane, although the obstacle is in the left lane. The second image shows the model adjusting the path for an obstacle next to the road. The third image shows the model failing to react to the obstacle when no construction cones are present.
  • Figure 2: Trajectory generalization. Predicted path at different distances to a construction obstacle. On the left, the model plans the correct path. In the other two images, as the vehicle is moved closer to the obstacle, the model is unable to adjust the steepness of the lane transition, resulting in trajectories that cause collisions.
  • Figure 3: Proximity violations in planned trajectories. Three examples of side-swipe collisions in obstacle avoidance scenarios.
  • Figure 4: Positional shortcuts. Predicted path and waypoints in a construction obstacle scenario with oncoming traffic at manually altered rotations of 0, 15 and 30 degrees. As the rotation increases, the model predicts higher driving speeds.
  • Figure 5: Predicted target speeds for different ego vehicle rotations across four obstacle scenarios with oncoming traffic preventing the overtaking maneuver.
  • ...and 6 more figures