Surprise Potential as a Measure of Interactivity in Driving Scenarios

Wenhao Ding, Sushant Veer, Karen Leung, Yulong Cao, Marco Pavone

TL;DR

The paper addresses the scarcity of interactive driving data by proposing surprise potential (SP), a metric that measures how much multi-agent trajectory predictions shift under counterfactual interventions on the ego vehicle. SP is defined as $S(\xi) = D(\mathcal{F}(\xi), \mathcal{F}\circ\mathcal{G}(\xi))$, where $\mathcal{G}$ generates counterfactuals, $\mathcal{F}$ predicts futures, and $D$ measures the shift between the two predicted distributions. Candidate measures are evaluated on nuScenes against a reward model learned from human preferences. An exhaustive exploration of this design space identifies Hist-prim with the Wasserstein distance, on scene-centric and query-centric predictors, as the strongest performer, with a correlation above $0.82$ with the human-aligned reward. Downstream, curated interactive datasets improve planner safety metrics and can enhance learning through importance-weighted upsampling, suggesting practical value for benchmarking and training in multi-agent autonomous driving.
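
As a rough illustration of the decomposition $S(\xi) = D(\mathcal{F}(\xi), \mathcal{F}\circ\mathcal{G}(\xi))$, the sketch below wires together toy versions of the three components. Everything in it is an assumption made for illustration and is not taken from the paper: the scenario dictionary, the names `generate_counterfactuals`, `toy_predictor`, and `surprise_potential`, the one-dimensional stand-in predictor, and the choice to aggregate over counterfactuals with `max`. Only the overall structure (counterfactual ego histories, a future predictor, and a Wasserstein shift metric) follows the summary above.

```python
from statistics import mean


def wasserstein_1d(a, b):
    """D: Wasserstein-1 distance between two equal-size empirical samples on the real line."""
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal sample counts")
    # For equal-size, uniformly weighted 1-D samples, W1 is the mean gap between sorted values.
    return mean(abs(x - y) for x, y in zip(sorted(a), sorted(b)))


def generate_counterfactuals(scenario, primitives):
    """G (hypothetical): replace the ego history with each alternative history primitive."""
    return [{**scenario, "ego_history": p} for p in primitives]


def toy_predictor(scenario):
    """F (hypothetical stand-in): samples of another agent's future displacement in metres."""
    history = scenario["ego_history"]
    ego_speed = history[-1] - history[-2]
    # In this toy model only an agent close to the ego reacts to the ego's motion.
    reaction = 0.5 * ego_speed if scenario["gap_to_agent"] < 10.0 else 0.0
    return [10.0 - reaction + offset for offset in (-1.0, 0.0, 1.0)]


def surprise_potential(scenario, primitives):
    """S: largest prediction shift across counterfactuals (max is an assumed aggregation)."""
    factual = toy_predictor(scenario)
    shifts = [
        wasserstein_1d(factual, toy_predictor(cf))
        for cf in generate_counterfactuals(scenario, primitives)
    ]
    return max(shifts)


# Two ego-history primitives: standing still and driving faster than in the log.
primitives = [[0.0, 0.0, 0.0], [0.0, 2.0, 4.0]]
near = {"ego_history": [0.0, 1.0, 2.0], "gap_to_agent": 5.0}
far = {"ego_history": [0.0, 1.0, 2.0], "gap_to_agent": 50.0}

print(surprise_potential(near, primitives))  # nearby agent: predictions shift
print(surprise_potential(far, primitives))   # distant agent: predictions do not shift
```

In this toy setup the nearby agent receives a positive score because its predicted future changes when the ego history changes, while the distant agent scores zero. A real instantiation would replace `toy_predictor` with a learned multi-agent trajectory predictor and use a distance suited to multi-modal trajectory distributions.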

Abstract

Validating the safety and performance of an autonomous vehicle (AV) requires benchmarking on real-world driving logs. However, typical driving logs contain mostly uneventful scenarios with minimal interactions between road users. Identifying interactive scenarios in real-world driving logs enables the curation of datasets that amplify critical signals and provide a more accurate assessment of an AV's performance. In this paper, we present a novel metric that identifies interactive scenarios by measuring an AV's surprise potential on others. First, we identify three dimensions of the design space to describe a family of surprise potential measures. Second, we exhaustively evaluate and compare different instantiations of the surprise potential measure within this design space on the nuScenes dataset. To determine how well a surprise potential measure correctly identifies an interactive scenario, we use a reward model learned from human preferences to assess alignment with human intuition. Our proposed surprise potential, arising from this exhaustive comparative study, achieves a correlation of more than 0.82 with the human-aligned reward function, outperforming existing approaches. Lastly, we validate motion planners on curated interactive scenarios to demonstrate downstream applications.

Paper Structure

This paper contains 30 sections, 5 equations, 8 figures, 1 table.

Figures (8)

  • Figure 1: The distribution of the interactive score obtained by surprise potential on the nuScenes dataset. We notice that most scenarios in the dataset are not considered interactive.
  • Figure 2: The illustration of counterfactual generation methods considered in this paper.
  • Figure 3: The Spearman rank correlation of various metrics for evaluating interactivity, where higher values represent better agreement.
  • Figure 4: The area under the Receiver Operating Characteristic curve (AUC-ROC) for various metrics used to assess interactivity, with higher values indicating better performance.
  • Figure 5: Examples of future predictions using Hist-prim, where different colors represent different counterfactual scenarios. Dashed lines indicate the history primitives of the ego vehicle (Ego), while solid lines represent the multi-modal predictions for other agents. With varying history primitives, interactive agents (ITA) exhibit distinct future predictions, whereas non-interactive agents produce similar predictions.
  • ...and 3 more figures