Can Vehicle Motion Planning Generalize to Realistic Long-tail Scenarios?

Marcel Hallgarten, Julian Zapata, Martin Stoll, Katrin Renz, Andreas Zell

TL;DR

The paper targets the challenge of generalizing vehicle motion planning to rare, long-tail driving scenarios. It introduces interPlan, a closed-loop benchmark built on augmented nuPlan data to stress-test interactive and diverse scenarios, and systematically evaluates rule-based, learning-based, and LLM-based planners. The findings show current state-of-the-art methods struggle with interPlan's long-tail cases, with rule-based approaches sometimes outperforming learning-based ones, while a novel hybrid LLM-behavior plus rule-based motion planner achieves a new state-of-the-art. The work highlights the potential of multimodal foundation models for driving but also underscores gaps in traffic understanding and the need for richer scenario generation for robust generalization.

Abstract

Real-world autonomous driving systems must make safe decisions in the face of rare and diverse traffic scenarios. Current state-of-the-art planners are mostly evaluated on real-world datasets like nuScenes (open-loop) or nuPlan (closed-loop). In particular, nuPlan seems to be an expressive evaluation method since it is based on real-world data and closed-loop, yet it mostly covers basic driving scenarios. This makes it difficult to judge a planner's ability to generalize to rarely seen situations. Therefore, we propose a novel closed-loop benchmark, interPlan, containing several edge cases and challenging driving scenarios. We assess existing state-of-the-art planners on our benchmark and show that neither rule-based nor learning-based planners can safely navigate the interPlan scenarios. A recently evolving direction is the use of foundation models like large language models (LLMs) to handle generalization. We evaluate an LLM-only planner and introduce a novel hybrid planner that combines an LLM-based behavior planner with a rule-based motion planner and achieves state-of-the-art performance on our benchmark.

Paper Structure (19 sections, 2 figures, 4 tables)

Figures (2)

  • Figure 1: interPlan scenario types. The ego vehicle and its navigation route are shown in orange and purple, surrounding vehicles and pedestrians are blue and green respectively. Traffic cones and stop lines are depicted in red.
  • Figure 2: Qualitative results. Fig. (a), (b), (c) show successfully avoiding an accident site, waiting for jaywalkers, and executing a lane change surrounded by conservative agents. Failure cases are stopping before the accident site (e), narrowly avoiding the pedestrians (f), and colliding with an assertive vehicle approaching quickly from behind (g). Fig. (d) and (h) show failures of learning-based planners, such as suddenly stopping (d) or causing a collision (h). The planned trajectory is depicted in orange, the navigation route in purple.