
RAD-LAD: Rule and Language Grounded Autonomous Driving in Real-Time

Anurag Ghosh, Srinivasa Narasimhan, Manmohan Chandraker, Francesco Pittaluga

Abstract

We present LAD, a real-time language-action planner with an interruptible architecture that produces a motion plan in a single forward pass (~20 Hz) or generates textual reasoning alongside a motion plan (~10 Hz). LAD is fast enough for real-time closed-loop deployment, achieving ~3x lower latency than prior driving language models while setting a new learning-based state of the art on nuPlan Test14-Hard and InterPlan. We also introduce RAD, a rule-based planner designed to address structural limitations of PDM-Closed. RAD achieves state-of-the-art performance among rule-based planners on nuPlan Test14-Hard and InterPlan. Finally, we show that combining RAD and LAD enables hybrid planning that captures the strengths of both approaches. This hybrid system demonstrates that rules and learning provide complementary capabilities: rules support reliable maneuvering, while language enables adaptive and explainable decision-making.

Paper Structure

This paper contains 58 sections, 15 equations, 11 figures, and 8 tables.

Figures (11)

  • Figure 1: Autonomous driving requires rule-following and semantic understanding. (Top row) A left turn at a pickup/dropoff zone: the ego vehicle (red, with planned trajectory) must navigate around vehicles blocking the lane. (Bottom row) A right turn through a crowded intersection: dense traffic from multiple directions, pedestrian crossings, and ambiguous right-of-way require reasoning beyond simple trajectory optimization. Text overlays show LAD's real-time situational understanding. The key insight is that many scenarios labeled "hard" require only better lane changing, which our rule-based planner RAD can handle; truly semantic difficulty, e.g., negotiating ambiguous traffic, demands language-grounded reasoning, which LAD provides by generating both motion plans and interpretable explanations at ~10 Hz, enabling real-time deployment.
  • Figure 2: LAD Architecture. We encode scene context, adapt it into the language model's manifold, and insert it as pseudo-tokens within the prompt. The decoder produces natural-language reasoning and a motion plan from the hidden state at <|plan|>.
  • Figure 3: Interruptible Anytime Inference. LAD produces a valid plan when interrupted, generating reasoning tokens if budget permits.
  • Figure 4: PDM-Closed's (Dauner et al., 2023) static proposal paths become blocked by obstacles with no recovery. RAD topologically replans at every timestep and augments the route with adjacent-lane centerlines.
  • Figure 5: RAD combines goal-directed optimization with trajectory proposal augmentation, adding feasible alternative trajectories and favoring trajectories that make progress toward the goal.
  • ...and 6 more figures
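The interruptible anytime behavior described in the abstract and Figure 3 — a plan is always available after one forward pass, and reasoning tokens are emitted only while the latency budget permits — can be sketched as a decoding loop. This is a minimal illustration, not LAD's actual API: `decode_plan`, `next_reasoning_token`, and the `<|plan|>` pseudo-token interface are hypothetical stand-ins for a model that reads out a motion plan from the hidden state at the plan token.

```python
import time

PLAN_TOKEN = "<|plan|>"  # pseudo-token whose hidden state yields the plan

def generate_anytime(model, prompt_tokens, budget_s, max_reason_tokens=64):
    """Anytime decoding sketch: always return a plan; spend any remaining
    time budget on reasoning tokens, refreshing the plan as context grows."""
    deadline = time.monotonic() + budget_s
    # Fast path: a single forward pass decodes a plan from <|plan|>
    # (the ~20 Hz regime described in the abstract).
    plan = model.decode_plan(prompt_tokens + [PLAN_TOKEN])
    reasoning = []
    # Slow path: generate reasoning while the budget allows (~10 Hz regime).
    while time.monotonic() < deadline and len(reasoning) < max_reason_tokens:
        tok = model.next_reasoning_token(prompt_tokens, reasoning)
        if tok is None:  # model finished its reasoning
            break
        reasoning.append(tok)
        # Re-decode the plan conditioned on the reasoning produced so far.
        plan = model.decode_plan(prompt_tokens + reasoning + [PLAN_TOKEN])
    return plan, reasoning

class StubModel:
    """Toy stand-in: 'plans' are just token counts, reasoning stops after 3."""
    def decode_plan(self, tokens):
        return ("plan", len(tokens))
    def next_reasoning_token(self, prompt_tokens, reasoning):
        return "step" if len(reasoning) < 3 else None

if __name__ == "__main__":
    # Generous budget: reasoning runs to completion, plan sees all tokens.
    plan, reasoning = generate_anytime(StubModel(), ["ctx"], budget_s=1.0)
    print(plan, len(reasoning))
    # Zero budget: interrupted immediately, but a valid plan still exists.
    plan, reasoning = generate_anytime(StubModel(), ["ctx"], budget_s=0.0)
    print(plan, len(reasoning))
```

The zero-budget call is the point of the design: interruption never leaves the controller without a plan, only without the optional textual explanation.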