RAPiD: Real-time Deterministic Trajectory Planning via Diffusion Behavior Priors for Safe and Efficient Autonomous Driving
Ruturaj Reddy, Hrishav Bakul Barua, Junn Yong Loo, Thanh Thi Nguyen, Ganesh Krishnasamy
TL;DR
RAPiD tackles the latency and safety challenges of diffusion-based trajectory planning by distilling a pretrained diffusion planner into a deterministic policy. It uses Score Regularized Policy Optimization (SRPO) to regularize policy learning with the diffusion prior's score, and trains a safety-focused critic via Implicit Q-Learning (IQL) using the Predictive Driver Model (PDM) scorer. The approach yields an 8× speedup over diffusion baselines and achieves state-of-the-art generalization among learning-based planners on interPlan, while promoting safety through PDM-based supervision. By pairing the expressiveness of diffusion models with the inference cost of a deterministic policy, RAPiD offers a practical path toward real-time deployment in complex traffic scenarios.
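As an illustration of the critic-training step named above, the following is a minimal sketch of the expectile regression loss that Implicit Q-Learning uses to fit a value function. It is not the authors' implementation: the function name `expectile_loss`, the default `tau=0.7`, and the plain-list inputs are assumptions made for this example, and how PDM scores are turned into learning targets is not specified in this summary.

```python
def expectile_loss(q_values, v_values, tau=0.7):
    """Mean asymmetric squared error used by IQL to fit V toward an upper expectile of Q.

    For u = q - v the per-sample loss is |tau - 1(u < 0)| * u**2, so with
    tau > 0.5 underestimating Q (u >= 0) is penalized more than overestimating it.
    All names and defaults here are illustrative assumptions.
    """
    if len(q_values) != len(v_values):
        raise ValueError("q_values and v_values must have the same length")
    if not q_values:
        raise ValueError("inputs must be non-empty")
    total = 0.0
    for q, v in zip(q_values, v_values):
        u = q - v
        weight = tau if u >= 0 else 1.0 - tau  # asymmetric weighting
        total += weight * u * u
    return total / len(q_values)
```

With `tau = 0.5` the loss reduces to half the mean squared error; values closer to 1 push the fitted value toward the best actions seen in the data without querying out-of-distribution actions.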
Abstract
Diffusion-based trajectory planners have demonstrated strong capability for modeling the multimodal nature of human driving behavior, but their reliance on iterative stochastic sampling poses critical challenges for real-time, safety-critical deployment. In this work, we present RAPiD, a deterministic policy extraction framework that distills a pretrained diffusion-based planner into an efficient policy, eliminating diffusion sampling at inference time. Using score-regularized policy optimization, we leverage the score function of the pretrained diffusion planner as a behavior prior to regularize policy learning. To promote safety and passenger comfort, the policy is optimized using a critic trained to imitate a predictive driver controller, providing dense, safety-focused supervision beyond conventional imitation learning. Evaluations demonstrate that RAPiD achieves competitive performance on closed-loop nuPlan scenarios with an 8× speedup over diffusion baselines, while achieving state-of-the-art generalization among learning-based planners on the interPlan benchmark. The project repository is available at https://github.com/ruturajreddy/RAPiD.
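To make the idea of score-regularized policy extraction concrete, here is a one-dimensional toy sketch. It is not the RAPiD or SRPO implementation: the behavior prior is assumed to be a Gaussian whose score is known in closed form (standing in for the learned diffusion score), the critic is an assumed quadratic, and the "policy" is a single scalar action. All function names and parameters are hypothetical. The update ascends the critic's action gradient plus the prior's score scaled by 1/beta, which is the basic form of the score-regularized gradient; the weighting over diffusion times used in practice is omitted.

```python
def behavior_score(action, prior_mean=0.0, prior_std=1.0):
    """Score (gradient of the log-density) of an assumed Gaussian behavior prior.

    In a real system this would come from the pretrained diffusion model.
    """
    return -(action - prior_mean) / (prior_std ** 2)


def critic_grad(action, best_action=2.0):
    """Action gradient of an assumed critic Q(a) = -(a - best_action)**2."""
    return -2.0 * (action - best_action)


def extract_policy(beta=1.0, lr=0.1, steps=200, init_action=0.0):
    """Gradient ascent on Q(a) + (1 / beta) * log prior(a) for a scalar action."""
    action = init_action
    for _ in range(steps):
        # Critic pulls toward high value; the prior's score pulls toward
        # behavior seen in the data. beta trades the two off.
        grad = critic_grad(action) + behavior_score(action) / beta
        action += lr * grad
    return action
```

In this toy setting, `extract_policy(beta=1.0)` settles between the prior mean (0) and the critic's optimum (2), while a larger `beta` weakens the regularizer and moves the result toward the critic's optimum. The extracted policy is a single deterministic output, so no iterative sampling is needed at inference time.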
