On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren, Ye Yuan, Guoren Wang
TL;DR
This work tackles the deployment bottleneck of large end-to-end motion planners in autonomous driving by introducing PlanKD, a knowledge distillation framework that preserves planning-relevant information while prioritizing safety. It combines an information bottleneck–based planning-relevant feature distillation with a safety-aware waypoint-attentive distillation that assigns adaptive weights to waypoints based on context and proximity to moving obstacles, formalized through $J_{IB}=\max_Z\sum_{i=1}^M I(Z,Y^i)-\beta I(Z,H)$ and a safety kernel-driven ranking loss. The two modules enable end-to-end training with a joint loss $\mathcal{L}=\mathcal{L}_w+\mathcal{L}_{w^*}-\mathcal{L}_{IB}+\alpha_z\mathcal{L}_z+\alpha_r\mathcal{L}_{rank}+\alpha_e\mathcal{L}_e$, resulting in compact planners that closely match teacher performance while achieving roughly a 50% reduction in inference time on Town05 benchmarks. Experiments demonstrate that PlanKD substantially improves the safety and efficiency of lightweight planners (e.g., InterFuser and TCP), offering a portable and robust solution for resource-limited autonomous driving deployments.
Abstract
End-to-end motion planning models equipped with deep neural networks have shown great potential for enabling full autonomous driving. However, their oversized neural networks make them impractical to deploy on resource-constrained systems, because they unavoidably require more computation time and resources during inference. To handle this, knowledge distillation offers a promising approach that compresses models by enabling a smaller student model to learn from a larger teacher model. Nevertheless, how to apply knowledge distillation to compress motion planners has not been explored so far. In this paper, we propose PlanKD, the first knowledge distillation framework tailored for compressing end-to-end motion planners. First, driving scenes are inherently complex and often contain planning-irrelevant or even noisy information, and transferring such information does not benefit the student planner. Thus, we design an information bottleneck based strategy that distills only planning-relevant information, rather than transferring all information indiscriminately. Second, different waypoints in an output planned trajectory may hold varying degrees of importance for motion planning, where a slight deviation in certain crucial waypoints might lead to a collision. Therefore, we devise a safety-aware waypoint-attentive distillation module that assigns adaptive weights to different waypoints based on their importance, encouraging the student to mimic the more crucial waypoints accurately and thereby improving overall safety. Experiments demonstrate that PlanKD can boost the performance of smaller planners by a large margin and significantly reduce their inference time.
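
To make the idea of waypoint-attentive distillation concrete, the following is a minimal NumPy sketch of a weighted imitation loss between student and teacher waypoints. It is an illustration under stated assumptions, not the PlanKD implementation: the function names (`waypoint_weights`, `weighted_waypoint_distill_loss`), the softmax over negative obstacle distance, the `temperature` parameter, and the L1 per-waypoint error are all hypothetical stand-ins. In particular, the weights here are computed by a fixed distance rule, whereas the abstract describes adaptive weights assigned by a dedicated module.

```python
import numpy as np


def waypoint_weights(teacher_wp, obstacle_xy, temperature=1.0):
    """Hypothetical importance weights: waypoints nearer to an obstacle weigh more.

    teacher_wp:  (T, 2) array of planned waypoints.
    obstacle_xy: (K, 2) array of moving-obstacle positions.
    Returns a (T,) array of non-negative weights that sum to 1.
    """
    # Pairwise waypoint-to-obstacle offsets via broadcasting: shape (T, K, 2).
    diffs = teacher_wp[:, None, :] - obstacle_xy[None, :, :]
    # Distance from each waypoint to its nearest obstacle: shape (T,).
    nearest = np.linalg.norm(diffs, axis=-1).min(axis=1)
    # Softmax over negative distance (assumed form), shifted for numerical stability.
    logits = -nearest / temperature
    logits = logits - logits.max()
    w = np.exp(logits)
    return w / w.sum()


def weighted_waypoint_distill_loss(student_wp, teacher_wp, weights):
    """Weighted L1 distance between student and teacher waypoints."""
    per_waypoint = np.abs(student_wp - teacher_wp).sum(axis=-1)  # shape (T,)
    return float((weights * per_waypoint).sum())


# Toy trajectory heading along +y, with one obstacle near the last waypoint.
teacher = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0]])
obstacles = np.array([[0.5, 3.0]])
weights = waypoint_weights(teacher, obstacles)

# The same 0.2 m lateral error, placed far from or close to the obstacle.
student_far = teacher.copy()
student_far[0, 0] += 0.2
student_near = teacher.copy()
student_near[3, 0] += 0.2

loss_far = weighted_waypoint_distill_loss(student_far, teacher, weights)
loss_near = weighted_waypoint_distill_loss(student_near, teacher, weights)
```

In this toy setup the same deviation is penalized more heavily when it occurs at the waypoint closest to the obstacle, which is the behavior the safety-aware weighting is meant to encourage.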
