On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving

Kaituo Feng, Changsheng Li, Dongchun Ren, Ye Yuan, Guoren Wang

TL;DR

This work tackles the deployment bottleneck of large end-to-end motion planners in autonomous driving by introducing PlanKD, a knowledge distillation framework that preserves planning-relevant information while prioritizing safety. It combines two modules. The first is a planning-relevant feature distillation based on the information bottleneck, formalized through $J_{IB}=\max_Z\sum_{i=1}^M I(Z,Y^i)-\beta I(Z,H)$. The second is a safety-aware waypoint-attentive distillation that assigns adaptive weights to waypoints based on context and proximity to moving obstacles, trained with a safety kernel-driven ranking loss. The two modules are trained end to end with the joint loss $\mathcal{L}=\mathcal{L}_w+\mathcal{L}_{w^*}-\mathcal{L}_{IB}+\alpha_z\mathcal{L}_z+\alpha_r\mathcal{L}_{rank}+\alpha_e\mathcal{L}_e$, yielding compact planners that closely match teacher performance while achieving roughly a 50% reduction in inference time on Town05 benchmarks. Experiments demonstrate that PlanKD substantially improves the safety and efficiency of lightweight planners (e.g., InterFuser and TCP), offering a portable and robust solution for resource-limited autonomous driving deployments.
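As a rough illustration of how the joint loss above combines its terms, the following minimal sketch adds six scalar loss values with the signs and coefficients given in the formula. The individual terms are not defined in this summary, so the field names, the `LossTerms` container, and the default coefficient values are all hypothetical; only the combination rule comes from the text.

```python
from dataclasses import dataclass


@dataclass
class LossTerms:
    """Scalar loss values for one training step (hypothetical container)."""
    l_w: float       # L_w
    l_w_star: float  # L_{w^*}
    l_ib: float      # L_IB, subtracted in the joint loss
    l_z: float       # L_z
    l_rank: float    # L_rank, the ranking loss
    l_e: float       # L_e


def joint_loss(t: LossTerms,
               alpha_z: float = 1.0,
               alpha_r: float = 1.0,
               alpha_e: float = 1.0) -> float:
    """Combine the terms as L = L_w + L_{w^*} - L_IB + a_z L_z + a_r L_rank + a_e L_e.

    The default coefficients of 1.0 are placeholders, not values from the paper.
    """
    return (t.l_w + t.l_w_star - t.l_ib
            + alpha_z * t.l_z
            + alpha_r * t.l_rank
            + alpha_e * t.l_e)


if __name__ == "__main__":
    terms = LossTerms(l_w=1.0, l_w_star=2.0, l_ib=0.5,
                      l_z=4.0, l_rank=8.0, l_e=16.0)
    print(joint_loss(terms, alpha_z=0.5, alpha_r=0.25, alpha_e=0.125))
```

Note the minus sign on $\mathcal{L}_{IB}$: the formula subtracts this term, which is consistent with $J_{IB}$ being written as a maximization objective.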

Abstract

End-to-end motion planning models equipped with deep neural networks have shown great potential for enabling full autonomous driving. However, their oversized neural networks make them impractical to deploy on resource-constrained systems, because they unavoidably require more computational time and resources during inference. To handle this, knowledge distillation offers a promising approach that compresses models by enabling a smaller student model to learn from a larger teacher model. Nevertheless, how to apply knowledge distillation to compress motion planners has not been explored so far. In this paper, we propose PlanKD, the first knowledge distillation framework tailored for compressing end-to-end motion planners. First, driving scenes are inherently complex and often contain planning-irrelevant or even noisy information, and transferring such information does not benefit the student planner. Thus, we design an information bottleneck based strategy that distills only planning-relevant information, rather than transferring all information indiscriminately. Second, different waypoints in an output planned trajectory may hold varying degrees of importance for motion planning, and a slight deviation at certain crucial waypoints might lead to a collision. Therefore, we devise a safety-aware waypoint-attentive distillation module that assigns adaptive weights to different waypoints based on their importance, encouraging the student to mimic the more crucial waypoints accurately and thereby improving overall safety. Experiments demonstrate that our PlanKD can boost the performance of smaller planners by a large margin and significantly reduce their inference time.
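The idea of weighting waypoints by importance can be sketched as follows. This is a simplified stand-in, not the paper's module: it assumes a fixed Gaussian kernel on the distance from each teacher waypoint to the nearest obstacle, whereas the abstract describes adaptive weights. The function names, the `sigma` parameter, and the L1 distance are all assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def proximity_weights(waypoints: List[Point],
                      obstacles: List[Point],
                      sigma: float = 2.0) -> List[float]:
    """Hypothetical weights: waypoints closer to any obstacle get more weight.

    Uses a Gaussian kernel on the squared distance to the nearest obstacle,
    then normalizes so the weights sum to 1.
    """
    n = len(waypoints)
    if n == 0:
        return []
    if not obstacles:
        return [1.0 / n] * n  # no obstacles: fall back to uniform weights
    scores = []
    for wx, wy in waypoints:
        d2 = min((wx - ox) ** 2 + (wy - oy) ** 2 for ox, oy in obstacles)
        scores.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(scores)
    if total == 0.0:
        return [1.0 / n] * n  # all obstacles far away: kernel underflowed
    return [s / total for s in scores]


def weighted_waypoint_loss(student: List[Point],
                           teacher: List[Point],
                           weights: List[float]) -> float:
    """Weighted L1 distance between student and teacher waypoints."""
    return sum(w * (abs(sx - tx) + abs(sy - ty))
               for w, (sx, sy), (tx, ty) in zip(weights, student, teacher))


if __name__ == "__main__":
    teacher = [(0.0, 1.0), (0.0, 2.0), (0.0, 3.0)]
    obstacles = [(0.0, 3.0)]  # an obstacle sitting at the last waypoint
    w = proximity_weights(teacher, obstacles)
    near_miss = [(0.0, 1.0), (0.0, 2.0), (1.0, 3.0)]  # error next to obstacle
    far_miss = [(1.0, 1.0), (0.0, 2.0), (0.0, 3.0)]   # same error, far away
    print(weighted_waypoint_loss(near_miss, teacher, w))
    print(weighted_waypoint_loss(far_miss, teacher, w))
```

Under this weighting, the same one-metre deviation costs more when it occurs at the waypoint next to the obstacle than at a waypoint far from it, which is the behavior the abstract motivates.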

Paper Structure

This paper contains 28 sections, 9 equations, 4 figures, and 7 tables.

Figures (4)

  • Figure 1: An illustration of the performance degradation of InterFuser [shao2023safety] on the Town05 Long Benchmark [prakash2021multi] as the number of parameters decreases. With PlanKD, the performance of compact motion planners can be enhanced and the inference time can be significantly lowered. The inference time is evaluated on a GeForce RTX 3090 GPU in a server. Best viewed in color.
  • Figure 2: An illustration of our PlanKD framework. PlanKD consists of two modules: a planning-relevant feature distillation module that distills planning-relevant features from intermediate feature maps via the information bottleneck (IB), and a safety-aware waypoint-attentive distillation module that dynamically determines crucial waypoints and distills knowledge from them for overall safety.
  • Figure 3: Visualizations of safety-aware attention weights under different driving scenarios. The green block denotes the ego-vehicle and the yellow blocks represent other road users (e.g., vehicles, bicycles). The redder a waypoint is, the higher its attention weight.
  • Figure 4: Visualizations of the intermediate feature maps of InterFuser. Redder regions represent higher activation values. The first row is the input image from the front camera. The second row is the corresponding attention map generated by AT [zagoruyko2016paying]. The third row is the corresponding Grad-CAM [selvaraju2017grad] visualization guided by the gradient of the planning states in the information bottleneck.