ExpertAD: Enhancing Autonomous Driving Systems with Mixture of Experts
Haowen Jiang, Xinyu Huang, You Lu, Dingji Wang, Yuheng Cao, Chaofeng Sha, Bihuan Chen, Keyu Chen, Xin Peng
TL;DR
ExpertAD targets inference latency and task interference in end-to-end autonomous driving systems by introducing a Perception Adapter (PA) and a Mixture of Sparse Experts (MoSE) that span perception and prediction. The PA dynamically emphasizes task-critical bird's-eye-view (BEV) features, while MoSE uses eight specialized sparse-attention experts across Environmental, Ego State, and Navigation tasks, with a Router that activates only the most relevant experts. Trained jointly with a switch-friendly loss, ExpertAD improves planning effectiveness and inference efficiency across multiple vision-only ADS baselines, reducing average collision rates by up to 20% and inference latency by 25% compared to prior methods. It also generalizes to unseen urban environments and is illustrated with a qualitative case study. The results support ExpertAD as a practical way to improve safety and efficiency in autonomous driving without sacrificing planning quality.
Abstract
Recent advancements in end-to-end autonomous driving systems (ADSs) underscore their potential for both perception and planning. However, challenges remain. Complex driving scenarios contain rich semantic information, yet ambiguous or noisy semantics can compromise decision reliability, and interference between multiple driving tasks may hinder optimal planning. Furthermore, prolonged inference latency slows decision-making, increasing the risk of unsafe driving behaviors. To address these challenges, we propose ExpertAD, a novel framework that enhances ADS performance with a Mixture of Experts (MoE) architecture. We introduce a Perception Adapter (PA) to amplify task-critical features, ensuring contextually relevant scene understanding, and a Mixture of Sparse Experts (MoSE) to minimize task interference during prediction, allowing for effective and efficient planning. Our experiments show that ExpertAD reduces average collision rates by up to 20% and inference latency by 25% compared to prior methods. We further evaluate its multi-skill planning capabilities in rare scenarios (e.g., accidents, yielding to emergency vehicles) and demonstrate strong generalization to unseen urban environments. Additionally, we present a case study that illustrates its decision-making process in complex driving scenarios.
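
The routing idea behind a mixture of experts (a Router that scores the experts and activates only the most relevant ones) can be sketched in a few lines of Python. This is a generic top-k gating sketch, not ExpertAD's implementation: the value of `k`, the renormalized softmax weighting, the function names `route_top_k` and `mixture_output`, and the scalar toy experts are all assumptions made for illustration. The experts described here are sparse-attention modules over BEV features, which this example does not model.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=2):
    """Return {expert_index: weight} for the k highest-scoring experts.

    Weights are the softmax probabilities of the selected experts,
    renormalized to sum to 1 (an assumed, commonly used convention).
    """
    if not 1 <= k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def mixture_output(x, experts, router_logits, k=2):
    """Evaluate only the selected experts and blend their outputs."""
    weights = route_top_k(router_logits, k)
    # Experts that were not selected are never called, which is where
    # the compute savings of sparse activation come from.
    return sum(w * experts[i](x) for i, w in weights.items())


# Eight hypothetical toy experts; expert i simply scales its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores; a real router would compute these from features.
logits = [0.1, 2.0, -1.0, 0.3, 2.0, 0.0, -0.5, 0.2]

weights = route_top_k(logits, k=2)
y = mixture_output(3.0, experts, logits, k=2)
print(weights, y)
```

With these hypothetical scores, only experts 1 and 4 are evaluated and the other six are skipped. Skipping unselected experts is the general mechanism by which sparse activation can lower inference cost.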
