Analysis of a Modular Autonomous Driving Architecture: The Top Submission to CARLA Leaderboard 2.0 Challenge
Weize Zhang, Mohammed Elmahgiubi, Kasra Rezaee, Behzad Khamidehi, Hamidreza Mirkhani, Fazel Arasteh, Chunlin Li, Muhammad Ahsan Kaleem, Eduardo R. Corral-Soto, Dhruv Sharma, Tongtong Cao
TL;DR
This work presents Kyber-E2E's modular autonomous driving architecture for CARLA Leaderboard 2.0, combining sensing, localization, perception, tracking/prediction, and planning/control to win the map track. It leverages language-assisted perception and IRL-based tuning of the motion planner, trained on diverse open datasets (including inD) and supplemented by CARLA Town data, to achieve robust performance across complex traffic scenarios. The study provides a detailed ablation of module contributions, showing where resource allocation yields the most benefit and highlighting dependence on accurate perception for planning in high-stakes maneuvers. The results demonstrate the practicality of modular designs for rapid development and deployment, while noting current limitations in long-range perception and the potential gains from moving toward end-to-end approaches in future work.
Abstract
In this paper, we present the architecture of the Kyber-E2E submission to the map track of the CARLA Leaderboard 2.0 Autonomous Driving (AD) Challenge 2023, which achieved first place. Our solution uses a modular architecture consisting of five main components: sensing, localization, perception, tracking/prediction, and planning/control. It leverages state-of-the-art language-assisted perception models to help the planner perform more reliably in highly challenging traffic scenarios. We use open-source driving datasets in conjunction with Inverse Reinforcement Learning (IRL) to enhance the performance of our motion planner. We provide insight into our design choices and the trade-offs we made to arrive at this solution. We also examine the impact of each component on the overall performance of our solution, with the intent of providing a guideline for where allocating resources can have the greatest impact.
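To make the five-component structure concrete, the following is a minimal illustrative sketch of how such a modular pipeline could be chained, with each stage reading a shared frame dictionary and adding its own outputs. It is not the Kyber-E2E implementation: every function name, dictionary key, and threshold here (for example `obstacle_distances_m` and the 10 m braking distance) is a hypothetical placeholder, and the stage bodies are toy stubs standing in for the real sensing, localization, perception, tracking/prediction, and planning/control modules.

```python
from typing import Any, Callable, Dict, List, Tuple

Frame = Dict[str, Any]
Stage = Tuple[str, Callable[[Frame], Frame]]


def sensing(frame: Frame) -> Frame:
    # Toy stub: pass raw inputs through with defaults.
    return {
        "gnss": frame.get("gnss", (0.0, 0.0)),
        "obstacle_distances_m": frame.get("obstacle_distances_m", []),
    }


def localization(frame: Frame) -> Frame:
    # Toy stub: treat the GNSS reading as the pose with zero heading.
    x, y = frame["gnss"]
    return {"pose": (x, y, 0.0)}


def perception(frame: Frame) -> Frame:
    # Toy stub: one detection per measured obstacle distance.
    return {"detections": [{"distance_m": d} for d in frame["obstacle_distances_m"]]}


def tracking_prediction(frame: Frame) -> Frame:
    # Toy stub: assume every object stays where it is (hypothetical).
    return {"tracks": [{"predicted_distance_m": d["distance_m"]} for d in frame["detections"]]}


def planning_control(frame: Frame) -> Frame:
    # Toy stub: brake if any predicted obstacle is closer than 10 m (hypothetical threshold).
    nearest = min((t["predicted_distance_m"] for t in frame["tracks"]), default=float("inf"))
    if nearest < 10.0:
        return {"control": {"throttle": 0.0, "brake": 1.0}}
    return {"control": {"throttle": 0.5, "brake": 0.0}}


STAGES: List[Stage] = [
    ("sensing", sensing),
    ("localization", localization),
    ("perception", perception),
    ("tracking/prediction", tracking_prediction),
    ("planning/control", planning_control),
]


def run_once(frame: Frame) -> Frame:
    # Run the stages in order; each stage sees everything produced so far.
    for _name, stage in STAGES:
        frame = {**frame, **stage(frame)}
    return frame
```

Because each stage only depends on keys written by earlier stages, a single stage can be replaced or ablated without touching the others, which is the property that makes a per-component impact study straightforward in a modular design.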
