Safety-Enhanced Autonomous Driving Using Interpretable Sensor Fusion Transformer
Hao Shao, Letian Wang, RuoBing Chen, Hongsheng Li, Yu Liu
TL;DR
The paper tackles safety in autonomous driving, focusing on rare long-tail events and on the interpretability of driving decisions. It proposes InterFuser, a one-stage interpretable sensor fusion Transformer that fuses multi-modal, multi-view inputs and outputs intermediate representations: the ego trajectory as $L=10$ waypoints, an object density map $M \in \mathbb{R}^{R\times R\times 7}$, and traffic-rule signals. A safety controller uses these intermediate features to constrain actions to safe sets: it forecasts other agents' motion with a tracker, computes maximum safe distances $s_1$ and $s_2$, and solves a linear program for the desired speed. Experiments on CARLA benchmarks show that InterFuser achieves state-of-the-art driving performance and ranks first on the public CARLA Leaderboard.
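To make the speed-selection step concrete, the following is a minimal sketch and not the authors' controller. It assumes that $s_1$ and $s_2$ bound the distance the ego vehicle may cover after one and two time steps, that speed changes linearly within a step, and that acceleration and speed are box-constrained; the summary confirms none of these. The names `desired_speed`, `T`, `a_max`, `v_max`, and `tol` are hypothetical. Under these assumptions the objective involves only the first-step speed and feasibility is monotone in it, so the sketch finds the optimum by bisection rather than calling an LP solver.

```python
def desired_speed(v0, s1, s2, T=0.5, a_max=4.0, v_max=10.0, tol=1e-6):
    """Largest first-step speed v1 that keeps travelled distance within s1, s2.

    Hypothetical model (not confirmed by the summary): speed changes linearly
    within each step of length T, so the distance per step is a trapezoid area.
    Assumes 0 <= v0 <= v_max.
    """
    def feasible(v1):
        # Most cautious second step: brake as hard as the limit allows.
        v2 = max(0.0, v1 - a_max * T)
        d1 = (v0 + v1) * T / 2.0          # distance after step 1
        d2 = d1 + (v1 + v2) * T / 2.0     # distance after step 2
        return d1 <= s1 and d2 <= s2

    lo = max(0.0, v0 - a_max * T)    # speed after maximum braking
    hi = min(v_max, v0 + a_max * T)  # speed after maximum acceleration
    if not feasible(lo):
        return lo   # no safe speed exists: fall back to maximum braking
    if feasible(hi):
        return hi   # safe distances do not bind: accelerate fully
    # Feasibility is monotone in v1, so bisect for the largest feasible value.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo
```

With a clear road (`s1` and `s2` large) the function returns the acceleration-limited speed; as the safe distances shrink, the returned speed drops until only maximum braking remains.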
Abstract
Large-scale deployment of autonomous vehicles has been continually delayed due to safety concerns. On the one hand, comprehensive scene understanding is indispensable; without it, a vehicle is vulnerable to rare but complex traffic situations, such as the sudden emergence of unknown objects. However, reasoning from a global context requires access to sensors of multiple types and adequate fusion of multi-modal sensor signals, which is difficult to achieve. On the other hand, the lack of interpretability in learning models also hampers safety, because the causes of failures cannot be verified. In this paper, we propose a safety-enhanced autonomous driving framework, named Interpretable Sensor Fusion Transformer (InterFuser), to fully process and fuse information from multi-modal multi-view sensors for comprehensive scene understanding and adversarial event detection. In addition, our framework generates intermediate interpretable features, which provide richer semantics and are exploited to better constrain actions to lie within safe sets. We conducted extensive experiments on CARLA benchmarks, where our model outperforms prior methods and ranks first on the public CARLA Leaderboard. Our code will be made available at https://github.com/opendilab/InterFuser
