What Matters to Enhance Traffic Rule Compliance of Imitation Learning for End-to-End Autonomous Driving

Hongkuan Zhou, Wei Cao, Aifen Sui, Zhenshan Bing

TL;DR

This paper proposes P-CSG, a penalty-based imitation learning approach with contrastive-based cross semantics generation sensor fusion, to improve the overall performance of end-to-end autonomous driving, and evaluates its robustness against adversarial attacks.

Abstract

End-to-end autonomous driving, where the entire driving pipeline is replaced with a single neural network, has recently gained research attention because of its simpler structure and faster inference time. Although this approach greatly reduces the complexity of the driving pipeline, it also raises safety issues because the trained policy does not always comply with traffic rules. In this paper, we propose P-CSG, a penalty-based imitation learning approach combined with contrastive-based cross semantics generation sensor fusion, to improve the overall performance of end-to-end autonomous driving. We introduce three penalties (red light, stop sign, and curvature speed) to make the agent more sensitive to traffic rules. The proposed cross semantics generation helps align the information shared across the input modalities. We assess our model on the CARLA Leaderboard Town 05 Long Benchmark and the Longest6 Benchmark, achieving driving score improvements of 8.5% and 2.0%, respectively, over the baselines. Furthermore, we conduct robustness evaluations against adversarial attacks such as FGSM and Dot attacks, which show a substantial increase in robustness compared to the baseline models. More detailed information can be found at https://hk-zh.github.io/p-csg-plus.
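To make the idea of a penalty-based imitation loss concrete, the following is a minimal sketch and not the paper's formulation: it adds a hypothetical red-light term to an L1 waypoint loss, penalizing predicted waypoints that lie beyond a stop line while the light is red. The function name, the ego-frame convention (x forward), the `stop_line_x` input, and the `weight` parameter are all assumptions made for illustration.

```python
import numpy as np

def penalized_waypoint_loss(pred, expert, stop_line_x, red_light, weight=1.0):
    """L1 imitation loss plus a hypothetical red-light penalty (illustration only).

    pred, expert: (T, 2) arrays of waypoints in the ego frame (x forward, metres).
    stop_line_x:  forward distance to the stop line (assumed to be known).
    red_light:    True when the relevant traffic light is red.
    """
    # Standard imitation term: mean absolute error against the expert waypoints.
    imitation = np.abs(pred - expert).mean()
    # Distance by which each predicted waypoint overshoots the stop line.
    overshoot = np.clip(pred[:, 0] - stop_line_x, 0.0, None)
    penalty = overshoot.mean() if red_light else 0.0
    return imitation + weight * penalty

# An expert that itself runs the red light: pure imitation gives zero loss,
# but the penalty term still discourages the violating waypoints.
expert = np.array([[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]])
loss_red = penalized_waypoint_loss(expert, expert, stop_line_x=2.5, red_light=True)
loss_green = penalized_waypoint_loss(expert, expert, stop_line_x=2.5, red_light=False)
```

In this toy case the prediction matches the expert exactly, so any nonzero loss comes from the penalty alone, which is the behavior Figure 1(b) describes at a high level.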

Paper Structure (25 sections, 13 equations, 3 figures, 4 tables)


Figures (3)

  • Figure 1: (a) A common imitation learning approach can learn wrong behavior that violates traffic rules. (b) Our approach penalizes predicted waypoints that violate traffic rules to improve the agent's adherence to them.
  • Figure 2: An Overview of Our Penalty-based Imitation Learning with Cross Semantics Generation.
  • Figure 3: Qualitative Attack Results on P-CSG. (a) Original RGB input, (b) Dot Attack with nine trained dots, (c) FGSM Attack with $\epsilon = 0.01$. The subtle FGSM perturbation in (c) is hard to spot compared to (a), showcasing the method's ability to create changes that are invisible to the human eye but still mislead models.
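
As a rough illustration of the FGSM perturbation mentioned in Figure 3(c), the sketch below applies the standard update $x_{adv} = x + \epsilon \cdot \mathrm{sign}(\nabla_x L)$ with $\epsilon = 0.01$. The driving model is not available here, so a toy linear score with a squared-error loss stands in for the network; the weights `w`, the random image, and the helper names are assumptions for illustration only.

```python
import numpy as np

def fgsm_perturb(image, grad, epsilon=0.01):
    """Return image + epsilon * sign(grad), clipped to the pixel range [0, 1]."""
    return np.clip(image + epsilon * np.sign(grad), 0.0, 1.0)

# Toy stand-in for a network: a linear score sum(w * x) with squared-error loss.
rng = np.random.default_rng(0)
image = rng.uniform(0.1, 0.9, size=(3, 4, 4))  # fake RGB input in [0, 1]
w = rng.normal(size=image.shape)               # hypothetical model weights
target = 0.0

def loss(x):
    return 0.5 * (np.sum(w * x) - target) ** 2

# Analytic gradient of the loss with respect to the input image.
grad = (np.sum(w * image) - target) * w
adversarial = fgsm_perturb(image, grad, epsilon=0.01)

print("max pixel change:", np.abs(adversarial - image).max())
print("loss before / after:", loss(image), loss(adversarial))
```

Each pixel moves by at most $\epsilon$, which is why the perturbed image is hard to distinguish from the original, yet every pixel is pushed in the direction that increases the loss.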