SECRM-2D: RL-Based Efficient and Comfortable Route-Following Autonomous Driving with Analytic Safety Guarantees

Tianyu Shi, Ilia Smirnov, Omar ElSamadisy, Baher Abdulhai

TL;DR

This paper proposes SECRM‐2D (the safe, efficient and comfortable RL‐based driving model with lane‐changing), an RL autonomous driving controller that balances optimization of efficiency and comfort and follows a fixed route, while being subject to hard analytic safety constraints.

Abstract

Over the last decade, there has been increasing interest in autonomous driving systems. Reinforcement Learning (RL) shows great promise for training autonomous driving controllers, being able to directly optimize a combination of criteria such as efficiency, comfort, and stability. However, RL-based controllers typically offer no safety guarantees, making their readiness for real deployment questionable. In this paper, we propose SECRM-2D (the Safe, Efficient and Comfortable RL-based driving Model with Lane-Changing), an RL autonomous driving controller (both longitudinal and lateral) that balances optimization of efficiency and comfort and follows a fixed route, while being subject to hard analytic safety constraints. These safety constraints are derived from the criterion that the follower vehicle must have sufficient headway to be able to avoid a crash if the leader vehicle brakes suddenly. We evaluate SECRM-2D against several learning and non-learning baselines in simulated test scenarios, including freeway driving, exiting, merging, and emergency braking. Our results confirm that representative previously published RL AV controllers may crash in both training and testing, even if they are optimizing a safety objective. By contrast, our controller SECRM-2D is successful in avoiding crashes during both training and testing, improves over the baselines in measures of efficiency and comfort, and is more faithful in following the prescribed route. In addition, we achieve a good theoretical understanding of the longitudinal steady-state of a collection of SECRM-2D vehicles.
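
The headway criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's derivation: it assumes a simple constant-deceleration braking model with a follower reaction time, and every name and parameter below (`safe_follower_speed`, `clip_to_safe_speed`, `reaction_time`, and so on) is hypothetical. Under those assumptions, the follower is safe if its stopping distance does not exceed the current gap plus the leader's braking distance, which gives a closed-form upper bound on the follower's speed that an RL action can be clipped against.

```python
import math


def safe_follower_speed(gap, leader_speed, leader_decel, follower_decel, reaction_time):
    """Largest follower speed v (m/s) that satisfies, under the assumed model,

        v * reaction_time + v**2 / (2 * follower_decel)
            <= gap + leader_speed**2 / (2 * leader_decel)

    Decelerations are positive magnitudes in m/s^2; gap is in metres.
    """
    # Distance the follower may use before reaching the leader's stopping point.
    budget = gap + leader_speed ** 2 / (2.0 * leader_decel)
    if budget <= 0.0:
        return 0.0
    b, tau = follower_decel, reaction_time
    # Positive root of v^2 + 2*b*tau*v - 2*b*budget = 0.
    return -b * tau + math.sqrt((b * tau) ** 2 + 2.0 * b * budget)


def clip_to_safe_speed(proposed_speed, gap, leader_speed,
                       leader_decel=4.5, follower_decel=4.5, reaction_time=1.0):
    """Clip a speed proposed by a learned policy to the analytic bound.

    The default parameter values are illustrative assumptions only.
    """
    v_max = safe_follower_speed(gap, leader_speed, leader_decel,
                                follower_decel, reaction_time)
    return max(0.0, min(proposed_speed, v_max))


if __name__ == "__main__":
    # Leader at 20 m/s, 30 m ahead; the policy proposes 30 m/s.
    print(clip_to_safe_speed(30.0, gap=30.0, leader_speed=20.0))
```

The point of the sketch is the design choice: the bound is applied to the action itself rather than added to the reward, so the constraint holds during training as well as testing, regardless of what the policy has learned.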

Paper Structure (30 sections, 35 equations, 7 figures, 6 tables)


Figures (7)

  • Figure 1: The mandatory lane-change penalty $R_{\mathrm{route}}$ accrued by a vehicle in a freeway off-ramp scenario, based on the vehicle's current position and route. For clarity of the illustration, the values are capped below at $-0.5$. Left: The vehicle route exits the freeway at the off-ramp. Right: The vehicle route remains on the freeway.
  • Figure 2: Geometry of the loop network. The effect of curvature on vehicle speed has been disabled.
  • Figure 3: Geometry of the interchange of Queen Elizabeth Way (QEW) and Erin Mills Parkway / Southdown Road (SUMO).
  • Figure 4: Geometry of the interchange of Queen Elizabeth Way (QEW) and Erin Mills Parkway / Southdown Road (Google Maps).
  • Figure 5: Convergence of the Efficiency (top) and Comfort (bottom) rewards during training. In the first few epochs, the agent adopts a conservative policy, often remaining stationary, which results in low efficiency scores. Over time, however, it learns to balance safety and efficiency while achieving high levels of driving comfort.
  • ...and 2 more figures

Theorems & Definitions (3)

  • proof
  • proof
  • Remark