Risk-Aware Reinforcement Learning for Autonomous Driving: Improving Safety When Driving through Intersection

Bo Leng, Ran Yu, Wei Han, Lu Xiong, Zhuoren Li, Hailong Huang

TL;DR

This work tackles safety-critical autonomous driving through intersections with a risk-aware reinforcement learning framework. The framework couples safe critics with a Lagrangian relaxation-based projection of actions into a feasible safe region, and adds a Multi-hop and MLP-mixed Attention Mechanism (MMAM) to the actor-critic network to handle dynamic traffic. The method comprises approximate safe action generation, safety iterative correction, and a permutation-robust attention-based network that improves scene understanding and decision timing. Empirical results across left-turn, go-straight, and right-turn tasks show reduced collision rates and improved crossing efficiency versus baseline algorithms, with ablations confirming the value of risk-awareness and MMAM. The approach offers a practical pathway to safer, more reliable autonomous intersection navigation, with potential extensions to mixed-traffic environments and offline dataset deployment.

Abstract

Applying reinforcement learning to autonomous driving has garnered widespread attention. However, classical reinforcement learning methods optimize policies by maximizing expected reward without sufficient safety considerations, which often puts agents in hazardous situations. This paper proposes a risk-aware reinforcement learning approach for autonomous driving that improves safety when crossing intersections. Safe critics are constructed to evaluate driving risk and work in conjunction with the reward critic to update the actor. Based on this, a Lagrangian relaxation method and cyclic gradient iteration are combined to project actions into a feasible safe region. Furthermore, a Multi-hop and Multi-Layer Perceptron (MLP) mixed Attention Mechanism (MMAM) is incorporated into the actor-critic network, enabling the policy to adapt to dynamic traffic and overcome permutation sensitivity. This allows the policy to focus more effectively on potential risks in its surroundings while better identifying passing opportunities. Simulation tests are conducted on different tasks at unsignalized intersections. The results show that the proposed approach reduces collision rates and improves crossing efficiency compared with baseline algorithms. Additionally, ablation experiments demonstrate the benefits of incorporating risk-awareness and MMAM into RL.
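To make the idea of projecting an action into a feasible safe region more concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. The safety critic `risk`, its threshold, the multiplier `lam`, and the step size `eta` are all hypothetical stand-ins; the update simply takes normalized gradient steps on a penalty `lam * (risk(a) - threshold)` until the risk constraint holds.

```python
def risk(a):
    """Hypothetical safety critic: risk grows with acceleration a in [-1, 1]."""
    return (a + 1.0) ** 2 / 4.0


def risk_grad(a):
    """Analytic derivative of the toy risk with respect to the action."""
    return (a + 1.0) / 2.0


def correct_action(a, threshold=0.25, lam=1.0, eta=0.05, max_iters=100):
    """Iteratively push an action toward the region where risk(a) <= threshold.

    Assumption: the penalty is lam * (risk(a) - threshold) and each step is
    normalized by the gradient magnitude, so every step has length eta.
    """
    for k in range(max_iters):
        if risk(a) <= threshold:
            return a, k  # already feasible: stop correcting
        grad = lam * risk_grad(a)      # gradient of the penalty term
        norm = abs(grad) or 1.0        # avoid division by zero
        a = a - eta * grad / norm      # normalized gradient step
        a = max(-1.0, min(1.0, a))     # keep the action inside its bounds
    return a, max_iters


# A risky proposal (hard acceleration) is corrected to a feasible one.
safe_a, steps = correct_action(0.8)
print(safe_a, steps, risk(safe_a))
```

In this toy setting the feasible region is `a <= 0`, so the correction reduces the proposed acceleration in small steps and stops as soon as the risk estimate meets the threshold. An action that is already safe is returned unchanged.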

Paper Structure

This paper contains 30 sections, 23 equations, 10 figures, 3 tables, 2 algorithms.

Figures (10)

  • Figure 1: Diagram of feasible region $\mathcal{S}_{FR}$ and unfeasible region $\mathcal{S}_{UR}$.
  • Figure 2: Schematic of proposed framework. MEA and MSA stand for multi-head ego-attention and multi-head self-attention, respectively. $f(a_k)=\frac{\eta}{\mathcal{N}_k}\nabla_{a_k}\mathcal{L}_{\mathrm{soft}}(a_k)$.
  • Figure 3: Multi-hop and MLP-mixed Attention Mechanism (MMAM).
  • Figure 4: Driving tasks and main conflicts at an unsignalized intersection. (a) Left-turn (LT) task: the ego vehicle (EV) primarily encounters conflicts with oncoming traffic and some crossing traffic. (b) Go-straight (GS) task with mixed traffic flow. (c) Right-turn (RT) task with crossing traffic: the EV needs to perform a right merge.
  • Figure 5: Design of the scenario. $d_x^i, d_y^i$ are the relative distances from the EV to the target point along the x-axis and y-axis, where $i = 1, 2$ indexes reference line 1 and reference line 2.
  • ...and 5 more figures
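
The abstract's claim that attention helps overcome permutation sensitivity can be illustrated with a small sketch. This is a simplified, hypothetical single-head ego-attention with no learned projections, not the MMAM shown in Figures 2 and 3: the ego feature acts as the only query, and the output is a softmax-weighted sum over surrounding-vehicle features, so reordering the vehicles leaves the result unchanged.

```python
import math


def ego_attention(ego, others):
    """Single-head dot-product attention with the ego feature as the query.

    Assumption: keys and values are the raw vehicle features (no learned
    weight matrices), which is enough to show the order-independence.
    """
    dim = len(ego)
    # Scaled dot-product score between the ego query and each vehicle.
    scores = [sum(e * o for e, o in zip(ego, v)) / math.sqrt(dim) for v in others]
    # Numerically stable softmax over the vehicles.
    peak = max(scores)
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    # Weighted sum of vehicle features, one output value per feature dimension.
    return [sum(w * v[i] for w, v in zip(weights, others)) / total for i in range(dim)]


ego = [1.0, 0.0]
vehicles = [[0.5, 1.0], [2.0, -1.0], [-1.0, 0.3]]

out_a = ego_attention(ego, vehicles)
out_b = ego_attention(ego, list(reversed(vehicles)))

# The two outputs agree up to floating-point rounding.
print(all(math.isclose(x, y, rel_tol=1e-9, abs_tol=1e-12) for x, y in zip(out_a, out_b)))
```

By contrast, a network that concatenates vehicle features into one fixed-order vector sees a different input whenever the same vehicles are listed in a different order, which is the sensitivity the attention-based design is meant to avoid.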