Enhancing Safety for Autonomous Agents in Partly Concealed Urban Traffic Environments Through Representation-Based Shielding

Pierre Haritz, David Wanke, Thomas Liebig

TL;DR

The paper addresses safe navigation through partly concealed urban intersections using reinforcement learning. It couples an ego-centric invariant state representation (IER+) with a post-posed safety shield for Deep Q-Learning, leveraging time-to-occupancy $\mathrm{TTO}$ and time-to-vacancy $\mathrm{TTV}$ features and LTL-based safety constraints. The approach yields notable safety gains and robust generalization to unseen road maps while maintaining competitive travel speed and energy efficiency, demonstrated in a dedicated PyAD-RL simulator. This work provides a scalable, reproducible framework for safer autonomous urban navigation under partial observability and occlusions.

Abstract

Navigating unsignalized intersections in urban environments poses a complex challenge for self-driving vehicles, where issues such as view obstructions, unpredictable pedestrian crossings, and diverse traffic participants demand a great focus on crash prevention. In this paper, we propose a novel state representation for Reinforcement Learning (RL) agents centered around the information perceivable by an autonomous agent, enabling the safe navigation of previously uncharted road maps. Our approach surpasses several baseline models by a significant margin in terms of safety and energy consumption metrics. These improvements are achieved while maintaining a competitive average travel speed. Our findings pave the way for more robust and reliable autonomous navigation strategies, promising safer and more efficient urban traffic environments.

Paper Structure (12 sections, 13 equations, 6 figures, 2 tables)

Figures (6)

  • Figure 1: Illustration of the state representation with $TTO$ and $TTV$ shown as a colored gradient from red ($t\rightarrow 0$) to green ($t\rightarrow t_{max}$). We color the indicator function $\mathbbm{1}_{intersection}$ green (neither beginning nor end of the intersection), yellow (intersection end), or red (intersection start). The priority indicator $\varphi_{priority,other}$ is shown as green (the agent has the right of way) or red (another traffic participant has the right of way).
  • Figure 2: Illustration of our Shielding approach in PyAD-RL. The agent (red) gets supported by a shield that triggers if certain conditions are met.
  • Figure 3: Our IER+-Shielding based RL loop. The selected action is checked against the shield condition and replaced if necessary. Furthermore, each use of the shield is penalized by the reward function to encourage safe control without relying on the shield.
  • Figure 4: Randomly generated urban traffic environments in PyAD-RL. Blue vehicles are simulated vehicles with rule-based IDM behavior and the red vehicle is controlled by the agent policy.
  • Figure 5: Illustration of suspected vehicles at partly concealed intersections in the PyAD-RL environment. Here, $d_{intersection}$ also serves as the necessary distance measure for the IDM model.
  • ...and 1 more figure
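
The loop described in the Figure 3 caption (check the selected action against a shield condition, replace it if necessary, and penalize shield use in the reward) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' implementation: the action encoding, the `make_tto_condition` helper, the 2-second threshold, and the penalty value are all hypothetical and do not come from the paper or from PyAD-RL.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical discrete action encoding (not taken from the paper).
BRAKE, HOLD, ACCELERATE = 0, 1, 2


def make_tto_condition(tto_seconds: float, threshold: float = 2.0) -> Callable[[int], bool]:
    """Build an illustrative shield condition from a time-to-occupancy value.

    Assumption: any action other than braking counts as unsafe when another
    participant could occupy the conflict zone within `threshold` seconds.
    """
    def is_unsafe(action: int) -> bool:
        return action != BRAKE and tto_seconds < threshold
    return is_unsafe


def shielded_step(
    q_values: Sequence[float],
    is_unsafe: Callable[[int], bool],
    fallback_action: int = BRAKE,
    shield_penalty: float = 1.0,
) -> Tuple[int, float]:
    """Return (executed_action, penalty) for one decision step.

    The greedy action is chosen first; the shield only acts afterwards,
    which is what makes it a post-posed shield.
    """
    proposed = max(range(len(q_values)), key=lambda a: q_values[a])
    if is_unsafe(proposed):
        # Shield triggers: override the action and report a penalty that
        # the caller subtracts from the environment reward.
        return fallback_action, shield_penalty
    return proposed, 0.0


# Occluded intersection, short TTO: the greedy "accelerate" is overridden.
action, penalty = shielded_step([0.1, 0.2, 0.9], make_tto_condition(tto_seconds=1.0))

# Clear intersection, long TTO: the greedy action passes through unchanged.
free_action, free_penalty = shielded_step([0.1, 0.2, 0.9], make_tto_condition(tto_seconds=8.0))
```

Subtracting the returned penalty from the reward is what pushes the learned policy toward actions that the shield would not need to override, so the shield acts as a backstop rather than as the primary controller.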