Act Better by Timing: A timing-Aware Reinforcement Learning for Autonomous Driving

Guanzhou Li; Jianping Wu; Yujing He

TL;DR

This work addresses the challenge of safe, efficient decision-making for autonomous vehicles in highly interactive, uncertain traffic. It introduces timing-aware reinforcement learning, pairing an actor with a timing-taker and a conservative base planner, and employs a timing imagination to preview action execution across multiple time scales, producing a dynamic safety factor. The method yields superior performance over strong Safe RL baselines in unsignalized intersections and roundabouts, with higher success rates and shorter crossing times, while mitigating the planning-based action freezing problem. The approach advances practical autonomous driving by leveraging environment dynamics and timing to balance long-horizon rewards with immediate safety, and it offers potential for integration with more powerful planners or end-to-end models in broader robotics tasks.

Abstract

Autonomous vehicles inevitably encounter a vast array of scenarios in real-world environments. Addressing long-tail scenarios, particularly those involving intensive interactions with numerous traffic participants, remains one of the most significant challenges in achieving high-level autonomous driving. Reinforcement learning (RL) offers a promising solution for such scenarios and allows autonomous vehicles to continuously self-evolve during interactions. However, traditional RL often requires trial and error from scratch in new scenarios, resulting in inefficient exploration of unknown states. Integrating RL with planning-based methods can significantly accelerate the learning process. Additionally, conventional RL methods lack robust safety mechanisms, making agents prone to collisions in dynamic environments in pursuit of short-term rewards. Many existing safe RL methods depend on environment modeling to identify reliable safety boundaries for constraining agent behavior. However, explicit environmental models can fail to capture the complexity of dynamic environments comprehensively. Inspired by the observation that human drivers rarely take risks in uncertain situations, this study introduces the concept of action timing and proposes a timing-aware RL method. In this approach, a "timing imagination" process previews the execution results of the agent's strategies at different time scales. The optimal execution timing is then projected to each decision moment, generating a dynamic safety factor to constrain actions. A planning-based method serves as a conservative baseline strategy in uncertain states. In two representative interaction scenarios, an unsignalized intersection and a roundabout, the proposed model outperforms the benchmark models in driving safety.
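
The abstract does not give formulas for the timing imagination or the safety factor, so the following is only a toy illustration of the general idea, not the authors' method. It assumes a hypothetical `imagined_return(delay)` function that scores executing the actor's action after a given delay, maps the best delay to a factor in (0, 1] with an invented rule `1 / (1 + delay)`, and uses that factor to blend a scalar actor action with a conservative base-planner action. All names and the blending rule are assumptions made for illustration.

```python
from typing import Callable, Sequence


def safety_factor(imagined_return: Callable[[int], float],
                  delays: Sequence[int]) -> float:
    """Map the best imagined execution delay to a factor in (0, 1].

    Assumption: acting immediately (delay 0) yields factor 1.0, and longer
    preferred delays shrink the factor. This mapping is illustrative only.
    """
    # Pick the candidate delay whose imagined outcome scores highest.
    best_delay = max(delays, key=imagined_return)
    return 1.0 / (1.0 + best_delay)


def constrained_action(actor_action: float,
                       base_action: float,
                       factor: float) -> float:
    """Blend the actor's action with the conservative base planner's action.

    Assumption: a simple convex combination; factor 1.0 trusts the actor
    fully, factor near 0.0 falls back to the base planner.
    """
    return factor * actor_action + (1.0 - factor) * base_action


# Hypothetical scorer: waiting two steps is imagined to be best.
def example_return(delay: int) -> float:
    return -float((delay - 2) ** 2)


factor = safety_factor(example_return, delays=(0, 1, 2, 3))
action = constrained_action(actor_action=1.0, base_action=0.0, factor=factor)
```

In this toy setup, the more the imagined outcomes favor waiting, the more the executed action leans toward the conservative planner, which mirrors the abstract's description of a planning-based baseline taking over in uncertain states.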
