A Safe and Efficient Self-evolving Algorithm for Decision-making and Control of Autonomous Driving Systems

Shuo Yang, Liwen Wang, Yanjun Huang, Hong Chen

TL;DR

The paper tackles safety and learning-efficiency challenges in RL-based autonomous driving by introducing a hybrid Mechanism-Experience-Learning framework. It combines a safety-constrained mechanism, a driving-tendency network to prune the search space, and a soft-actor-critic policy with an MPC-style optimizer to produce safe, efficient driving actions. Key contributions include: (1) a driving-tendency concept guiding policy evolution, (2) an IDM-based traffic-interaction model for safe interaction with surrounding vehicles, and (3) a driving-tendency–driven, MPC-inspired optimization that achieves collision-free training with rapid convergence in complex scenarios. The approach demonstrates zero-collision training and superior performance relative to MPC and RL baselines in dynamic traffic scenarios, suggesting strong practical potential for safe online self-evolution in autonomous driving.
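
The IDM mentioned above is the Intelligent Driver Model, a standard car-following model. As a rough illustration of what such a traffic-interaction model computes, here is a minimal sketch of the textbook IDM acceleration law. The function name and all parameter values (desired speed, time headway, minimum gap, and so on) are illustrative assumptions, not values taken from the paper, and the paper's own interaction model may differ in detail.

```python
import math


def idm_acceleration(v, gap, dv, v0=30.0, T=1.5, s0=2.0,
                     a_max=1.0, b=2.0, delta=4.0):
    """Textbook IDM acceleration (m/s^2) of a following vehicle.

    v     : follower speed (m/s)
    gap   : bumper-to-bumper distance to the leader (m), must be > 0
    dv    : approach rate, follower speed minus leader speed (m/s)
    v0, T, s0, a_max, b, delta : assumed (hypothetical) model parameters:
        desired speed, time headway, minimum gap, maximum acceleration,
        comfortable deceleration, and acceleration exponent.
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    # Desired dynamic gap; clamping the dynamic part at zero is a common
    # variant that avoids a negative desired gap when the leader pulls away.
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    # Free-road term (v / v0)^delta and interaction term (s_star / gap)^2.
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


if __name__ == "__main__":
    # Follower at 20 m/s closing on a slower leader 30 m ahead.
    print(idm_acceleration(v=20.0, gap=30.0, dv=5.0))
```

On an empty road the interaction term vanishes and the vehicle accelerates toward `v0`. When closing quickly on a nearby leader, the interaction term dominates and the model brakes.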

Abstract

Autonomous vehicles with a self-evolving ability are expected to cope with unknown scenarios in the real-world environment. By taking advantage of a trial-and-error mechanism, reinforcement learning can self-evolve by learning the optimal policy, and it is particularly well suited to decision-making problems. However, reinforcement learning suffers from safety issues and low learning efficiency, especially in continuous action spaces. This paper therefore addresses these problems by proposing a hybrid Mechanism-Experience-Learning augmented approach. Specifically, to realize efficient self-evolution, a driving tendency, defined by analogy with human driving experience, is proposed to reduce the search space of the autonomous driving problem, while a constrained optimization problem based on a mechanistic model is designed to ensure safety during the self-evolving process. Experimental results show that the proposed method is capable of generating safe and reasonable actions in various complex scenarios, improving the performance of the autonomous driving system. Compared to conventional reinforcement learning, the safety and efficiency of the proposed algorithm are greatly improved. The training process is collision-free, and the training time is equivalent to less than 10 minutes in the real world.
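
To make the two ideas in the abstract concrete, the following toy sketch shows how a driving tendency could narrow the action search range and how a model-based constraint could then reject unsafe actions. It is not the paper's algorithm. The one-dimensional car-following setup, the constant-speed leader, the grid search, the cost function, and every name and parameter value are assumptions made for illustration only.

```python
def rollout_is_safe(v, gap, v_lead, a, horizon, dt, min_gap):
    """Predict the gap under constant ego acceleration `a` and a
    constant-speed leader; return False if it ever drops below min_gap."""
    steps = int(round(horizon / dt))
    for _ in range(steps):
        v = max(0.0, v + a * dt)        # ego speed cannot go negative
        gap += (v_lead - v) * dt        # gap shrinks when ego is faster
        if gap < min_gap:
            return False
    return True


def select_safe_acceleration(v, gap, v_lead, tendency, v_des=25.0,
                             horizon=3.0, dt=0.1, min_gap=5.0,
                             n_candidates=21):
    """Pick an acceleration inside the tendency interval (lo, hi).

    `tendency` stands in for the output of a driving-tendency network:
    it restricts the search to a small interval instead of the full
    action space. Candidates that violate the predicted-gap constraint
    are discarded; among the rest, the one whose terminal speed is
    closest to the assumed desired speed `v_des` is returned.
    Returns None when no candidate is safe.
    """
    lo, hi = tendency
    best_a, best_cost = None, float("inf")
    for i in range(n_candidates):
        a = lo + (hi - lo) * i / (n_candidates - 1)
        if not rollout_is_safe(v, gap, v_lead, a, horizon, dt, min_gap):
            continue                     # safety constraint violated
        v_end = max(0.0, v + a * horizon)
        cost = (v_end - v_des) ** 2
        if cost < best_cost:
            best_a, best_cost = a, cost
    return best_a


if __name__ == "__main__":
    print(select_safe_acceleration(v=20.0, gap=100.0, v_lead=20.0,
                                   tendency=(-1.0, 1.0)))
```

The sketch separates the two roles: the tendency interval only decides where to search, while the rollout check alone decides what is admissible. A real system would need a fallback, such as emergency braking, for the case where no candidate is safe.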

Paper Structure

This paper contains 23 sections, 31 equations, 11 figures, 4 tables, 1 algorithm.

Figures (11)

  • Figure 1: Overall architecture of safe and efficient self-evolving algorithm for decision making and control.
  • Figure 2: Operating logic of driving tendency.
  • Figure 3: Schematic diagram of driving tendency in different scenarios.
  • Figure 4: Traffic vehicles with potential impact on the ego vehicle.
  • Figure 5: Diagram of driving tendencies and optimization trajectory.
  • ...and 6 more figures