Autonomous vehicle decision and control through reinforcement learning with traffic flow randomization

Yuan Lin, Antai Xie, Xiao Liu

TL;DR

The paper tackles the sim-to-real transfer problem for reinforcement learning-based autonomous driving by showing that domain randomization of rule-based microscopic traffic flows improves generalization to diverse traffic scenes. It randomizes IDM and SL2015 parameters with Gaussian distributions to create varied driving behaviors during training in SUMO, and compares against non-randomized and LimSim high-fidelity traffic flows in merging and freeway scenarios. Policies trained under domain-randomized traffic achieve high success rates and rewards across multiple traffic-flow types and densities, while those trained without randomization struggle under domain mismatch. High-fidelity traffic flow provides a stronger testing environment but is less effective for training because of longer run times and poorer generalization. The work demonstrates a practical path toward robust sim-to-real transfer for autonomous vehicle decision and control, with future work extending validation to real vehicles.

Abstract

Most current studies on reinforcement learning-based decision-making and control for autonomous vehicles are conducted in simulated environments. Training and testing in these studies are carried out under rule-based microscopic traffic flow, with little consideration of migrating the trained policies to real or near-real environments to test their performance. This may degrade performance when the trained model is tested in more realistic traffic scenes. In this study, we propose a method to randomize the driving style and behavior of surrounding vehicles by randomizing certain parameters of the car-following model and the lane-changing model of rule-based microscopic traffic flow in SUMO. We trained policies with deep reinforcement learning algorithms under the domain-randomized rule-based microscopic traffic flow in freeway and merging scenes, and then tested them separately in rule-based microscopic traffic flow and high-fidelity microscopic traffic flow. Results indicate that the policy trained under domain-randomized traffic flow achieves a significantly better success rate and cumulative reward than the models trained under other microscopic traffic flows.
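
To make the randomization idea concrete, here is a minimal sketch of drawing per-vehicle car-following parameters from Gaussian distributions and feeding them to the standard IDM acceleration formula. It is an illustration under stated assumptions, not the authors' implementation. The means, standard deviations, bounds, the clipping step, and the names `IDMParams`, `PARAM_SPECS`, `sample_idm_params`, and `idm_acceleration` are all hypothetical and do not come from the paper. The lane-changing (SL2015) parameters and the SUMO integration are left out.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class IDMParams:
    v0: float            # desired speed (m/s)
    T: float             # desired time headway (s)
    a_max: float         # maximum acceleration (m/s^2)
    b: float             # comfortable deceleration (m/s^2)
    s0: float            # minimum standstill gap (m)
    delta: float = 4.0   # acceleration exponent of the standard IDM


# Hypothetical (mean, std, lower bound, upper bound) per parameter.
# These numbers are placeholders for illustration, not values from the paper.
PARAM_SPECS = {
    "v0": (30.0, 3.0, 20.0, 40.0),
    "T": (1.5, 0.3, 0.8, 2.5),
    "a_max": (1.5, 0.3, 0.5, 3.0),
    "b": (2.0, 0.4, 1.0, 4.0),
    "s0": (2.0, 0.5, 1.0, 4.0),
}


def sample_idm_params(rng: random.Random) -> IDMParams:
    """Draw one driver profile; each parameter is Gaussian, clipped to its bounds."""
    values = {}
    for name, (mean, std, lo, hi) in PARAM_SPECS.items():
        # Clipping is an assumption made here to keep samples physically plausible.
        values[name] = min(hi, max(lo, rng.gauss(mean, std)))
    return IDMParams(**values)


def idm_acceleration(p: IDMParams, v: float, v_lead: float, gap: float) -> float:
    """Standard IDM acceleration for a follower at speed v behind a leader."""
    dv = v - v_lead  # closing speed (positive when approaching the leader)
    s_star = p.s0 + max(0.0, v * p.T + v * dv / (2.0 * math.sqrt(p.a_max * p.b)))
    return p.a_max * (1.0 - (v / p.v0) ** p.delta - (s_star / gap) ** 2)


# Usage: sample a few surrounding-vehicle profiles and compare their reactions
# to the same situation (follower at 25 m/s, leader at 22 m/s, 30 m gap).
rng = random.Random(0)
for profile in (sample_idm_params(rng) for _ in range(5)):
    accel = idm_acceleration(profile, v=25.0, v_lead=22.0, gap=30.0)
    print(profile, round(accel, 3))
```

Because each surrounding vehicle gets its own sampled profile, the same traffic situation produces a range of accelerations, which is the kind of behavioral diversity the randomized training traffic is meant to provide.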

Paper Structure

This paper contains 23 sections, 21 equations, 4 figures, and 9 tables.

Figures (4)

  • Figure S1: Merging in SUMO.
  • Figure S2: Undiscounted episode reward during training under three traffic flows.
  • Figure S3: The ego vehicle overtakes along the arrow trajectory in the freeway.
  • Figure S4: Undiscounted episode reward during training under three traffic flows.