Learning-based Autonomous Oversteer Control and Collision Avoidance

Seokjun Lee, Seung-Hyun Kong

TL;DR

The paper addresses safe autonomous driving under oversteer with obstacle avoidance by introducing QC-SAC, a hybrid learning algorithm that learns from suboptimal demonstrations while rapidly adapting to new conditions. QC-SAC integrates three core ideas: Q-Compared Objective (QCO) for selective use of demonstrations, Q-Network from Demonstration (QNfD) to improve Q estimates with demonstration data, and Selective Demonstration Data Update (SDDU) plus Focused Experience Replay (FER) to accelerate learning from new successes and maintain data relevance. The authors validate their approach on a novel benchmark inspired by real driver training, demonstrating near-optimal policies with a significantly higher success rate than IL, RL, and HL baselines. This work provides a practical end-to-end framework for simultaneous oversteer control and collision avoidance, with potential to improve safety in real-world autonomous driving under challenging road conditions.
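The idea of using demonstrations selectively can be illustrated with a small sketch. This summary does not give the exact form of QCO, so the rule below is an assumption, not the authors' formulation: a demonstrated action contributes to the imitation term only when its estimated Q-value exceeds that of the policy's own action in the same state. The function names `q_compared_mask` and `selective_imitation_loss` are hypothetical, and the Q-values are passed in as plain numbers rather than computed by a critic network.

```python
from typing import List, Sequence


def q_compared_mask(q_demo: Sequence[float], q_policy: Sequence[float]) -> List[bool]:
    """Return True where the demonstrated action is valued above the policy's action.

    Assumption: a strict Q-value comparison decides whether a demonstration is used.
    """
    return [qd > qp for qd, qp in zip(q_demo, q_policy)]


def selective_imitation_loss(
    demo_actions: Sequence[float],
    policy_actions: Sequence[float],
    q_demo: Sequence[float],
    q_policy: Sequence[float],
) -> float:
    """Mean squared imitation error over the demonstrations that pass the comparison.

    Demonstrations judged worse than the current policy are ignored, so
    suboptimal data cannot pull the policy toward lower-valued actions.
    """
    mask = q_compared_mask(q_demo, q_policy)
    errors = [
        (a_demo - a_pi) ** 2
        for a_demo, a_pi, keep in zip(demo_actions, policy_actions, mask)
        if keep
    ]
    # If no demonstration beats the policy, the imitation term vanishes.
    return sum(errors) / len(errors) if errors else 0.0


# Hypothetical one-dimensional steering actions and Q estimates for three states.
demo_actions = [0.5, -0.2, 0.1]
policy_actions = [0.1, -0.2, 0.4]
q_demo = [1.0, 0.3, 0.2]
q_policy = [0.4, 0.5, 0.9]

mask = q_compared_mask(q_demo, q_policy)
loss = selective_imitation_loss(demo_actions, policy_actions, q_demo, q_policy)
```

In this toy batch only the first demonstration is valued above the policy's action, so it alone contributes to the loss. In a full hybrid learner, a term like this would be added to the usual actor objective rather than replace it.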

Abstract

Oversteer, wherein a vehicle's rear tires lose traction and induce unintentional excessive yaw, poses critical safety challenges. Failing to control oversteer often leads to severe traffic accidents. Although recent autonomous driving efforts have attempted to handle oversteer through stabilizing maneuvers, the majority rely on expert-defined trajectories or assume obstacle-free environments, limiting real-world applicability. This paper introduces a novel end-to-end (E2E) autonomous driving approach that tackles oversteer control and collision avoidance simultaneously. Existing E2E techniques, including Imitation Learning (IL), Reinforcement Learning (RL), and Hybrid Learning (HL), generally require near-optimal demonstrations or extensive experience. Yet even skilled human drivers struggle to provide perfect demonstrations under oversteer, and high transition variance hinders accumulating sufficient data. Hence, we present Q-Compared Soft Actor-Critic (QC-SAC), a new HL algorithm that effectively learns from suboptimal demonstration data and adapts rapidly to new conditions. To evaluate QC-SAC, we introduce a benchmark inspired by real-world driver training: a vehicle encounters sudden oversteer on a slippery surface and must avoid randomly placed obstacles ahead. Experimental results show QC-SAC attains near-optimal driving policies, significantly surpassing state-of-the-art IL, RL, and HL baselines. Our method demonstrates the world's first safe autonomous oversteer control with obstacle avoidance.

Paper Structure

This paper contains 17 sections, 13 equations, 7 figures, 1 table, 1 algorithm.

Figures (7)

  • Figure 1: Vehicle oversteer and understeer. (a) Oversteer: the rear tires lose grip, and the vehicle rotates more than intended. (b) Understeer: the front tires lose grip, and the vehicle turns less than intended.
  • Figure 2: Research goals. (a) Ego vehicle in brown must control the oversteer in order not to spin. (b) It should also avoid the obstacle (i.e., front vehicle) in blue.
  • Figure 3: Concept diagram. Impact of demonstration data quality on action-policy training with existing HL techniques. (a) IL helps training when optimal demonstration data is given. (b) IL degrades training when immature demonstration data is given.
  • Figure 4: Real-world driver training process and oversteer control and collision avoidance benchmark in a virtual environment. (a) Kick plate inducing oversteer and (b) a collision avoidance scenario for driver training at BMW Driving Center, South Korea. (c) Oversteer control and collision avoidance benchmark developed in the IPG CarMaker simulator.
  • Figure 5: Experimental setup. (a) Definition of $d$ and $\psi$ in the vehicle state. (b) Representation of the surrounding state. (c) An example of the reward function ($\bar{x}=1$).
  • ...and 2 more figures