Partial End-to-end Reinforcement Learning for Robustness Against Modelling Error in Autonomous Racing

Andrew Murdoch; Johannes Cornelius Schoeman; Hendrik Willem Jordaan

TL;DR

This work tackles the simulation-to-reality gap in autonomous racing caused by modelling errors in vehicle dynamics. It introduces a partial end-to-end reinforcement learning framework in which an RL planner outputs a trajectory (path and velocity) in the Frenet frame. The trajectory is then tracked by a pure pursuit controller and a velocity controller, and the Frenet-frame formulation lets the planner exploit track geometry for safety. Compared with fully end-to-end baselines, the approach shows fewer crashes and higher success rates, especially on complex tracks, under a range of model mismatches including friction, tyre stiffness, and mass variations. The key contributions are explicit Frenet-frame path generation, integration with classical controllers, and an evaluation under model mismatch that demonstrates improved safety and training efficiency for real-world deployment in autonomous racing settings.

Abstract

In this paper, we address the problem of improving the performance of reinforcement learning (RL) solutions for autonomous racing cars in the presence of practical vehicle modelling errors (commonly known as model mismatches). We propose a partial end-to-end algorithm that decouples the planning and control tasks. Within this framework, an RL agent generates a trajectory comprising a path and a velocity, which are tracked by a pure pursuit steering controller and a proportional velocity controller, respectively. In contrast, many current learning-based (i.e., reinforcement and imitation learning) algorithms utilise an end-to-end approach whereby a deep neural network maps directly from sensor data to control commands. By leveraging the robustness of a classical controller, our partial end-to-end driving algorithm is more robust to model mismatches than standard end-to-end algorithms.
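To make the tracking stage concrete, the following is a minimal sketch of the two classical controllers named in the abstract, assuming a kinematic bicycle model. The function names, the argument layout, the acceleration limit `a_max`, and the way the look-ahead point is supplied are illustrative assumptions; the paper's own controller details are not given in this section.

```python
import math


def pure_pursuit_steering(pose, target, wheelbase):
    """Steering angle (rad) that steers the vehicle along an arc through `target`.

    pose   : (x, y, heading) of the vehicle in the world frame (assumed layout)
    target : (x, y) look-ahead point taken from the planned path
    """
    x, y, heading = pose
    dx, dy = target[0] - x, target[1] - y
    lookahead = math.hypot(dx, dy)
    if lookahead == 0.0:
        return 0.0  # already at the target, no steering correction
    # Angle between the vehicle heading and the line to the look-ahead point.
    alpha = math.atan2(dy, dx) - heading
    # Standard pure pursuit law: delta = atan(2 * L * sin(alpha) / l_d).
    return math.atan2(2.0 * wheelbase * math.sin(alpha), lookahead)


def proportional_velocity_control(v_ref, v, gain, a_max):
    """Acceleration command proportional to the velocity error, clipped to +/- a_max."""
    accel = gain * (v_ref - v)
    return max(-a_max, min(a_max, accel))


# Hypothetical usage: a target ahead and to the left yields a left (positive) steer.
steer = pure_pursuit_steering((0.0, 0.0, 0.0), (1.0, 1.0), wheelbase=1.0)
accel = proportional_velocity_control(v_ref=5.0, v=3.0, gain=2.0, a_max=3.0)
```

In this arrangement the RL planner would only supply the path and the reference velocity, while the steering and acceleration commands that reach the vehicle always come from the fixed control laws above.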

Paper Structure (18 sections, 5 equations, 10 figures, 4 tables)

Introduction
Related Work
Summary of contributions
Structure of paper
End-to-end Algorithm
Partial End-to-end Algorithm
Path Generation
Controllers
Reinforcement learning applied to train autonomous racing algorithms
Simulation environment
Actor and Critic Networks
Reward function
Experiments and Results
Training performance
Racing without model mismatch
...and 3 more sections

Figures (10)

Figure 1: The common architectures utilised by autonomous driving algorithms. (a) The classic framework decouples perception, planning and control. (b) End-to-end approaches utilise a DNN to perform the entire driving task, and (c) approaches utilising the partial end-to-end framework use a DNN within the structure of the classic framework.
Figure 2: The end-to-end algorithm architecture consists of an RL agent which outputs control commands, as well as a velocity constraint.
Figure 3: The partial end-to-end racing algorithm, which comprises an RL planner agent, velocity and steering controllers, as well as a velocity constraint.
Figure 4: An illustration of the process of generating the polynomial path in the Frenet frame. (a) The vehicle coordinates are converted into the Frenet frame, then (b) a path is constructed within the Frenet frame, after which (c) the path is converted into Cartesian coordinates.
Figure 5: Percentage failed laps and average lap time of 10 partial and 10 fully end-to-end agents learning to race on the Barcelona-Catalunya track (left), as well as Monaco (right).
...and 5 more figures
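
The three steps in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the centreline is taken to be a polyline of distinct points, the polynomial is assumed to be a cubic with zero slope at both ends, and all function names are hypothetical.

```python
import math


def cubic_lateral_offset(s, s0, n0, s1, n1):
    """Lateral offset n(s) of an assumed cubic joining (s0, n0) to (s1, n1)
    with zero slope at both ends."""
    u = (s - s0) / (s1 - s0)
    return n0 + (n1 - n0) * (3.0 * u**2 - 2.0 * u**3)


def frenet_to_cartesian(s, n, centerline):
    """Convert arc length s and lateral offset n (positive to the left) into (x, y).

    centerline: list of at least two distinct (x, y) points; s is measured
    along the polyline from its first point.
    """
    travelled = 0.0
    last = len(centerline) - 2
    for i in range(len(centerline) - 1):
        (x0, y0), (x1, y1) = centerline[i], centerline[i + 1]
        seg = math.hypot(x1 - x0, y1 - y0)
        # Use this segment if s falls on it; extrapolate on the final segment.
        if s <= travelled + seg or i == last:
            tx, ty = (x1 - x0) / seg, (y1 - y0) / seg  # unit tangent
            d = s - travelled
            # The left-pointing unit normal of the tangent (tx, ty) is (-ty, tx).
            return (x0 + d * tx - n * ty, y0 + d * ty + n * tx)
        travelled += seg


def generate_path(s0, n0, s1, n1, centerline, samples=5):
    """Build the path in the Frenet frame, then map each sample to Cartesian."""
    path = []
    for k in range(samples):
        s = s0 + (s1 - s0) * k / (samples - 1)
        path.append(frenet_to_cartesian(s, cubic_lateral_offset(s, s0, n0, s1, n1), centerline))
    return path


# Hypothetical usage: move from 0.5 m left of the centreline to 0.5 m right of it.
path = generate_path(0.0, 0.5, 10.0, -0.5, [(0.0, 0.0), (10.0, 0.0)])
```

Because the lateral offset is expressed relative to the centreline, a planner can keep every path inside the track by bounding `n`, whatever the track's shape in Cartesian coordinates.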
