Learning to Drive in a Day
Alex Kendall, Jeffrey Hawke, David Janz, Przemyslaw Mazur, Daniele Reda, John-Mark Allen, Vinh-Dieu Lam, Alex Bewley, Amar Shah
TL;DR
The paper presents what its authors describe as the first application of deep reinforcement learning to autonomous driving. It frames lane following as a Markov decision process (MDP) and solves it by training on the vehicle itself from a single monocular camera image. The method uses Deep Deterministic Policy Gradients (DDPG) with a two-dimensional continuous action space and a simple reward: the distance travelled before the safety driver intervenes. It is validated both in a 3D driving simulator and on a real Renault Twizy. The authors also explore a task-based, on-vehicle training architecture and a state representation learned with a variational autoencoder (VAE), which improves data efficiency in the real-world experiments. The work shows that RL can learn lane following with minimal supervision and without pre-defined maps, and it identifies reward design, representation learning, and domain transfer as key directions for scaling to broader autonomous driving tasks.
Abstract
We demonstrate the first application of deep reinforcement learning to autonomous driving. From randomly initialised parameters, our model is able to learn a policy for lane following in a handful of training episodes using a single monocular image as input. We provide a general and easy to obtain reward: the distance travelled by the vehicle without the safety driver taking control. We use a continuous, model-free deep reinforcement learning algorithm, with all exploration and optimisation performed on-vehicle. This demonstrates a new framework for autonomous driving which moves away from reliance on defined logical rules, mapping, and direct supervision. We discuss the challenges and opportunities to scale this approach to a broader range of autonomous driving tasks.
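To make the reward concrete, here is a minimal Python sketch of how "distance travelled without the safety driver taking control" could be computed from a logged episode. It is not the authors' implementation: the `Step` record, its field names, and the fixed-timestep integration of speed are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Step:
    """One logged control step (hypothetical record, not from the paper)."""
    speed_mps: float    # forward speed during this step, in metres per second
    dt_s: float         # duration of the step, in seconds
    intervention: bool  # True if the safety driver took control at this step


def distance_before_intervention(steps: Iterable[Step]) -> float:
    """Sum the distance covered up to, but not including, the first intervention."""
    total_m = 0.0
    for step in steps:
        if step.intervention:
            break  # the episode ends as soon as the safety driver takes over
        total_m += step.speed_mps * step.dt_s
    return total_m


# Example episode: two autonomous steps, then the safety driver intervenes.
episode = [
    Step(speed_mps=2.0, dt_s=0.5, intervention=False),
    Step(speed_mps=2.0, dt_s=0.5, intervention=False),
    Step(speed_mps=2.0, dt_s=0.5, intervention=True),
    Step(speed_mps=2.0, dt_s=0.5, intervention=False),
]
episode_distance_m = distance_before_intervention(episode)
```

In this sketch everything logged after the first intervention is ignored, which matches the idea that an episode's return is the distance driven autonomously before the safety driver takes over.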
