Virtual to Real Reinforcement Learning for Autonomous Driving
Xinlei Pan, Yurong You, Ziyan Wang, Cewu Lu
TL;DR
The paper addresses sim-to-real transfer for reinforcement-learning-based autonomous driving by introducing a two-stage realistic translation network that uses scene parsing as an intermediate representation. Virtual frames are first translated to parsing maps and then to realistic frames, so that a driving policy trained with A3C entirely in simulation can operate on real-world driving data. Experiments show that policies trained on translated realistic imagery generalize better to real driving than policies trained purely on virtual frames or with domain randomization, and transfer experiments across virtual environments indicate improved robustness. The virtual-to-real (VR) RL framework offers a safer, more cost-effective way to train driving policies before real-world deployment, and the paper names diversifying appearance while preserving scene structure as future work to further reduce bias.
Abstract
Reinforcement learning is considered a promising direction for driving policy learning. However, training an autonomous vehicle with reinforcement learning in a real environment involves unaffordable trial and error. It is more desirable to first train in a virtual environment and then transfer to the real environment. In this paper, we propose a novel realistic translation network that makes a model trained in a virtual environment workable in the real world. The proposed network converts a non-realistic virtual image input into a realistic one with similar scene structure. Given realistic frames as input, a driving policy trained by reinforcement learning can adapt well to real-world driving. Experiments show that our proposed virtual to real (VR) reinforcement learning (RL) works well. To our knowledge, this is the first successful case of a driving policy trained by reinforcement learning that can adapt to real-world driving data.
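
The data flow summarized above (virtual frame, then parsing map, then realistic frame, then policy) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the toy list-of-lists "images", the two class labels, and the rule-based policy are hypothetical stand-ins, not the paper's networks or its A3C agent. The sketch only shows how the two translation stages sit in front of the policy.

```python
from typing import Callable, List

Frame = List[List[int]]       # toy image: rows of pixel intensities (assumption)
ParsingMap = List[List[int]]  # toy scene parsing: rows of class labels (assumption)


def make_driving_agent(
    virtual_to_parsing: Callable[[Frame], ParsingMap],
    parsing_to_realistic: Callable[[ParsingMap], Frame],
    policy: Callable[[Frame], str],
) -> Callable[[Frame], str]:
    """Chain the two translation stages in front of the driving policy."""

    def act(virtual_frame: Frame) -> str:
        parsing_map = virtual_to_parsing(virtual_frame)      # stage 1: virtual -> parsing
        realistic_frame = parsing_to_realistic(parsing_map)  # stage 2: parsing -> realistic
        return policy(realistic_frame)                       # policy only sees realistic frames

    return act


# --- Hypothetical toy stand-ins for the learned components ---
ROAD, OTHER = 0, 1
PALETTE = {ROAD: 60, OTHER: 200}  # fixed grey level per class


def toy_virtual_to_parsing(frame: Frame) -> ParsingMap:
    # Pretend that dark pixels (< 128) in the virtual frame are road.
    return [[ROAD if px < 128 else OTHER for px in row] for row in frame]


def toy_parsing_to_realistic(parsing_map: ParsingMap) -> Frame:
    # Repaint each class with its palette value; the scene layout is unchanged.
    return [[PALETTE[label] for label in row] for row in parsing_map]


def toy_policy(frame: Frame) -> str:
    # Steer toward the image half that contains more road-coloured pixels.
    half = len(frame[0]) // 2
    left = sum(px == PALETTE[ROAD] for row in frame for px in row[:half])
    right = sum(px == PALETTE[ROAD] for row in frame for px in row[half:])
    if left > right:
        return "left"
    if right > left:
        return "right"
    return "straight"


agent = make_driving_agent(toy_virtual_to_parsing, toy_parsing_to_realistic, toy_policy)

virtual_frame = [
    [250, 250, 10, 20],
    [250, 240, 30, 15],
]
action = agent(virtual_frame)
print(action)
```

Because the policy is wired to the output of the second stage, it never depends on the virtual renderer's appearance. In this sketch, that is why a policy trained on translated frames could in principle be fed real camera frames directly at test time.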
