DRL-Based Trajectory Tracking for Motion-Related Modules in Autonomous Driving
Yinda Xu, Lidong Yu
TL;DR
The paper addresses robust trajectory tracking for autonomous driving by removing the reliance on accurate, stationary models. It introduces a deep reinforcement learning (DRL)-based tracker trained with domain randomization, formulated as a Markov decision process (MDP) with ego-centric observations and a 2-D reference space, and complements this tracker with a post-optimization stage based on the iterative linear quadratic regulator (iLQR). Key contributions include the observation encoding scheme, two dynamics models (bicycle and unicycle), domain randomization strategies, a reward structure that balances tracking and smoothness, and a training pipeline based on TD3 (Twin Delayed Deep Deterministic Policy Gradient) with two network scales. Empirical results show that the DRL tracker achieves substantial accuracy gains across wide speed ranges, remains robust under noise, and works across different dynamics; it outperforms a pure-pursuit baseline and improves the initial trajectory fed to the optimizer. The work suggests a practical, model-agnostic pathway to more robust, data-driven motion planning and control in autonomous driving, with code released to accelerate adoption and further research.
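To make the ingredients named above concrete, the following is a minimal sketch of a unicycle step, a kinematic bicycle step, an ego-centric transform of 2-D reference points, and per-episode domain randomization. It is an illustration under stated assumptions, not the authors' implementation: the function names, the forward-Euler integration, the rear-axle reference point, and the randomization ranges are all hypothetical, since this summary does not specify them.

```python
import math
import random


def unicycle_step(state, v, omega, dt):
    """Advance (x, y, heading) one Euler step with speed v and yaw rate omega."""
    x, y, th = state
    return (x + v * math.cos(th) * dt,
            y + v * math.sin(th) * dt,
            th + omega * dt)


def bicycle_step(state, v, steer, wheelbase, dt):
    """Kinematic bicycle (rear-axle reference, an assumption):
    yaw rate = v * tan(steer) / wheelbase."""
    omega = v * math.tan(steer) / wheelbase
    return unicycle_step(state, v, omega, dt)


def to_ego_frame(state, ref_points):
    """Express world-frame 2-D reference points in the vehicle frame
    (x forward, y to the left)."""
    x, y, th = state
    c, s = math.cos(th), math.sin(th)
    ego = []
    for px, py in ref_points:
        dx, dy = px - x, py - y
        ego.append((c * dx + s * dy, -s * dx + c * dy))
    return ego


def sample_episode_params(rng):
    """Domain randomization: draw model parameters once per episode.
    The ranges are hypothetical placeholders, not values from the paper."""
    return {
        "wheelbase": rng.uniform(2.0, 3.5),  # metres (assumed range)
        "dt": rng.uniform(0.05, 0.15),       # seconds (assumed range)
    }


if __name__ == "__main__":
    rng = random.Random(0)
    params = sample_episode_params(rng)
    state = (0.0, 0.0, 0.0)
    reference = [(5.0, 0.5), (10.0, 1.5)]
    # One simulated step with a fixed placeholder action (speed, steering angle).
    state = bicycle_step(state, v=10.0, steer=0.05,
                         wheelbase=params["wheelbase"], dt=params["dt"])
    print(to_ego_frame(state, reference))
```

Expressing the reference in the vehicle frame keeps the observation independent of the absolute pose, which is one plausible reason for an ego-centric encoding; redrawing the parameters every episode exposes the policy to a family of models rather than a single fixed one.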
Abstract
Autonomous driving systems are built on motion-related modules such as the planner and the controller, and these modules depend on accurate, robust trajectory tracking as a primitive routine. Current methods often make strong assumptions about the model, such as the context and the dynamics, and these assumptions do not hold up well under the changing scenarios of a real-world system. In this paper, we propose a Deep Reinforcement Learning (DRL)-based trajectory tracking method for the motion-related modules in autonomous driving systems. The representation learning ability of deep learning (DL) and the exploratory nature of reinforcement learning (RL) bring strong robustness and improve accuracy. The method also enhances versatility by performing trajectory tracking in a model-free, data-driven manner. Through extensive experiments, we demonstrate both the efficiency and the effectiveness of our method compared to current methods. Code and documentation are released to facilitate both further research and industrial deployment.
