Predictive Traffic Rule Compliance using Reinforcement Learning
Yanliang Huang, Sebastian Mair, Zhuoqi Zeng, Matthias Althoff
TL;DR
This work tackles predictive traffic rule compliance in autonomous driving by replacing the actor in an actor-critic framework with a cost-based motion planner, guided by the critic's state-value estimates. A graph neural network provides a rich, scalable representation of multi-vehicle traffic, and traffic rules are formalized in signal temporal logic (STL), whose robustness values shape the rewards. Three German interstate rules, including a newly formalized R_I6, are organized in a hierarchical rule book to balance safety and efficiency, and a planner-generated macro-action explores feasible trajectories under safety constraints. Experiments on the highD dataset show improved long-horizon rule compliance for R_G1 and R_I6, with heatmap analyses indicating proactive, predictive behavior; however, complex interactions (notably for R_I2) reveal remaining challenges and the need for richer modeling and data. Overall, the hybrid planning-plus-prediction approach advances explainable, rule-aware autonomous planning with practical implications for safety-critical driving applications.
Abstract
Autonomous vehicle path planning has reached a stage where safety and regulatory compliance are crucial. This paper presents an approach that integrates a motion planner with a deep reinforcement learning model to predict potential traffic rule violations. Our main innovation is replacing the standard actor network in an actor-critic method with a motion planning module, which ensures both stable and interpretable trajectory generation. In this setup, we use traffic rule robustness as the reward to train the critic of a reinforcement learning agent, and the critic's output is used directly as the cost function of the motion planner, guiding trajectory selection. We incorporate key interstate rules from the German Road Traffic Regulation into a rule book and use a graph-based state representation to handle complex traffic information. Experiments on an open German highway dataset show that the model can predict and prevent traffic rule violations beyond the planning horizon, increasing safety and rule compliance in challenging traffic scenarios.
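
To make the two ideas in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical "always keep a gap of at least d_safe" rule, whose robustness is the smallest margin over a trace, and it assumes the planner cost of a candidate trajectory is simply the negated critic value. The function names, the numeric values, and the hard-coded critic outputs (standing in for a trained graph-based critic) are all illustrative assumptions.

```python
import numpy as np


def always_robustness(margins):
    """Robustness of a hypothetical rule 'always (margin >= 0)'.

    The rule is satisfied over the whole trace iff the smallest margin is
    non-negative, so the minimum margin serves as the robustness value.
    """
    return float(np.min(np.asarray(margins, dtype=float)))


def critic_cost(values):
    """Assumed mapping from critic state values to planner costs (cost = -value)."""
    return -np.asarray(values, dtype=float)


def select_trajectory(values):
    """Return the index of the candidate trajectory with the lowest cost."""
    return int(np.argmin(critic_cost(values)))


# Reward signal: gap to the preceding vehicle along one trace (metres, made up).
gaps = np.array([30.0, 25.0, 18.0, 22.0])
d_safe = 20.0  # hypothetical safe-distance threshold
reward = always_robustness(gaps - d_safe)  # negative: the rule is violated once

# Planning step: critic values for three candidate trajectories (made up).
values = [0.4, 1.3, -0.7]
best = select_trajectory(values)

print(f"robustness reward: {reward}")
print(f"selected candidate: {best}")
```

In this sketch the critic is trained elsewhere on robustness-based rewards; at planning time only its value estimates are consulted, so the planner still produces every candidate trajectory itself.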
