A Framework for Learning Scoring Rules in Autonomous Driving Planning Systems
Zikang Xiong, Joe Kurian Eappen, Suresh Jagannathan
TL;DR
This work presents FLoRA, a framework that learns interpretable scoring rules for autonomous driving by representing them as differentiable temporal-logic formulas. A Scoring Logic Network (SLN) with Temporal, Propositional, and Aggregation layers learns both the rule structure and the predicate thresholds from positive-only driving demonstrations in NuPlan, using a regularization scheme to avoid shortcuts. The learned rules are extracted as human-readable condition-action pairs and evaluated in NuPlan closed-loop simulations, where SLN outperforms expert-crafted rules and neural critics across multiple proposers while preserving interpretability. The framework acts as a plug-in module that can enhance safety, reliability, and transparency in motion planning, with potential extensions to predicate discovery and accident-based supervision.
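To make the idea of a differentiable temporal-logic formula with a tunable predicate threshold more concrete, the following is a minimal sketch and not the FLoRA implementation. It assumes a generic smooth-robustness formulation in which "always" is approximated by a soft minimum over time; the names `soft_min`, `predicate`, `always`, `eventually`, and `temperature`, as well as the example signal, are hypothetical. In a real system the threshold would be a trainable parameter in an autodiff framework rather than a plain float.

```python
import math

def soft_min(values, temperature=0.1):
    """Smooth lower bound on min(values); approaches the true min as temperature -> 0."""
    # Computed as -T * logsumexp(-v / T), with the max subtracted for numerical stability.
    scaled = [-v / temperature for v in values]
    m = max(scaled)
    return -temperature * (m + math.log(sum(math.exp(s - m) for s in scaled)))

def soft_max(values, temperature=0.1):
    """Smooth upper bound on max(values), defined by duality with soft_min."""
    return -soft_min([-v for v in values], temperature)

def predicate(signal, threshold):
    """Per-step robustness of 'signal > threshold': positive when satisfied."""
    return [x - threshold for x in signal]

def always(robustness, temperature=0.1):
    """Soft 'always' operator: the formula must hold at every time step."""
    return soft_min(robustness, temperature)

def eventually(robustness, temperature=0.1):
    """Soft 'eventually' operator: the formula must hold at some time step."""
    return soft_max(robustness, temperature)

# Hypothetical signal: distance (in meters) to a lead vehicle along a candidate trajectory.
distance = [12.0, 9.0, 7.5, 8.0]
score_loose = always(predicate(distance, threshold=5.0))   # rule satisfied, positive score
score_strict = always(predicate(distance, threshold=8.0))  # rule violated, negative score
```

Because every operation here is smooth in both the signal and the threshold, a gradient-based optimizer could in principle adjust the threshold from data, which is the property the learned rules rely on.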
Abstract
In autonomous driving systems, motion planning is commonly implemented as a two-stage process: first, a trajectory proposer generates multiple candidate trajectories; then, a scoring mechanism selects the most suitable trajectory for execution. For this critical selection stage, rule-based scoring mechanisms are particularly appealing because they can explicitly encode driving preferences, safety constraints, and traffic regulations in a formalized, human-understandable format. However, manually crafting these scoring rules presents significant challenges: the rules often contain complex interdependencies, require careful parameter tuning, and may not fully capture the nuances present in real-world driving data. This work introduces FLoRA, a novel framework that addresses these challenges by learning interpretable scoring rules represented in temporal logic. Our method features a learnable logic structure that captures nuanced relationships across diverse driving scenarios, optimizing both rules and parameters directly from real-world driving demonstrations collected in NuPlan. Our approach learns to evaluate driving behavior even though the training data contains only positive examples (successful driving demonstrations). Evaluations in closed-loop planning simulations demonstrate that our learned scoring rules outperform existing techniques, including expert-designed rules and neural network scoring models, while maintaining interpretability. The resulting data-driven scoring mechanism is designed as a plug-in module that integrates with various trajectory proposers. Our video and code are available at xiong.zikang.me/FLoRA.
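The two-stage process described above, in which a proposer produces candidates and a scorer picks one, can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's code: the `Trajectory` representation, `select_trajectory`, and the toy rule `toy_lane_keeping_score` are all hypothetical, and the toy rule stands in for whatever scoring module (hand-crafted, neural, or learned) is plugged in.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical trajectory representation: a list of (x, y) waypoints,
# where y is the lateral offset from the lane center.
Trajectory = List[Tuple[float, float]]

def select_trajectory(
    candidates: Sequence[Trajectory],
    scorer: Callable[[Trajectory], float],
) -> Trajectory:
    """Stage two of the planner: return the candidate with the highest score."""
    if not candidates:
        raise ValueError("the proposer returned no candidate trajectories")
    return max(candidates, key=scorer)

def toy_lane_keeping_score(trajectory: Trajectory) -> float:
    """Hypothetical rule: penalize the largest lateral deviation from the lane center."""
    return -max(abs(y) for _, y in trajectory)

# Stage one would normally be a trajectory proposer; here the candidates are hard-coded.
candidates: List[Trajectory] = [
    [(0.0, 0.0), (1.0, 0.5)],
    [(0.0, 0.0), (1.0, 0.1)],
    [(0.0, 0.0), (1.0, -1.0)],
]
best = select_trajectory(candidates, toy_lane_keeping_score)
```

Since `select_trajectory` depends only on a callable that maps a trajectory to a number, the scorer can be swapped without touching the proposer, which is the sense in which a scoring module can act as a plug-in.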
