Generalizing Cooperative Eco-driving via Multi-residual Task Learning
Vindula Jayawardana, Sirui Li, Cathy Wu, Yashar Farid, Kentaro Oguchi
TL;DR
This work addresses the algorithmic generalization of deep reinforcement learning (DRL) for contextual, multi-agent control by introducing Multi-residual Task Learning (MRTL), which augments a nominal, model-based policy with a learned residual to operate across diverse traffic scenarios. Applying MRTL to cooperative eco-driving at signalized intersections, the authors report improved emission reductions and throughput across a large set of contexts (nearly 600 signalized intersections and 1200 traffic scenarios) and autonomous vehicle (AV) penetration levels, outperforming baselines and remaining robust to noise. The key idea is to decompose the control objective into a known, tractable component handled by the nominal policy $\pi_n$ and a residual component $f_\theta$ learned by DRL, with the final policy given by $\pi(s,c) = \pi_n(s,c) + f_\theta(s,c)$, where $s$ is the state and $c$ is the context. The approach offers practical benefits for fleet-level emissions management and shows how existing model-based strategies can aid DRL generalization in complex, real-world traffic settings.
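The additive form $\pi(s,c) = \pi_n(s,c) + f_\theta(s,c)$ can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the proportional speed-tracking controller standing in for $\pi_n$, the linear residual standing in for $f_\theta$, and the layout of the state and context vectors are all hypothetical. In practice the residual would be a neural network trained with DRL.

```python
import numpy as np

def nominal_policy(state, context):
    """Hypothetical nominal controller pi_n: proportional tracking of a target speed.

    state[0] is the vehicle speed (m/s); context[0] is a target speed (m/s).
    Both layouts are assumptions for this sketch.
    """
    speed = state[0]
    target_speed = context[0]
    return float(0.5 * (target_speed - speed))

def residual(theta, state, context):
    """Hypothetical residual f_theta: linear in the concatenated (state, context) vector."""
    x = np.concatenate([state, context])
    return float(theta @ x)

def mrtl_policy(theta, state, context):
    """Final action: nominal action plus learned residual correction."""
    return nominal_policy(state, context) + residual(theta, state, context)

state = np.array([10.0, 50.0])   # assumed: speed (m/s), distance to stop line (m)
context = np.array([12.0])       # assumed: target speed (m/s)

# With a zero residual, the combined policy reduces to the nominal controller.
zero_theta = np.zeros(3)
print(mrtl_policy(zero_theta, state, context))

# A non-zero residual shifts the nominal action.
theta = np.array([0.1, 0.0, 0.0])
print(mrtl_policy(theta, state, context))
```

One consequence of this structure is that a residual initialized near zero starts from the nominal controller's behavior, so learning only has to account for what the nominal policy gets wrong.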
Abstract
Conventional control, such as model-based control, is commonly utilized in autonomous driving due to its efficiency and reliability. However, real-world autonomous driving contends with a multitude of diverse traffic scenarios that are challenging for these planning algorithms. Model-free Deep Reinforcement Learning (DRL) presents a promising avenue in this direction, but learning DRL control policies that generalize to multiple traffic scenarios is still a challenge. To address this, we introduce Multi-residual Task Learning (MRTL), a generic learning framework based on multi-task learning that, for a set of task scenarios, decomposes the control into nominal components that are effectively solved by conventional control methods and residual terms which are solved using learning. We employ MRTL for fleet-level emission reduction in mixed traffic using autonomous vehicles as a means of system control. By analyzing the performance of MRTL across nearly 600 signalized intersections and 1200 traffic scenarios, we demonstrate that it emerges as a promising approach to synergize the strengths of DRL and conventional methods in generalizable control.
