Eco-driving for Electric Connected Vehicles at Signalized Intersections: A Parameterized Reinforcement Learning Approach
Xia Jiang, Jian Zhang, Dan Li
TL;DR
This work tackles eco-driving for electric connected vehicles (CVs) at signalized intersections by casting the control problem as a parameterized reinforcement learning (PRL) task. It fuses model-based car-following and lane-changing policies with a deep reinforcement learning (DRL) policy in a unified Markov Decision Process (MDP) with a discrete-continuous hybrid action space, enabling joint longitudinal and lateral decisions. The PRL framework uses two neural networks to select a discrete lane-change action and its corresponding continuous acceleration, with safety clips and masks to preserve safe operation; the reward combines energy and mobility terms to encourage smooth, energy-efficient trajectories. Evaluations in SUMO show energy reductions of up to $27.13\%$ over baselines without compromising traffic flow, and the approach remains robust under both coordinated and non-coordinated signal settings and in mixed traffic with varying CV penetration rates, which suggests practical potential for real-world deployment.
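The hybrid action selection summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: the lists `q_values` and `accel_params` are assumed stand-ins for the outputs of the two networks, the action set `LANE_ACTIONS`, the function `select_hybrid_action`, and the acceleration bounds are hypothetical names and values chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical discrete action set; the paper's exact set may differ.
LANE_ACTIONS = ("keep", "left", "right")


@dataclass
class HybridAction:
    lane_action: str      # discrete lateral decision
    acceleration: float   # continuous longitudinal parameter (m/s^2)


def select_hybrid_action(
    q_values: Sequence[float],      # assumed Q-network output, one value per lane action
    accel_params: Sequence[float],  # assumed parameter-network output, one acceleration per lane action
    feasible: Sequence[bool],       # safety mask: False marks a lane change that is not allowed
    a_min: float = -3.0,            # illustrative acceleration bounds, not taken from the paper
    a_max: float = 2.0,
) -> HybridAction:
    """Pick the best feasible lane action and clip its paired acceleration."""
    if not any(feasible):
        raise ValueError("at least one discrete action must be feasible")
    # Mask: only feasible discrete actions compete for the argmax.
    candidates = [i for i, ok in enumerate(feasible) if ok]
    best = max(candidates, key=lambda i: q_values[i])
    # Clip: keep the continuous parameter inside the allowed range.
    accel = min(max(accel_params[best], a_min), a_max)
    return HybridAction(LANE_ACTIONS[best], accel)


# Usage: the left lane change has the highest Q-value but is masked out.
action = select_hybrid_action(
    q_values=[0.2, 0.9, 0.5],
    accel_params=[0.5, 3.5, -4.0],
    feasible=[True, False, True],
)
print(action)
```

In this sketch the mask acts on the discrete choice and the clip acts on the continuous parameter, so an unsafe lane change can never be selected regardless of its estimated value.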
Abstract
This paper proposes a reinforcement learning (RL) based eco-driving framework for electric connected vehicles (CVs) to improve vehicle energy efficiency at signalized intersections. The vehicle agent integrates a model-based car-following policy, a lane-changing policy, and the RL policy to ensure safe operation of a CV. A Markov Decision Process (MDP) is then formulated that enables the vehicle to perform longitudinal control and make lateral decisions, jointly optimizing the car-following and lane-changing behaviors of CVs in the vicinity of intersections. The hybrid action space is parameterized as a hierarchical structure, which allows agents to be trained with two-dimensional motion patterns in a dynamic traffic environment. Finally, the proposed methods are evaluated in SUMO from both a single-vehicle perspective and a flow-based perspective. The results show that the strategy significantly reduces energy consumption by learning proper action schemes without disrupting other human-driven vehicles (HDVs).
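One way to picture how a model-based car-following policy can bound the RL policy is sketched below. This is an assumption-laden illustration, not the paper's formulation: it uses a Krauss-style safe-speed rule as a stand-in for the car-following model, and the functions `krauss_safe_speed` and `safe_acceleration`, together with all default parameter values, are hypothetical.

```python
def krauss_safe_speed(gap: float, v_follow: float, v_lead: float,
                      tau: float = 1.0, decel: float = 4.5) -> float:
    """Krauss-style safe speed (m/s) for a follower behind a leader.

    gap is the bumper-to-bumper distance (m), tau the reaction time (s),
    decel the maximum comfortable deceleration (m/s^2). Values are illustrative.
    """
    v_mean = 0.5 * (v_follow + v_lead)
    return v_lead + (gap - v_lead * tau) / (v_mean / decel + tau)


def safe_acceleration(a_rl: float, gap: float, v_follow: float, v_lead: float,
                      dt: float = 1.0, a_min: float = -4.5, a_max: float = 2.6) -> float:
    """Bound the RL acceleration by the car-following model and physical limits."""
    v_safe = max(0.0, krauss_safe_speed(gap, v_follow, v_lead))
    # Largest acceleration that keeps next-step speed at or below the safe speed.
    a_safe = (v_safe - v_follow) / dt
    # The RL command is accepted only if it is no more aggressive than the model
    # allows; the lower bound reflects the vehicle's braking capability.
    return max(a_min, min(a_rl, a_safe, a_max))


# Usage: with a large gap the RL command passes through unchanged;
# with a short gap behind a slower leader the command is overridden by braking.
print(safe_acceleration(a_rl=1.0, gap=100.0, v_follow=10.0, v_lead=10.0))
print(safe_acceleration(a_rl=1.0, gap=5.0, v_follow=10.0, v_lead=5.0))
```

The design choice here is that the learned policy only proposes an acceleration, while the model-based rule has the final say, so exploration during training cannot produce a command that the car-following model considers unsafe.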
