Shared learning of powertrain control policies for vehicle fleets
Lindsey Kerbel, Beshah Ayalew, Andrej Ivanco
TL;DR
This paper proposes a shared learning framework for deep reinforcement learning (DRL) of powertrain control policies across a vehicle fleet, in which a distilled group policy serves as the knowledge-sharing mechanism for the policy learning at each vehicle. Compared with a baseline of individually learning agents and a state-of-the-art approach with a centralized learner, the framework shows clear advantages in cumulative fleet performance. For fleets serving a distribution of suburban routes, it yields a fleet-average asymptotic improvement of 8.5 percent in fuel economy over the baseline while also improving acceleration error and shifting frequency. It also reduces variance within the fleet and helps individual agents adapt better to new routes.
Abstract
Emerging data-driven approaches, such as deep reinforcement learning (DRL), aim to learn powertrain control policies in the field that optimize fuel economy and other performance metrics. They have shown great potential in this regard for individual vehicles on specific routes or drive cycles. However, for fleets of vehicles that must service a distribution of routes, DRL approaches struggle with learning stability issues that result in high variance and hinder their practical deployment. In this paper, we present a novel framework for shared learning among a fleet of vehicles that uses a distilled group policy as the knowledge-sharing mechanism for the policy learning computations at each vehicle. We detail the mathematical formulation that makes this possible. Several scenarios are considered to analyze the functionality and performance of the framework, as well as its computational scalability with fleet size. We compare the cumulative performance of fleets using our proposed shared learning approach against a baseline of individual learning agents and against another state-of-the-art approach with a centralized learner; the comparisons show clear advantages for our approach. For example, for fleets serving a distribution of suburban routes, we find a fleet-average asymptotic improvement of 8.5 percent in fuel economy compared to the baseline, along with improvements in acceleration error and shifting frequency. Furthermore, we include demonstrative results that show how the framework reduces variance within a fleet and how it helps individual agents adapt better to new routes.
