Multi-agent Assessment with QoS Enhancement for HD Map Updates in a Vehicular Network
Jeffrey Redondo, Nauman Aslam, Juan Zhang, Zhenhui Yuan
TL;DR
This work addresses the challenge of delivering QoS for HD Map updates in VANETs by proposing a scalable, application-layer multi-agent Q-learning framework that avoids modifying existing MAC standards. By distributing learning across independent agents that share a common reward function, the approach reduces the state/action dimensionality and the computational burden on each agent while improving latency and throughput for critical services. Across the test cases, multi-agent configurations outperform a single-agent baseline, with latency reductions of up to about 43% (for HD Map traffic) and improved throughput; distributed learning additionally offers better packet efficiency and fairness in certain scenarios. The findings suggest that carefully designed multi-agent RL at the application layer can meaningfully improve HD Map offloading QoS in dynamic vehicular networks. Centralized learning is a viable option when AVs have limited computational capacity, whereas distributed learning is advantageous when AVs have more processing power.
Abstract
Reinforcement learning (RL) algorithms have been used to address challenging problems in the offloading process of vehicular ad hoc networks (VANETs). More recently, they have been applied to improve the dissemination of high-definition (HD) Maps. However, implementing solutions such as deep Q-learning (DQN) and actor-critic at the autonomous vehicle (AV) may increase the computational load, placing a heavy burden on the computing devices and raising costs. Moreover, their implementation might raise compatibility issues between technologies because of the modifications they require to the standards. Therefore, in this paper, we assess the scalability of an application that uses a single-agent Q-learning solution when it is deployed in a distributed multi-agent environment. The application improves network performance by taking advantage of a smaller state and action space while using a multi-agent approach. The proposed solution is extensively evaluated in test cases that vary the reward function (individual versus overall network performance), the number of agents, and the learning mode (centralized versus distributed). The experimental results show that, compared with the single-agent approach, the proposed solution reduces latency for voice, video, HD Map, and best-effort traffic by 40.4%, 36%, 43%, and 12%, respectively.
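To make the idea of independent Q-learning agents with a common reward more concrete, the following is a minimal sketch, not the authors' implementation. The class `QAgent`, the functions `shared_reward`, `toy_latencies`, and `run_toy`, the three-level action set, the 100 ms latency target, and the toy latency model are all assumptions introduced here for illustration. The two table-size helpers show why splitting one joint learner into per-agent tables keeps the state and action space small.

```python
import random
from collections import defaultdict


class QAgent:
    """Independent tabular Q-learning agent (hypothetical illustration)."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        # One row of action values per visited state, created lazily.
        self.q = defaultdict(lambda: [0.0] * n_actions)

    def act(self, state):
        # Epsilon-greedy action selection.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        values = self.q[state]
        return max(range(self.n_actions), key=values.__getitem__)

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def shared_reward(latencies_ms, target_ms=100.0):
    """Common reward from mean network latency (assumed shape and target)."""
    mean = sum(latencies_ms) / len(latencies_ms)
    return (target_ms - mean) / target_ms


def joint_table_size(n_states, n_actions, n_agents):
    """Q-table entries if a single learner controlled all agents jointly."""
    return (n_states ** n_agents) * (n_actions ** n_agents)


def independent_table_size(n_states, n_actions, n_agents):
    """Total Q-table entries with one small table per agent."""
    return n_agents * n_states * n_actions


def toy_latencies(actions):
    """Toy model (assumption): latency grows with the total offered load."""
    load = sum(a + 1 for a in actions)
    return [10.0 * load for _ in actions]


def run_toy(n_agents=4, steps=1000):
    """Train independent agents that all receive the same reward."""
    agents = [QAgent(n_actions=3, seed=i) for i in range(n_agents)]
    state = 0
    for _ in range(steps):
        actions = [ag.act(state) for ag in agents]
        reward = shared_reward(toy_latencies(actions))
        # Coarse load level (0, 1, or 2) used as the next state.
        next_state = min(2, sum(actions) // n_agents)
        for ag, a in zip(agents, actions):
            ag.update(state, a, reward, next_state)
        state = next_state
    return agents
```

Under these assumptions, with 3 states, 3 actions, and 4 agents, `joint_table_size(3, 3, 4)` gives 6561 entries while `independent_table_size(3, 3, 4)` gives 36, which illustrates the dimensionality argument. The actual states, actions, and reward used in the paper may differ.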
