Bandwidth Reservation for Time-Critical Vehicular Applications: A Multi-Operator Environment

Abdullah Al-Khatib, Abdullah Ahmed, Klaus Moessner, Holger Timinger

TL;DR

This work tackles the challenge of cost-effective bandwidth reservation for time-critical vehicular applications in a multi-MNO setting with dynamic pricing. It formulates the problem as a Markov Decision Process and introduces an area-wise training approach, integrating Temporal Fusion Transformer forecasts with a Dueling DQN framework and a multi-phase training regime that blends synthetic and real data. The results show significant cost reductions, up to 40%, demonstrating the practicality of the approach in edge-enabled vehicular networks. The proposed framework offers a scalable, adaptive solution for reliable, low-latency bandwidth access across overlapping MNO coverages, with potential for extension to multi-agent and QoS-aware scenarios.

Abstract

Onsite bandwidth reservation requests often face challenges such as price fluctuations and fairness issues due to unpredictable bandwidth availability and stringent latency requirements. Requesting bandwidth in advance can mitigate the impact of these fluctuations and ensure timely access to critical resources. In a multi-Mobile Network Operator (MNO) environment, vehicles need to select cost-effective and reliable resources for their safety-critical applications. This research aims to minimize resource costs by finding the best price among multiple MNOs. It formulates multi-operator scenarios as a Markov Decision Process (MDP) and solves it with a Deep Reinforcement Learning (DRL) algorithm, specifically Dueling Deep Q-Learning. For efficient and stable learning, we propose a novel area-wise approach and an adaptive synthetic MDP that closely resembles the real environment. The Temporal Fusion Transformer (TFT) is used to handle time-dependent data and model training. Furthermore, the research leverages Amazon spot price data and adopts a multi-phase training approach, with initial training on synthetic data followed by training on real-world data. These phases enable the DRL agent to make informed decisions using insights from historical data and real-time observations. The results show that our model leads to significant cost reductions, up to 40%, compared to scenarios without a policy model in such a complex environment.
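
To make the Dueling Deep Q-Learning reference concrete, the following is a minimal sketch of the standard dueling aggregation step, which combines a state value V(s) with per-action advantages A(s, a) using the mean-advantage baseline. It is not the paper's implementation: the function names, the numeric values, and the reading of each action as "reserve bandwidth from one of three MNOs" are assumptions made only for illustration, and the network that would produce V and A is omitted.

```python
from statistics import fmean


def dueling_q_values(state_value, advantages):
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean_a A(s, a)."""
    mean_advantage = fmean(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values):
    """Return the index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical outputs of the value and advantage streams for one state.
# Assumption: action i means "reserve from MNO i"; the paper's actual
# action space is not described in this section.
state_value = 2.0
advantages = [0.5, -0.25, -0.25]

q = dueling_q_values(state_value, advantages)
best = greedy_action(q)
print(q, best)
```

Subtracting the mean advantage keeps the decomposition identifiable: the Q-values average back to V(s), so the value stream and the advantage stream cannot drift against each other by an arbitrary constant.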

Paper Structure

This paper contains 24 sections, 24 equations, 11 figures, 1 table, and 1 algorithm.

Figures (11)

  • Figure 1: An illustration of the scenario.
  • Figure 2: An illustration of the proposed framework for bandwidth reservation with multi-phase training.
  • Figure 3: Learning curves showing the convergence of the different DQN methods, demonstrating that the suggested Dueling DQN algorithm yields the highest average episode rewards.
  • Figure 4: Agent rewards reveal convergence trends, with dotted lines indicating the reward trend flattening and the solid line depicting actual reward values.
  • Figure 5: Cumulative episode rewards are significantly lower when using the Dueling Deep Q-Network (DQN) compared to other methods.
  • ...and 6 more figures