NetworkGym: Reinforcement Learning Environments for Multi-Access Traffic Management in Network Simulation

Momin Haider, Ming Yin, Menglei Zhang, Arpit Gupta, Jing Zhu, Yu-Xiang Wang

TL;DR

This paper introduces NetworkGym, a high-fidelity network environment simulator that supports generating multiple network traffic flows and multi-access traffic splitting. It also proposes Pessimistic TD3 (PTD3), an extension to the TD3+BC algorithm, and demonstrates that PTD3 outperforms many state-of-the-art offline RL algorithms.

Abstract

Mobile devices such as smartphones, laptops, and tablets can often connect to multiple access networks (e.g., Wi-Fi, LTE, and 5G) simultaneously. Recent advancements facilitate seamless integration of these connections below the transport layer, enhancing the experience for apps that lack inherent multi-path support. This optimization hinges on dynamically determining the traffic distribution across networks for each device, a process referred to as multi-access traffic splitting. This paper introduces NetworkGym, a high-fidelity network environment simulator that supports generating multiple network traffic flows and multi-access traffic splitting. The simulator can be used to train and evaluate different RL-based solutions for the multi-access traffic splitting problem. Our initial explorations demonstrate that the majority of existing state-of-the-art offline RL algorithms (e.g., CQL) fail to outperform certain hand-crafted heuristic policies on average. This illustrates the urgent need to evaluate offline RL algorithms against a broader range of benchmarks, rather than relying solely on popular ones such as D4RL. We also propose an extension to the TD3+BC algorithm, named Pessimistic TD3 (PTD3), and demonstrate that it outperforms many state-of-the-art offline RL algorithms. PTD3's behavioral constraint mechanism, which relies on value-function pessimism, is theoretically motivated and relatively simple to implement.

Paper Structure

This paper contains 16 sections, 6 equations, 2 figures, 6 tables, 1 algorithm.

Figures (2)

  • Figure 1: GMA Protocol. A UE interfaces with the GMA gateway over UDP. "APP" refers to the application layer at the client or server level, "IP" refers to the Internet Protocol layer, facilitating the addressing and routing of packets, and "PHY" refers to the physical layer in the network responsible for the actual transmission of data over the network medium. The GMA gateway handles multi-access traffic splitting at the edge.
  • Figure 2: Environment configuration for offline RL testing (not to scale). Four UEs are randomly initialized 1.5 meters above the $x$-axis and move back and forth in the $x$-direction between $x=0$ meters and $x=80$ meters. The Wi-Fi access points are located at $(x, z) = (30\text{m}, 3\text{m})$ and $(x, z) = (50\text{m}, 3\text{m})$, while the LTE base station is located at $(x, z) = (40\text{m}, 3\text{m})$.
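
The geometry in the Figure 2 caption can be sketched in a few lines of Python, which may help when reasoning about how far each UE is from each transmitter as it moves. This is only an illustrative sketch, not NetworkGym's API: the coordinates and track limits come from the caption, while the helper names `bounce` and `distances` and the `step` parameter (meters moved per update) are hypothetical, since the caption does not state UE speed or update interval.

```python
import math

# Coordinates (x, z) in meters, taken from the Figure 2 caption.
WIFI_APS = [(30.0, 3.0), (50.0, 3.0)]
LTE_BS = (40.0, 3.0)
UE_HEIGHT = 1.5            # UEs sit 1.5 m above the x-axis
X_MIN, X_MAX = 0.0, 80.0   # UEs move back and forth on this segment


def bounce(x, direction, step):
    """Advance a UE along x, reflecting at the ends of the track.

    Hypothetical helper: `step` is an assumed parameter (not in the caption)
    and is expected to be no larger than the track length.
    """
    x += direction * step
    if x > X_MAX:
        x, direction = 2 * X_MAX - x, -1
    elif x < X_MIN:
        x, direction = 2 * X_MIN - x, 1
    return x, direction


def distances(x):
    """Straight-line distance (meters) from a UE at (x, UE_HEIGHT) to each transmitter."""
    ue = (x, UE_HEIGHT)
    return {
        "wifi_0": math.dist(ue, WIFI_APS[0]),
        "wifi_1": math.dist(ue, WIFI_APS[1]),
        "lte": math.dist(ue, LTE_BS),
    }
```

For instance, `distances(40.0)` places the UE directly below the LTE base station and equally far from both Wi-Fi access points, which is the midpoint of the layout described in the caption.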