Joint Optimization on Uplink OFDMA and MU-MIMO for IEEE 802.11ax: Deep Hierarchical Reinforcement Learning Approach

Hyeonho Noh, Harim Lee, Hyun Jong Yang

TL;DR

The paper jointly optimizes user scheduling and resource allocation (USRA), MU-MIMO user selection, and MIMO mode selection (MS) for IEEE 802.11ax uplink OFDMA under unsaturated traffic. It introduces a tailored deep hierarchical reinforcement learning (DHRL) framework in which a master agent selects the resource unit (RU) configuration and sub-agents perform MU-MIMO scheduling on the allocated RUs. The framework adds a two-branch network for channel state information (CSI) and buffer inputs, and a channel-subspace update that mitigates interference. Key contributions include a formal problem formulation with RU, MIMO MS, and buffer constraints; a DHQN architecture with reduced MU-MIMO action spaces; and complexity-aware training and evaluation that report substantial throughput gains across the simulated scenarios. The approach targets dense WLAN uplinks, particularly as bandwidth and antenna counts scale, by navigating the large joint action space efficiently without sacrificing generality.

Abstract

This letter tackles a joint user scheduling, frequency resource allocation (USRA), multi-input-multi-output mode selection (MIMO MS) between single-user MIMO and multi-user (MU) MIMO, and MU-MIMO user selection problem, integrating uplink orthogonal frequency division multiple access (OFDMA) in IEEE 802.11ax. Specifically, we focus on unsaturated traffic conditions where users' data demands fluctuate. In unsaturated traffic conditions, considering packet volumes per user introduces a combinatorial problem, requiring the simultaneous optimization of MU-MIMO user selection and RA along the time-frequency-space axis. Consequently, dealing with the combinatorial nature of this problem, characterized by a large cardinality of unknown variables, poses a challenge that conventional optimization methods find nearly impossible to address. In response, this letter proposes an approach with deep hierarchical reinforcement learning (DHRL) to solve the joint problem. Rather than simply adopting off-the-shelf DHRL, we tailor the DHRL to the joint USRA and MS problem, thereby significantly improving the convergence speed and throughput. Extensive simulation results show that the proposed algorithm achieves significantly improved throughput compared to the existing schemes under various unsaturated traffic conditions.
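
The two-level decision described in the TL;DR (a master agent picks an RU configuration, then sub-agents schedule STAs on the allocated RUs) can be illustrated with a toy sketch. This is not the paper's algorithm: the RU table `RU_CONFIGS`, the fixed master Q-values, and the stand-in scoring function `toy_sub_q` are all hypothetical, the deep networks are replaced by plain lookups, and MU-MIMO grouping is omitted so that each RU receives at most one STA.

```python
import random

# Hypothetical RU configurations ("goals") for a 20 MHz channel, given as
# RU sizes in tones. Illustrative only; not the paper's RU combination table.
RU_CONFIGS = [
    [242],
    [106, 26, 106],
    [52, 52, 26, 52, 52],
]


def epsilon_greedy(q_values, epsilon, rng):
    """Return a random index with probability epsilon, else the first argmax."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def hierarchical_step(master_q, sub_q, buffers, epsilon, rng):
    """One decision: the master picks an RU configuration (goal), then a
    sub-agent assigns one backlogged STA to each RU in sequence."""
    goal = epsilon_greedy(master_q, epsilon, rng)
    remaining = dict(buffers)  # STA id -> queued bytes
    schedule = []
    for ru in RU_CONFIGS[goal]:
        # Under unsaturated traffic, STAs with empty buffers are skipped.
        candidates = [sta for sta, backlog in remaining.items() if backlog > 0]
        if not candidates:
            break
        scores = [sub_q(ru, sta, remaining[sta]) for sta in candidates]
        sta = candidates[epsilon_greedy(scores, epsilon, rng)]
        schedule.append((ru, sta))
        del remaining[sta]  # each STA is scheduled at most once
    return goal, schedule


def toy_sub_q(ru_tones, sta, backlog):
    """Assumed stand-in for a learned sub-agent Q-value: usable RU capacity."""
    return min(ru_tones, backlog)


rng = random.Random(0)
goal, schedule = hierarchical_step(
    master_q=[0.1, 0.9, 0.3],
    sub_q=toy_sub_q,
    buffers={0: 500, 1: 0, 2: 50, 3: 200},
    epsilon=0.0,
    rng=rng,
)
print(goal, schedule)  # 1 [(106, 0), (26, 2), (106, 3)]
```

With `epsilon=0.0` the choice is purely greedy, so STA 1 (empty buffer) is never scheduled. Splitting the decision this way means each agent searches a small action space instead of the full joint space of RU configurations and STA assignments.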

Paper Structure (17 sections, 5 equations, 9 figures, 1 table, 2 algorithms)

Figures (9)

  • Figure 1: Illustration of 802.11ax UL data transmission protocol.
  • Figure 2: OFDMA RUs in a 20 MHz channel.
  • Figure 3: An illustrative example of scheduling STAs and allocating RUs with the conventional and proposed methods.
  • Figure 4: Illustration of the proposed DHRL model structure. (a) Comparison between the conventional and proposed DHRL models in a 20 MHz channel. A master agent performs RA, and sub-agents then sequentially schedule STAs on the allocated RUs. The AP then transmits a trigger frame containing the resulting scheduling and allocation decision to the STAs. (b) The network structure shared by the master agent and the sub-agents.
  • Figure 5: RU combination table corresponding to goals in a 20 MHz bandwidth channel.
  • ...and 4 more figures