IEEE 802.11bn Multi-AP Coordinated Spatial Reuse with Hierarchical Multi-Armed Bandits
Maksymilian Wojnar, Wojciech Ciezobka, Katarzyna Kosek-Szott, Krzysztof Rusek, Szymon Szott, David Nunez, Boris Bellalta
TL;DR
The paper addresses how to schedule AP-station pairs for Coordinated Spatial Reuse (C-SR) in dense IEEE 802.11bn networks so as to increase throughput. It introduces a hierarchical multi-armed bandit (MAB) framework that learns effective C-SR groupings online at two levels: level I selects which APs transmit, and level II assigns stations. The framework is analyzed for a central-controller deployment, with the reward defined as the total effective data rate across concurrent transmissions. The approach is evaluated with several MAB algorithms ($\epsilon$-greedy, Thompson sampling, Softmax, and UCB) and against baselines; it converges rapidly and adapts to topology changes, with UCB emerging as the most robust choice. The results demonstrate the feasibility and benefits of ML-driven multi-AP coordination (MAPC) for 802.11bn, and an open-source simulator supports further research on dense wireless networks.
Abstract
Coordination among multiple access points (APs) is integral to IEEE 802.11bn (Wi-Fi 8) for managing contention in dense networks. This letter explores the benefits of Coordinated Spatial Reuse (C-SR) and proposes the use of reinforcement learning to optimize C-SR group selection. We develop a hierarchical multi-armed bandit (MAB) framework that efficiently selects APs for simultaneous transmissions across various network topologies, demonstrating reinforcement learning's promise in Wi-Fi settings. Among several MAB algorithms studied, we identify the upper confidence bound (UCB) as particularly effective, offering rapid convergence, adaptability to changes, and sustained performance.
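To make the two-level idea concrete, the following is a minimal Python sketch of one possible reading of a hierarchical UCB scheme: a level-I agent picks the set of transmitting APs, and a separate level-II agent per AP set picks one station for each chosen AP. It is not the authors' implementation. The class names (`UCB1`, `HierarchicalUCB`), the exploration constant `c`, the two-AP topology, and the `toy_reward` function (a made-up aggregate rate normalized to [0, 1]) are all assumptions introduced for illustration.

```python
import math
import random
from itertools import combinations, product


class UCB1:
    """Standard UCB1 agent over a fixed list of arms (rewards assumed in [0, 1])."""

    def __init__(self, arms, c=2.0):
        self.arms = list(arms)
        self.c = c  # exploration constant (assumed value)
        self.counts = [0] * len(self.arms)
        self.means = [0.0] * len(self.arms)
        self.t = 0

    def select(self):
        # Play every arm once before applying the confidence bound.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        scores = [
            m + math.sqrt(self.c * math.log(self.t) / n)
            for m, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, i, reward):
        self.t += 1
        self.counts[i] += 1
        # Incremental update of the running mean reward of arm i.
        self.means[i] += (reward - self.means[i]) / self.counts[i]


class HierarchicalUCB:
    """Level I chooses an AP set; level II chooses one station per chosen AP."""

    def __init__(self, stations_per_ap):
        aps = sorted(stations_per_ap)
        ap_sets = [s for k in range(1, len(aps) + 1) for s in combinations(aps, k)]
        self.level1 = UCB1(ap_sets)
        # One level-II agent per AP set; its arms are station tuples.
        self.level2 = [
            UCB1(product(*(stations_per_ap[a] for a in s))) for s in ap_sets
        ]

    def select(self):
        i = self.level1.select()
        j = self.level2[i].select()
        assignment = dict(zip(self.level1.arms[i], self.level2[i].arms[j]))
        return (i, j), assignment

    def update(self, choice, reward):
        i, j = choice
        # The same aggregate reward is credited to both levels.
        self.level1.update(i, reward)
        self.level2[i].update(j, reward)


def toy_reward(assignment, rng):
    """Hypothetical normalized aggregate rate; not derived from any channel model."""
    if len(assignment) == 1:
        base = 0.5  # a single AP transmits alone
    elif assignment == {"A": "a0", "B": "b1"}:
        base = 0.9  # the one compatible pair for concurrent transmission
    else:
        base = 0.2  # concurrent transmission with strong mutual interference
    return min(1.0, max(0.0, base + rng.gauss(0, 0.05)))


rng = random.Random(0)
agent = HierarchicalUCB({"A": ["a0", "a1"], "B": ["b0", "b1"]})
for _ in range(3000):
    choice, assignment = agent.select()
    agent.update(choice, toy_reward(assignment, rng))

top1 = max(range(len(agent.level1.arms)), key=agent.level1.counts.__getitem__)
top2 = max(range(len(agent.level2[top1].arms)), key=agent.level2[top1].counts.__getitem__)
print(agent.level1.arms[top1], agent.level2[top1].arms[top2])
```

In this toy setup the agent should concentrate its plays on the concurrent AP set with the compatible station pair. Splitting the decision in two keeps each agent's arm set small compared with one flat bandit over every AP-station combination, which is the usual motivation for a hierarchical design.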
