Challenger-Based Combinatorial Bandits for Subcarrier Selection in OFDM Systems

Mohsen Amiri, V Venktesh, Sindri Magnússon

TL;DR

The paper reframes OFDM subcarrier selection as a top-$m$ arm identification problem in stochastic linear bandits and introduces Challenger Champion Sampling (CCS), a shortlist-based, latency-aware approach that focuses on informative champion–challenger comparisons. CCS maintains a small set of current champions and a rotating shortlist of challengers, uses a gap-index to guide sampling, and employs a largest-variance rule to choose measurements, achieving $(\varepsilon,m,\delta)$-PAC guarantees. Theoretical analysis provides a high-probability bound on stopping time via a problem-dependent complexity term, while simulations in a realistic OFDM downlink show CCS yields orders-of-magnitude reductions in gap-index computations and runtime with near-perfect top-$m$ accuracy. The method offers a practical, tunable speed–accuracy trade-off for AI-enabled communication systems and lays groundwork for extensions to RB contiguity, fairness, and hardware-in-the-loop evaluation.

Abstract

This paper investigates the identification of the top-$m$ user-scheduling sets in multi-user MIMO downlink, which is cast as a combinatorial pure-exploration problem in stochastic linear bandits. Because the action space grows exponentially, exhaustive search is infeasible. We therefore adopt a linear utility model to enable efficient exploration and reliable selection of promising user subsets. We introduce a gap-index framework that maintains a shortlist of current estimates of champion arms (top-$m$ sets) and a rotating shortlist of challenger arms that pose the greatest threat to the champions. This design focuses on measurements that yield the most informative gap-index-based comparisons, resulting in significant reductions in runtime and computation compared to state-of-the-art linear bandit methods, with high identification accuracy. The method also exposes a tunable trade-off between speed and accuracy. Simulations on a realistic OFDM downlink show that shortlist-driven pure exploration makes online, measurement-efficient subcarrier selection practical for AI-enabled communication systems.
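The champion/challenger idea can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the function name `ccs_sketch`, the constant confidence multiplier `beta`, the ridge regularizer `reg`, and the rule "shortlist the best-looking non-champions" are all assumptions made for illustration. The sketch keeps a ridge-regression estimate of the linear utility, scores each challenger against each champion with a gap index (estimated gap plus a confidence width), stops once no challenger can beat a champion by more than `eps`, and otherwise samples whichever arm of the most ambiguous pair has the larger variance.

```python
import numpy as np

def ccs_sketch(arms, pull, m, shortlist_size, eps=0.1, beta=1.0, reg=1.0,
               max_rounds=2000):
    """Illustrative champion/challenger loop (hypothetical, simplified).

    arms: (K, d) feature matrix; pull(k) returns a noisy reward for arm k.
    Returns (estimated top-m index set, number of rounds used).
    """
    n_arms, dim = arms.shape
    V = reg * np.eye(dim)   # regularized design matrix
    b = np.zeros(dim)       # sum of reward-weighted features
    for t in range(max_rounds):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b
        order = np.argsort(-(arms @ theta_hat))
        champions = order[:m]
        # Assumption: the shortlist is the best-looking non-champions.
        challengers = order[m:m + shortlist_size]

        # Gap index for every (challenger, champion) pair.
        diff = arms[challengers][:, None, :] - arms[champions][None, :, :]
        quad = np.einsum("smd,de,sme->sm", diff, V_inv, diff)
        width = np.sqrt(np.maximum(quad, 0.0))
        gap_index = diff @ theta_hat + beta * width

        s, c = np.unravel_index(np.argmax(gap_index), gap_index.shape)
        if gap_index[s, c] <= eps:
            return set(champions.tolist()), t

        # Largest-variance rule within the most ambiguous pair.
        pair = np.array([challengers[s], champions[c]])
        variances = np.einsum("pd,de,pe->p", arms[pair], V_inv, arms[pair])
        chosen = int(pair[np.argmax(variances)])

        x = arms[chosen]
        V += np.outer(x, x)
        b += pull(chosen) * x
    return set(champions.tolist()), max_rounds

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([1.0, 0.9, 0.8, 0.2, 0.1, 0.0])  # toy utilities
    arms = np.eye(6)                                   # one feature per arm
    noisy = lambda k: float(arms[k] @ theta + 0.05 * rng.normal())
    top, rounds = ccs_sketch(arms, noisy, m=3, shortlist_size=3)
    print(sorted(top), rounds)
```

Shrinking `shortlist_size` reduces the number of gap indices computed per round at the risk of overlooking a strong challenger, which is one way to read the speed versus accuracy trade-off mentioned in the abstract.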

Paper Structure

This paper contains 14 sections, 3 theorems, 16 equations, 2 figures, 1 table, 1 algorithm.

Key Result

Lemma 1

If the SNR estimation error in dB, $\xi_{i,t}$, is sub-Gaussian with proxy variance $\sigma_\xi^2$, then the induced reward noise is sub-Gaussian with proxy variance $\sigma_\eta^2 \;\le\; (\tfrac{\ln 10}{10\ln 2})^{\!2}\sigma_\xi^2$, uniformly over $\gamma_i \ge 0$.
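One way to see where the constant in Lemma 1 comes from, assuming the reward is the spectral efficiency $\log_2(1+\gamma_i)$ with $\gamma_i$ the linear SNR (this reward model is an assumption here, not stated in this section): as a function of the SNR in dB, $x \mapsto \log_2(1 + 10^{x/10})$ has slope $\tfrac{\ln 10}{10 \ln 2} \cdot \tfrac{10^{x/10}}{1 + 10^{x/10}}$, which never exceeds $\tfrac{\ln 10}{10 \ln 2}$. The Python sketch below checks this slope bound numerically and compares the empirical standard deviation of the induced reward noise against the bound for Gaussian dB errors, a special case of sub-Gaussian noise. The helper names are hypothetical.

```python
import math
import numpy as np

# Constant from Lemma 1: ln(10) / (10 ln 2), roughly 0.3322.
LIPSCHITZ_BOUND = math.log(10) / (10 * math.log(2))

def rate_from_db(snr_db):
    """Assumed reward: log2(1 + SNR), with the SNR supplied in dB."""
    snr_db = np.asarray(snr_db, dtype=float)
    return np.log2(1.0 + 10.0 ** (snr_db / 10.0))

def max_slope(snr_db_grid):
    """Largest finite-difference slope of the reward over a dB grid."""
    rates = rate_from_db(snr_db_grid)
    return float(np.max(np.abs(np.diff(rates) / np.diff(snr_db_grid))))

def induced_noise_std(true_snr_db, sigma_xi, n_samples=200_000, seed=0):
    """Empirical std of the reward when the dB SNR carries Gaussian error."""
    rng = np.random.default_rng(seed)
    xi = rng.normal(0.0, sigma_xi, size=n_samples)
    return float(np.std(rate_from_db(true_snr_db + xi)))

if __name__ == "__main__":
    grid = np.linspace(-30.0, 60.0, 9001)
    print("bound on slope:", LIPSCHITZ_BOUND)
    print("max slope on grid:", max_slope(grid))
    for snr_db in (0.0, 10.0):
        print(snr_db, "dB:", induced_noise_std(snr_db, sigma_xi=2.0),
              "<=", LIPSCHITZ_BOUND * 2.0)
```

The slope approaches the bound only at high SNR, so at low or moderate SNR the induced reward noise is noticeably smaller than the worst-case value in the lemma.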

Figures (2)

  • Figure 1: Top-$m$ arm identification runtime comparison between CCS, LinGIFA (LGI), LinUGapE (LUG) and LinGapE (LG).
  • Figure 2: Variation in correctness and latency of CCS with $|C_t|$.

Theorems & Definitions (6)

  • Lemma 1: Sub-Gaussian reward noise under dB-SNR errors
  • proof
  • Definition 1
  • Theorem 1
  • Lemma 2
  • proof