Bayesian Optimization for Non-Cooperative Game-Based Radio Resource Management
Yunchuan Zhang, Jiechen Chen, Junshuo Liu, Robert C. Qiu
TL;DR
This work addresses stable spectrum sharing in multi-base-station networks where each utility is costly to evaluate and available only as a black box. It introduces PPR-UCB, a Bayesian optimization policy that uses martingale-based prior-posterior ratio confidence sets to approximate pure Nash equilibria in non-cooperative downlink power-control games. By modeling each base station's utility with a Gaussian process and deriving anytime-valid confidence sets, the method converges to near-equilibrium configurations in a data-efficient manner. Experimental results in a multi-cell MIMO setting demonstrate improved efficiency and scalability over existing Bayesian and RL-based approaches, highlighting practical potential for coordinated resource management.
Abstract
Radio resource management in modern cellular networks often requires optimizing complex utility functions that may conflict across base stations (BSs). Efficiently coordinating resource allocation strategies across BSs to ensure stable network service is challenging, especially when each utility is accessible only via costly, black-box evaluations. This paper formulates resource allocation among spectrum-sharing BSs as a non-cooperative game, with the goal of aligning their allocation incentives toward a stable outcome. To this end, we propose PPR-UCB, a novel Bayesian optimization (BO) strategy that learns from sequential decision-evaluation pairs to approximate pure Nash equilibrium (PNE) solutions. PPR-UCB applies martingale techniques to Gaussian process (GP) surrogates and constructs high-probability confidence bounds to quantify the uncertainty in the utilities. Experiments on downlink transmission power allocation in a multi-cell multi-antenna system demonstrate that PPR-UCB identifies effective equilibrium solutions with only a few data samples.
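
To make the idea of approximating a PNE from GP confidence bounds concrete, the following is a minimal sketch and not the PPR-UCB algorithm itself. It assumes a two-BS game on a discretized power grid, a hypothetical toy utility (rate under interference minus a power cost), noise-free evaluations, and a standard fixed-width GP band (mean ± beta·std) in place of the paper's martingale-based prior-posterior ratio confidence sets. The names `rbf`, `gp_bounds`, `game_regret`, `toy_utilities`, and `bands`, and all parameter values, are illustrative assumptions. The regret of a joint action is the largest gain any single player could obtain by deviating unilaterally; it is zero exactly at a PNE.

```python
import numpy as np

def rbf(A, B, ell=0.3):
    """Squared-exponential kernel between row-stacked points A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / ell**2)

def gp_bounds(X, y, Xs, beta=2.0, noise=1e-3):
    """Heuristic GP band (mean -/+ beta * std) on Xs given observations (X, y)."""
    K = rbf(X, X) + noise * np.eye(len(X))  # noise acts as jitter here
    Ks = rbf(Xs, X)
    y0 = y.mean()  # constant prior mean, estimated from the data
    mean = y0 + Ks @ np.linalg.solve(K, y - y0)
    var = 1.0 - np.einsum("ij,ji->i", Ks, np.linalg.solve(K, Ks.T))
    std = np.sqrt(np.clip(var, 0.0, None))
    return mean - beta * std, mean + beta * std

def game_regret(dev1, own1, dev2, own2):
    """Two-player regret on a grid indexed [a1, a2].

    dev_i is the utility surface used for player i's best deviation and
    own_i the surface used for staying put. Passing the true utilities for
    both gives the exact regret, which is zero exactly at a PNE.
    """
    r1 = dev1.max(axis=0, keepdims=True) - own1  # player 1 varies a1
    r2 = dev2.max(axis=1, keepdims=True) - own2  # player 2 varies a2
    return np.maximum(r1, r2)

def toy_utilities(p1, p2, gain=0.5, cost=1.0):
    """Hypothetical utility: rate under interference minus a power cost."""
    u1 = np.log2(1.0 + p1 / (0.1 + gain * p2)) - cost * p1
    u2 = np.log2(1.0 + p2 / (0.1 + gain * p1)) - cost * p2
    return u1, u2

grid = np.linspace(0.0, 1.0, 11)
P1, P2 = np.meshgrid(grid, grid, indexing="ij")
Xs = np.column_stack([P1.ravel(), P2.ravel()])
U1, U2 = toy_utilities(P1, P2)
true_regret = game_regret(U1, U1, U2, U2)  # used only to score the result

def bands(idx):
    """GP bands for both players, reshaped to the [a1, a2] grid."""
    out = []
    for U in (U1, U2):
        lo, hi = gp_bounds(Xs[idx], U.ravel()[idx], Xs)
        out += [lo.reshape(P1.shape), hi.reshape(P1.shape)]
    return out  # lo1, hi1, lo2, hi2

rng = np.random.default_rng(0)
queried = [int(i) for i in rng.choice(len(Xs), size=5, replace=False)]
for _ in range(20):
    lo1, hi1, lo2, hi2 = bands(queried)
    # Optimistic regret estimate: weakest deviations against best-case payoff.
    optimistic = game_regret(lo1, hi1, lo2, hi2)
    queried.append(int(np.argmin(optimistic)))

lo1, hi1, lo2, hi2 = bands(queried)
# Pessimistic regret estimate: strongest deviations against worst-case payoff.
pessimistic = game_regret(hi1, lo1, hi2, lo2).ravel()
best = min(set(queried), key=lambda i: pessimistic[i])
print("reported powers:", Xs[best], "true regret:", true_regret.ravel()[best])
```

The same `game_regret` helper serves three roles depending on which surfaces are passed in: exact regret (true utilities), an optimistic estimate that drives sampling, and a pessimistic estimate used to pick the reported configuration. A fixed `beta` gives no formal coverage guarantee; it only stands in for a properly calibrated confidence set.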
