Static Deep Q-learning for Green Downlink C-RAN
Yuchao Chang, Hongli Wang, Wen Chen, Yonghui Li, Naofal Al-Dhahir
TL;DR
This work tackles energy efficiency in downlink C-RAN by jointly maximizing throughput and reducing power under UE throughput constraints ${\widetilde{R}_{b,u}}$. It introduces Static Deep Q-Learning (SDQL), a multi-Q-table reinforcement learning method that models power-offset decisions as an MDP: the state is defined by the reference signal received power (RSRP), the actions are remote radio head (RRH) power reductions ${\Delta P_{b,u}}$, and each UE has its own Q-table whose values are updated with a standard temporal-difference rule. In simulations, SDQL achieves a higher average power reduction than the activation and sleep baselines while maintaining UE satisfaction and low computational complexity. The method advances green wireless network design by enabling interference-aware, data-driven power management in C-RAN with scalable, per-UE learning dynamics.
Abstract
Power saving is a main pillar in the operation of wireless communication systems. In this paper, we investigate the capability of the cloud radio access network (C-RAN) to reduce power consumption according to user equipment (UE) requirements. To reduce long-term C-RAN energy consumption, we formulate an optimization problem that manages the downlink power through the design of a power offset parameter, without degrading the UE requirements. Considering stochastic traffic arrivals at UEs, we first formulate the problem as a Markov decision process (MDP) and then set up a dual-objective optimization problem in terms of the downlink throughput and power. To solve this optimization problem, we develop a novel static deep Q-learning (SDQL) algorithm that maximizes the downlink throughput and minimizes the downlink power. In the proposed algorithm, we design multiple Q-tables, one per UE, to simultaneously optimize the power reductions of activated remote radio heads (RRHs). To maximize the cumulative reward, defined in terms of the downlink throughput loss and power reduction, the algorithm adjusts the power reductions of activated RRHs through continuous interaction with the environment. Simulation results show that the proposed algorithm achieves a higher average power reduction than the activation and sleep schemes and has low computational complexity.
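As a rough illustration of the per-UE Q-table idea described above, the following Python sketch keeps one tabular Q-function per UE and applies the standard Q-learning temporal-difference update. It is not the authors' SDQL implementation. The class name `PerUEQTables`, the helper `reward`, the quantization of RSRP into integer state indices, the discrete power-reduction levels, the epsilon-greedy policy, and the linear reward that trades power reduction against throughput loss are all assumptions made for this example.

```python
import random


def reward(power_reduction, throughput_loss, penalty=1.0):
    """Hypothetical reward: favor power reduction, penalize throughput loss.

    The linear form and the `penalty` weight are assumptions for this sketch.
    """
    return power_reduction - penalty * throughput_loss


class PerUEQTables:
    """One Q-table per UE: q[ue][state][action] (illustrative names).

    Assumed encoding: `state` is an index of a quantized RSRP level and
    `action` is an index into a list of discrete power-reduction levels.
    """

    def __init__(self, num_ues, num_states, num_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.num_actions = num_actions
        self.rng = random.Random(seed)
        self.q = [[[0.0] * num_actions for _ in range(num_states)]
                  for _ in range(num_ues)]

    def select_action(self, ue, state):
        """Epsilon-greedy choice over this UE's own Q-table."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.num_actions)
        row = self.q[ue][state]
        return max(range(self.num_actions), key=row.__getitem__)

    def update(self, ue, state, action, r, next_state):
        """Standard Q-learning TD update, applied only to the given UE's table."""
        td_target = r + self.gamma * max(self.q[ue][next_state])
        td_error = td_target - self.q[ue][state][action]
        self.q[ue][state][action] += self.alpha * td_error
        return self.q[ue][state][action]


if __name__ == "__main__":
    tables = PerUEQTables(num_ues=2, num_states=3, num_actions=4)
    a = tables.select_action(ue=0, state=1)
    r = reward(power_reduction=3.0, throughput_loss=0.5)
    tables.update(ue=0, state=1, action=a, r=r, next_state=2)
    print(tables.q[0][1])
```

Because each UE owns a separate table, an update for one UE never touches another UE's values, which is what allows the power reductions to be learned in parallel in this sketch.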
