Reinforcement Learning Based Goodput Maximization with Quantized Feedback in URLLC
Hasan Basri Celebi, Mikael Skoglund
TL;DR
This work targets goodput optimization in URLLC under quantized CSI feedback for time-varying channels. It introduces a two-part framework: a learning-based estimator that feeds ten empirical moments to XGBoost to track the Rician-$K$ factor, and a reinforcement-learning scheme that adapts quantization levels and rates via Q-learning within an MDP, aided by dual feedback channels. The method handles varying channel statistics explicitly by predicting $K$ and dynamically updating the feedback policy to maximize the goodput $G = r(1-\epsilon)$, achieving near-optimal goodput as channel conditions drift. In practice, the approach reduces training overhead by keeping learning at the receiver and by using the training feedback channel only when rates are updated, enabling responsive URLLC operation with manageable latency. The results demonstrate effective tracking of $K$ and rapid convergence of the RL policy, indicating significant potential for deployment in beyond-5G industrial applications.
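As a rough illustration of the two quantities named above, the following sketch computes a ten-element moment feature vector from simulated Rician envelope samples and evaluates the goodput expression $G = r(1-\epsilon)$. It is a minimal sketch under stated assumptions, not the authors' implementation: the choice of the first ten raw moments of $|h|$, the unit-power normalization, and all function names are hypothetical. In a full pipeline, a vector like `features` could be passed to a regressor such as `xgboost.XGBRegressor` trained on labelled $K$ values.

```python
import math
import random


def rician_envelope_samples(k_factor, n, rng):
    """Draw n envelope samples |h| of a unit-mean-power Rician channel."""
    los = math.sqrt(k_factor / (k_factor + 1.0))        # line-of-sight amplitude
    sigma = math.sqrt(1.0 / (2.0 * (k_factor + 1.0)))   # std of each scattered component
    return [
        math.hypot(los + sigma * rng.gauss(0, 1), sigma * rng.gauss(0, 1))
        for _ in range(n)
    ]


def empirical_moments(samples, order=10):
    """Raw empirical moments E[|h|^p] for p = 1..order (assumed feature set)."""
    n = len(samples)
    return [sum(x ** p for x in samples) / n for p in range(1, order + 1)]


def goodput(rate, error_prob):
    """Goodput G = r * (1 - epsilon)."""
    return rate * (1.0 - error_prob)


rng = random.Random(0)
features = empirical_moments(rician_envelope_samples(3.0, 50_000, rng))
```

With the unit-power normalization assumed here, the second moment in `features` should sit close to 1 regardless of $K$, while the higher moments carry the information about $K$.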
Abstract
This paper presents a comprehensive system model for goodput maximization with quantized feedback in Ultra-Reliable Low-Latency Communication (URLLC), focusing on dynamic channel conditions and feedback schemes. The study investigates a communication system in which the receiver provides quantized channel state information to the transmitter. The system adapts its feedback scheme using reinforcement learning, aiming to maximize goodput while accommodating varying channel statistics. We introduce a novel Rician-$K$ factor estimation technique that enables the communication system to optimize the feedback scheme. This dynamic approach increases overall performance, making it well suited to practical URLLC applications where channel statistics vary over time.
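To make the reinforcement-learning idea concrete, here is a toy tabular Q-learning loop in which the state is a quantized channel-gain index and the action is a transmission rate. It is only a sketch under simplifying assumptions that are not taken from the paper: the quantizer thresholds are fixed rather than learned, a packet is assumed to fail exactly when the rate exceeds $\log_2(1 + \mathrm{SNR}\,|h|^2)$ (an outage model rather than a finite-blocklength error model), fading is i.i.d. across slots, and every constant and name is hypothetical. The reward is the rate on success and zero otherwise, so its expectation equals the goodput $r(1-\epsilon)$.

```python
import math
import random

RATES = [0.5, 1.0, 2.0, 3.0]   # hypothetical candidate rates (bits per channel use)
THRESHOLDS = [0.5, 1.0, 2.0]   # hypothetical fixed quantizer boundaries on |h|^2
SNR = 10.0                     # hypothetical linear SNR
K_FACTOR = 3.0                 # hypothetical Rician K factor


def channel_gain(rng):
    """Sample |h|^2 for a unit-mean-power Rician channel."""
    los = math.sqrt(K_FACTOR / (K_FACTOR + 1.0))
    sigma = math.sqrt(1.0 / (2.0 * (K_FACTOR + 1.0)))
    re = los + sigma * rng.gauss(0, 1)
    im = sigma * rng.gauss(0, 1)
    return re * re + im * im


def quantize(gain):
    """Map a gain to its feedback index (number of thresholds it reaches)."""
    return sum(gain >= t for t in THRESHOLDS)


def train(steps=100_000, alpha=0.05, gamma=0.5, explore=0.1, seed=0):
    rng = random.Random(seed)
    q = [[0.0] * len(RATES) for _ in range(len(THRESHOLDS) + 1)]
    gain = channel_gain(rng)
    state = quantize(gain)
    for _ in range(steps):
        # Epsilon-greedy choice of a rate given the quantized feedback.
        if rng.random() < explore:
            action = rng.randrange(len(RATES))
        else:
            action = max(range(len(RATES)), key=lambda a: q[state][a])
        # Assumed outage model: success if the rate is supported by the channel.
        success = math.log2(1.0 + SNR * gain) >= RATES[action]
        reward = RATES[action] if success else 0.0
        next_gain = channel_gain(rng)
        next_state = quantize(next_gain)
        # Standard Q-learning update.
        target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (target - q[state][action])
        gain, state = next_gain, next_state
    return q


q_table = train()
# Greedy rate per feedback index after training.
policy = [RATES[max(range(len(RATES)), key=lambda a: row[a])] for row in q_table]
```

Because fading is i.i.d. in this toy setup, the next state does not depend on the chosen rate, so the discount factor mainly matters once the channel has memory or the action also changes the quantizer.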
