Loss-tolerant neural video codec aware congestion control for real time video communication
Zhengxu Xia, Hanchen Li, Junchen Jiang
TL;DR
The paper addresses the challenge of training reinforcement-learning-based congestion control for real-time video communication in a safe yet efficient manner. It introduces NVC-CC, an online RL CC that exploits the loss-tolerance of neural video codecs (notably GRACE) to avoid heavy safeguard policies, thereby accelerating learning and improving QoE. Empirical results show a 41% reduction in training time and QoE gains (0.3–1.6 dB mean SSIM, significant reductions in tail latency and stalls) across synthetic and some real traces, though real-world generalization remains a challenge. The work highlights a practical avenue for deploying adaptive RTC CCs that synergize with loss-resilient codecs, while noting limitations related to codec dependency, sim-to-real transfer, and fairness and scalability concerns.
Abstract
Because reinforcement learning (RL) can automatically produce control logic that is more adaptive than hand-crafted heuristics, numerous efforts have been made to apply RL to congestion control (CC) design for real-time video communication (RTC) applications, and these efforts have shown promising benefits over rule-based RTC CCs. Online reinforcement learning is often adopted to train the RL models so that the models can adapt directly to real network environments. However, its trial-and-error nature can also cause catastrophic degradation of the quality of experience (QoE) of RTC applications at run time. Thus, safeguard strategies, such as falling back to hand-crafted heuristics, can be run alongside RL models to keep the actions explored during training sensible, even though these safeguards interrupt the learning process and make it more challenging to discover optimal RL policies. The recent emergence of loss-tolerant neural video codecs (NVCs) naturally provides a layer of protection for the online learning of RL-based congestion control because of their resilience to packet loss, but this resilience has not yet been fully exploited in prior work. In this paper, we present an RL-based congestion control that is aware of, and takes advantage of, the packet loss tolerance of NVCs via the reward used in online RL. Through extensive evaluation on various videos and network traces in a simulated environment, we demonstrate that our NVC-aware CC, running with the loss-tolerant NVC, reduces training time by 41% compared to prior RL-based CCs. It also boosts the mean video quality by 0.3 to 1.6 dB, lowers the tail frame delay by 3 to 200 ms, and reduces video stalls by 20% to 77% in comparison with other baseline RTC CCs.
