Mowgli: Passively Learned Rate Control for Real-Time Video
Neil Agarwal, Rui Pan, Francis Y. Yan, Ravi Netravali
TL;DR
The paper addresses the practicality of data-driven rate control for real-time video by learning from existing production telemetry logs rather than through online exploration, which avoids disrupting quality of experience (QoE) during training. It introduces Mowgli, an offline, log-based framework that applies Soft Actor-Critic with conservative and distributional RL to learn a rate-control policy from logs of Google Congestion Control (GCC), with the logs represented as 1-second state windows and a reward that combines throughput, delay, and loss. Across emulated and real-world networks, Mowgli increases average video bitrate by 15–39% and reduces freeze rates by 60–100% relative to GCC, approaching the performance of online RL without degrading QoE during training. The approach emphasizes generalization, deployment practicality, and robustness to environmental noise, offering a viable path to data-driven gains in production video conferencing systems.
Abstract
Rate control algorithms are at the heart of video conferencing platforms, determining target bitrates that match dynamic network characteristics for high quality. Recent data-driven strategies have shown promise for this challenging task, but the performance degradation they introduce during training has been a nonstarter for many production services, precluding adoption. This paper aims to bolster the practicality of data-driven rate control by presenting an alternative avenue for experiential learning: leveraging purely existing telemetry logs produced by the incumbent algorithm in production. We observe that these logs contain effective decisions, although often at the wrong times or in the wrong order. To realize this approach despite the inherent uncertainty that log-based learning brings (i.e., lack of feedback for new decisions), our system, Mowgli, combines a variety of robust learning techniques (i.e., conservatively reasoning about alternate behavior to minimize risk and using a richer model formulation to account for environmental noise). Across diverse networks (emulated and real-world), Mowgli outperforms the widely deployed GCC algorithm, increasing average video bitrates by 15-39% while reducing freeze rates by 60-100%.
