AutoRec: Accelerating Loss Recovery for Live Streaming in a Multi-Supplier Market
Tong Li, Xu Yan, Bo Wu, Cheng Luo, Fuyu Wang, Jiuxiang Zhu, Haoyi Fang, Xinle Du, Ke Xu
TL;DR
AutoRec addresses slow loss recovery in live streaming under multi-supplier CDN constraints by having the sender reinject replicas of lost packets, guided by an adaptive redundancy framework. Its Redundancy Adapter computes a safe redundancy level under latency, cost, and goodput constraints, and its Reinjection Controller schedules replica retransmissions during off-mode periods or, opportunistically, during on-mode periods. The design requires no client changes and is integrated with QUIC. Large-scale measurements reveal pervasive retransmission loss and frequent on-off traffic patterns, motivating a design that exploits these patterns to improve timeliness and reduce video freezes; it is validated through testbed experiments and real-network deployments. Deployment in Tencent’s global CDN demonstrates practical viability and measurable QoE gains with minimal overhead.
Abstract
In a multi-supplier market, CDN vendors have limited permission to upgrade dual-side (i.e., server-side and client-side) loss tolerance schemes. Modern large-scale live streaming services therefore still rely on the automatic-repeat-request (ARQ) paradigm for loss recovery, which requires only server-side modifications. In this paper, we first conduct a large-scale measurement study of up to 50 million live streams. We find that, in the wild, packet loss varies dynamically and live streaming traffic frequently switches between on and off modes. We further find that recovery latency, inflated by ubiquitous retransmission loss, is a critical factor affecting client-side QoE in live streaming (e.g., video freezing). We then propose an enhanced recovery mechanism called AutoRec, which turns the drawback of on-off mode switching into an advantage for reducing loss recovery latency, without any modifications on the client side. AutoRec lets users customize their overhead tolerance and recovery latency tolerance, and it adapts its strategy as network conditions change so that recovery latency meets user demands whenever possible while overhead stays under control. We implement AutoRec on top of QUIC and evaluate it on a testbed and in deployments within real-world commercial services. The experimental results demonstrate the practicability and profitability of AutoRec.
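
To make the trade-off between recovery latency and overhead concrete, the following is a minimal illustrative sketch, not AutoRec's actual algorithm. It assumes a simplified model that the abstract does not state: each copy of a retransmitted packet is lost independently with probability `loss_rate`, and roughly a `loss_rate` fraction of traffic needs retransmission, so each extra copy costs about `loss_rate` of total traffic. The function name `choose_replicas` and its parameters are hypothetical.

```python
def choose_replicas(loss_rate, target_residual, max_overhead):
    """Pick how many copies of each retransmitted packet to send.

    Assumed model (hypothetical, for illustration only):
      - copies are lost independently with probability loss_rate, so a
        retransmission with k copies fails with probability loss_rate ** k;
      - about loss_rate of all traffic is retransmitted, so the extra
        copies add roughly (k - 1) * loss_rate overhead.
    """
    if not 0.0 <= loss_rate < 1.0:
        raise ValueError("loss_rate must be in [0, 1)")
    if target_residual <= 0.0 or max_overhead < 0.0:
        raise ValueError("target_residual must be > 0 and max_overhead >= 0")
    if loss_rate == 0.0:
        return 1  # no loss: a single copy is enough

    # Smallest k whose failure probability meets the latency-driven target.
    k_needed = 1
    while loss_rate ** k_needed > target_residual:
        k_needed += 1

    # Largest k that the overhead tolerance allows.
    k_budget = 1 + int(max_overhead / loss_rate)

    # Meet the target when possible, but never exceed the overhead budget.
    return min(k_needed, k_budget)


if __name__ == "__main__":
    # Generous budget: the residual-loss target decides the replica count.
    print(choose_replicas(0.2, 0.01, 1.0))
    # Tight budget: the overhead tolerance caps the replica count.
    print(choose_replicas(0.2, 0.01, 0.25))
```

Under this toy model, the overhead cap takes priority over the residual-loss target, which mirrors the abstract's goal of meeting recovery-latency demands "whenever possible" while keeping overhead under control.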
