Vidaptive: Efficient and Responsive Rate Control for Real-Time Video on Variable Networks

Pantea Karimi, Sadjad Fouladi, Vibhaalakshmi Sivaraman, Mohammad Alizadeh

TL;DR

Vidaptive addresses the slow response of traditional video rate controllers to network variability by decoupling encoder output from on-wire transmission and injecting dummy padding so that the video stream behaves like a backlogged flow controlled by a delay-based congestion controller. It introduces an online encoder-rate controller that tunes the encoder target bitrate through a percentile-based headroom computation, keeping the video bitrate aligned with the congestion controller's sending rate while bounding frame latency. Across WebRTC experiments on cellular and Pantheon traces, Vidaptive delivers substantial gains in link utilization, video bitrate, and perceptual quality (VMAF/SSIM/PSNR) and reduces tail latency, all without requiring encoder modifications. This practical, encoder-agnostic approach offers a viable path to tighter bandwidth utilization for real-time video on highly variable networks.

Abstract

Real-time video streaming relies on rate control mechanisms to adapt video bitrate to network capacity while maintaining high utilization and low delay. However, current video rate controllers, such as Google Congestion Control (GCC), are very slow to respond to network changes, leading to link under-utilization and latency spikes. While recent delay-based congestion control algorithms promise high efficiency and rapid adaptation to variable conditions, low-latency video applications have been unable to adopt these schemes due to the intertwined relationship between video encoders and rate control in current systems. This paper introduces Vidaptive, a new rate control mechanism designed for low-latency video applications. Vidaptive decouples packet transmission decisions from encoder output, injecting "dummy" padding traffic as needed to treat video streams akin to backlogged flows controlled by a delay-based congestion controller. Vidaptive then adapts the target bitrate of the encoder based on delay measurements to align the video bitrate with the congestion controller's sending rate. Our evaluations atop Google's implementation of WebRTC show that, across a set of cellular traces, Vidaptive achieves ~1.5x higher video bitrate and 1.4 dB higher SSIM, 1.3 dB higher PSNR, and 40% higher VMAF, and it reduces 95th-percentile frame latency by 2.2 s with a slight 17 ms increase in median frame latency.
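
The padding idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a simplified rate-based view in which the congestion controller grants a byte budget per pacing interval, and the function name `padding_bytes` and its parameters are hypothetical.

```python
def padding_bytes(cc_rate_bps: int, interval_ms: int, queued_video_bytes: int) -> int:
    """Dummy bytes needed to fill one pacing interval (hypothetical helper).

    Assumption: the congestion controller allows cc_rate_bps bits per second,
    so one interval of interval_ms milliseconds has a budget of
    cc_rate_bps * interval_ms / 8000 bytes. Video is sent first; padding
    fills only what the encoder left unused.
    """
    budget = cc_rate_bps * interval_ms // 8000  # bits/s * ms -> bytes
    return max(0, budget - queued_video_bytes)


# Encoder under-produces: 1 Mbit/s over 10 ms allows 1250 bytes, 500 are video.
print(padding_bytes(1_000_000, 10, 500))   # 750
# Encoder over-produces: no padding is added.
print(padding_bytes(1_000_000, 10, 2000))  # 0
```

With padding filling the gap, the flow looks backlogged to the congestion controller even when the encoder's output falls short of the sending rate.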

Paper Structure (22 sections, 3 equations, 23 figures, 1 table)

Figures (23)

  • Figure 1: Utilization, frame quality, and latencies of Copa on a backlogged flow, GCC on a video flow, and Copa + Vidaptive on a video flow. GCC is very slow to match the available capacity and under-utilizes the link in the steady state. Copa + Vidaptive responds much faster to link variations and is similar to Copa's performance on a backlogged flow.
  • Figure 2: Video encoder's response to a time-varying "Target" input bitrate. "Achieved" reflects the encoder's output rate. The encoder is slow to increase its output rate and exhibits lots of variation around the average output rate in the steady state.
  • Figure 3: Vidaptive Design. Vidaptive uses a window-based Congestion Controller, Pacer, and a new Dummy Generator to decouple the rate at which traffic is sent on the wire from the encoder. The Encoder Rate Controller monitors frame delays to trigger latency safeguards and picks a new target bitrate based on the discrepancy between the CC-Rate and the video encoder's current bitrate.
  • Figure 4: Finding target bitrate fraction ($\alpha$). Given a set of frame service time samples (left), whose $\lambda^{th}$ percentile and outliers higher than $P = \Delta$ are shown in red, we find the $\alpha$ that matches the counterfactual $\lambda^{th}$ percentile to $P$ (middle). We update the counterfactual values of frame service time with $\alpha$ to have fewer outliers above $P$ (right).
  • Figure 5: $\alpha$'s response to link and video encoder variations. $\alpha$ picks lower values (lower target bitrate) when the link capacity or encoder output varies significantly to maintain good control over the frame service times. $d_i$ denotes the frame service time in milliseconds.
  • ...and 18 more figures
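
The percentile matching described in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: it assumes the samples $d_i$ were measured with the encoder targeting the full CC-Rate, and that a frame's service time scales linearly with the target bitrate fraction $\alpha$, so the counterfactual service time is $\alpha \cdot d_i$. The function names and the clamping bounds `lo` and `hi` are hypothetical.

```python
def percentile(samples, q):
    """Nearest-rank percentile for an integer q in 1..100."""
    ordered = sorted(samples)
    rank = max(1, (q * len(ordered) + 99) // 100)  # integer ceil(q * n / 100)
    return ordered[rank - 1]


def target_fraction(service_times_ms, lam, p_ms, lo=0.05, hi=1.0):
    """Pick alpha so the counterfactual lam-th percentile equals p_ms.

    Assumption: counterfactual service time is alpha * d_i, hence
    alpha = p_ms / percentile(d, lam), clamped to [lo, hi].
    """
    alpha = p_ms / percentile(service_times_ms, lam)
    return min(hi, max(lo, alpha))


d = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]  # frame service times in ms
alpha = target_fraction(d, lam=90, p_ms=45)
counterfactual = [alpha * x for x in d]
print(alpha)                                  # 0.5
print(sum(x > 45 for x in d))                 # 6 outliers before scaling
print(sum(x > 45 for x in counterfactual))    # 1 outlier after scaling
```

Under this linear-scaling assumption, a higher or more variable set of service times pushes the percentile up and therefore $\alpha$ down, which matches the qualitative behavior the Figure 5 caption describes.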