Scalable Video Conferencing Using SDN Principles

Oliver Michel, Satadal Sengupta, Hyojoon Kim, Ravi Netravali, Jennifer Rexford

TL;DR

This work tackles the scalability bottleneck of video-conferencing SFUs by rethinking their architecture through SDN-inspired principles. It introduces Scallop, a three-tier SFU design: a hardware data plane on a programmable switch (Tofino2) that handles packet replication, forwarding, and per-receiver rate adaptation; a software control plane that manages signaling, connectivity, and bandwidth estimation via REMB; and a switch agent for intelligent control. Compared with traditional software SFUs, the approach supports $7$–$210\times$ more concurrent meetings and cuts SFU-induced latency by roughly $26\times$, with the data plane processing the vast majority of traffic ($\approx 96\%$ of packets) directly in hardware. These results suggest that offloading latency-critical media processing to specialized hardware can substantially scale video-conferencing applications (VCAs) while preserving WebRTC semantics, paving the way for large-scale deployments and more predictable QoE.

Abstract

Video-conferencing applications face an unwavering surge in traffic, stressing their underlying infrastructure in unprecedented ways. This paper rethinks the key building block for conferencing infrastructures -- selective forwarding units (SFUs). SFUs relay and adapt media streams between participants and, today, run in software on general-purpose servers. Our main insight, discerned from dissecting the operation of production SFU servers, is that SFUs largely mimic traditional packet-processing operations such as dropping and forwarding. Guided by this, we present Scallop, an SDN-inspired SFU that decouples video-conferencing applications into a hardware-based data plane for latency-sensitive and frequent media operations, and a software control plane for the (infrequent) remaining tasks, such as analyzing feedback signals. Our Tofino-based implementation fully supports WebRTC and delivers 7-210 times improved scaling over a 32-core commodity server, while reaping performance improvements by cutting forwarding-induced latency by 26 times.
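
The split described in the abstract can be illustrated with a small sketch. This is not Scallop's implementation (which targets a Tofino switch); it is a minimal Python model, under stated assumptions, of how an infrequent control-plane task (reacting to a receiver's bandwidth estimate) can be reduced to a table update, while the frequent per-packet work is only a lookup followed by forward-or-drop. The class names, the notion of numbered quality layers, and the bitrate thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class MediaPacket:
    meeting: str
    sender: str
    layer: int  # hypothetical quality layer; 0 is the base layer


@dataclass
class DataPlane:
    """Per-packet path: table lookup, replicate, drop. No feedback analysis here."""
    # meeting -> {receiver -> highest layer that receiver should get}
    rules: dict = field(default_factory=dict)

    def process(self, pkt: MediaPacket) -> list:
        receivers = self.rules.get(pkt.meeting, {})
        # Replicate to every other participant whose allowed layer covers this packet.
        return [r for r, max_layer in receivers.items()
                if r != pkt.sender and pkt.layer <= max_layer]


class ControlPlane:
    """Infrequent path: membership changes and feedback-driven rule updates."""

    def __init__(self, data_plane: DataPlane, layer_kbps=(300, 800, 1500)):
        self.dp = data_plane
        # Assumed cumulative bitrate (kbps) needed to receive up to each layer.
        self.layer_kbps = layer_kbps

    def join(self, meeting: str, participant: str) -> None:
        # New receivers start at the highest layer until feedback says otherwise.
        self.dp.rules.setdefault(meeting, {})[participant] = len(self.layer_kbps) - 1

    def on_bandwidth_estimate(self, meeting: str, participant: str, kbps: float) -> None:
        # Pick the highest layer that fits the estimate; the base layer is always kept.
        allowed = 0
        for i, need in enumerate(self.layer_kbps):
            if kbps >= need:
                allowed = i
        self.dp.rules[meeting][participant] = allowed


if __name__ == "__main__":
    dp = DataPlane()
    cp = ControlPlane(dp)
    for p in ("alice", "bob", "carol"):
        cp.join("m1", p)
    cp.on_bandwidth_estimate("m1", "carol", 500)  # carol can only sustain the base layer
    print(dp.process(MediaPacket("m1", "alice", 2)))  # forwarded to bob only
    print(dp.process(MediaPacket("m1", "alice", 0)))  # forwarded to bob and carol
```

In this model the data plane never inspects feedback; it only consults state that the control plane installed, which is the property that makes the per-packet path a candidate for hardware offload.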

Paper Structure

This paper contains 30 sections, 25 figures, 3 tables.

Figures (25)

  • Figure 1: VCA Architectures: P2P vs. SFU.
  • Figure 2: Number of media streams per meeting in campus trace.
  • Figure 3: Video jitter while adding participants to the SFU.
  • Figure 4: Video frame rate while adding participants to the SFU.
  • Figure 5: SFU Design Choices.
  • ...and 20 more figures