Saving Private WAN: Using Internet Paths to Offload WAN Traffic in Conferencing Services

Bhaskar Kataria, Palak LNU, Rahul Bothra, Rohan Gandhi, Debopam Bhattacherjee, Venkata N. Padmanabhan, Irena Atov, Sriraam Ramakrishnan, Somesh Chaturmohta, Chakri Kotipalli, Rui Liang, Ken Sueda, Xin He, Kevin Hinton

TL;DR

This paper tackles the cost and performance challenge of global video conferencing by offloading a portion of WAN traffic to cheaper Internet routing. It combines a large-scale latency measurement study with two systems: Titan, which incrementally shifts traffic to Internet paths in production, and Titan-Next, a research prototype that jointly optimizes MP DC placement and routing to minimize WAN peak usage while preserving end-to-end latency. Results on 5 weeks of production data show that Titan-Next reduces the sum of peak bandwidth on WAN links, which determines operational network cost, by up to 61% compared to state-of-the-art baselines, with prediction-driven planning achieving strong accuracy and manageable migrations. The work provides practical, production-tested approaches, and the authors plan to open-source parts of the measurement data.

Abstract

Large-scale video conferencing services incur significant network cost while serving surging global demands. Our work systematically explores the opportunity to offload a fraction of this traffic from the WAN to the Internet, a cheaper routing option already offered by cloud providers, without a drop in application performance. First, with a large-scale latency measurement study with 3.5 million data points per day spanning 241K source cities and 21 data centers across the globe, we demonstrate that Internet paths perform comparably to or better than the private WAN for parts of the world (e.g., Europe and North America). Next, we present Titan, a live (12+ months) production system that carefully moves a fraction of the conferencing traffic to the Internet using the above observation. Finally, we propose Titan-Next, a research prototype that jointly assigns the conferencing server and routing option (Internet or WAN) for individual calls. With 5 weeks of production data, we show that Titan-Next reduces the sum of peak bandwidth on WAN links, which defines the operational network cost, by up to 61% compared to state-of-the-art baselines. We will open-source parts of the measurement data.
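To make the cost metric in the abstract concrete, the following is a minimal Python sketch of "sum of peak bandwidth on WAN links". It is an illustration under stated assumptions, not the authors' implementation: the function name `sum_of_link_peaks`, the link names, and all bandwidth samples are hypothetical, and the measurement window and units are chosen arbitrarily.

```python
from typing import Dict, List


def sum_of_link_peaks(usage_mbps: Dict[str, List[float]]) -> float:
    """Return the sum, over WAN links, of each link's peak bandwidth.

    `usage_mbps` maps a (hypothetical) link name to its bandwidth samples
    over some billing window. Links with no samples are skipped.
    """
    return sum(max(samples) for samples in usage_mbps.values() if samples)


# Hypothetical samples in Mbps; these numbers are made up for illustration.
wan_only = {"link_a": [40.0, 90.0, 60.0], "link_b": [30.0, 50.0, 80.0]}
with_offload = {"link_a": [40.0, 55.0, 50.0], "link_b": [30.0, 45.0, 50.0]}

baseline = sum_of_link_peaks(wan_only)        # 90 + 80
offloaded = sum_of_link_peaks(with_offload)   # 55 + 50
reduction = 1 - offloaded / baseline          # relative drop in the metric

print(f"baseline={baseline}, offloaded={offloaded}, reduction={reduction:.1%}")
```

Because the metric sums per-link maxima, moving traffic off a link only helps if it lowers that link's peak; shaving traffic at off-peak times leaves the metric unchanged.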

Paper Structure (38 sections, 20 figures, 4 tables)

Figures (20)

  • Figure 1: WAN versus Internet routing. Using WAN routing, the traffic from MP exits the WAN closest to the user (cold-potato). Using Internet routing, the traffic exits the WAN closest to the DC (hot-potato).
  • Figure 2: Locations of the $21$ Azure DCs used in the measurements. Orange triangles denote the locations of representative DCs used in Fig. \ref{fig:measure:latencyheatmap}.
  • Figure 3: Comparing latencies for WAN and Internet for 21 Azure DCs in 5 different continents. Negative difference indicates Internet is better. We label latency = RTT. The legends denote the DC locations.
  • Figure 4: Fraction ($F$) of times Internet provides better or comparable (within $10$ msec) latency compared to WAN. SA denotes South Africa and US denotes the United States. Darker shade means higher $F$.
  • Figure 5: Difference in $F$ between different granularities and granularity = country. Cr indicates country.
  • ...and 15 more figures
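
The quantity $F$ in the Figure 4 caption can be sketched in Python as below. This is a hedged illustration only: the function name `fraction_internet_ok`, the sample RTT pairs, and the choice to treat a difference of exactly $10$ msec as "comparable" are assumptions, not details taken from the paper.

```python
from typing import Iterable, Tuple

COMPARABLE_MS = 10.0  # "within 10 msec" threshold from the Figure 4 caption


def fraction_internet_ok(
    pairs: Iterable[Tuple[float, float]],
    threshold_ms: float = COMPARABLE_MS,
) -> float:
    """Fraction of (wan_rtt, internet_rtt) pairs where Internet is better
    than WAN or at most `threshold_ms` worse (inclusive, by assumption)."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one measurement pair")
    ok = sum(1 for wan, inet in pairs if inet - wan <= threshold_ms)
    return ok / len(pairs)


# Hypothetical (WAN RTT, Internet RTT) samples in msec, made up for illustration.
samples = [(50.0, 45.0), (50.0, 58.0), (50.0, 75.0), (80.0, 90.0)]
F = fraction_internet_ok(samples)
print(F)
```

Here a negative difference means the Internet path is faster, matching the sign convention stated in the Figure 3 caption.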