Cost-effective and performant virtual WANs with CORNIFER

Anjali, Rachee Singh, Michael M. Swift

TL;DR

The paper addresses the strong dependence of virtual WAN performance and cost on hub topology. It introduces Cornifer, a MILP-based tool that uses global latency measurements to place hubs optimally and to explore performance-optimal, cost-optimal, and Pareto-frontier topologies, including an SLO-aware extension. Key findings show roughly 26% lower weighted client latency and a 28% reduction in hubs, with the mean_k heuristic offering near-optimal performance at modest cost and scaling to enterprises with hundreds of branches. The work enables cost-effective, high-performance virtual WAN deployments and can be integrated into cloud-provider portals.

Abstract

Virtual wide-area networks (WANs) are WAN-as-a-service cloud offerings that aim to bring the performance benefits of dedicated wide-area interconnects to enterprise customers. In this work, we show that the topology of a virtual WAN can render it both performance and cost inefficient. We develop Cornifer, a tool that designs virtual WAN topologies by deciding the number of virtual WAN nodes and their location in the cloud to minimize connection latency at low cost to enterprises. By leveraging millions of latency measurements from vantage points across the world to cloud points of presence, Cornifer designs virtual WAN topologies that improve weighted client latency by 26% and lower cost by 28% compared to the state-of-the-art. Cornifer identifies virtual WAN topologies at the Pareto frontier of the deployment cost vs. connection latency trade-off and proposes a heuristic for automatic selection of Pareto-optimal virtual WAN topologies for enterprises.
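
The cost vs. latency trade-off described above can be illustrated with a small sketch. This is not Cornifer's MILP formulation: it swaps in a brute-force search that is only practical for a handful of candidate hubs, and it makes simplifying assumptions that are not taken from the paper (each branch connects to its lowest-latency hub, deployment cost is just the number of hubs, and inter-hub latency is ignored). The names `best_topology` and `pareto_frontier` and the input layout are hypothetical.

```python
from itertools import combinations


def best_topology(latency, weights, k):
    """Exhaustively pick k hubs minimizing weighted mean branch-to-hub latency.

    latency[i][j] is the (assumed) latency from branch i to candidate hub j;
    weights[i] is the (assumed) traffic weight of branch i.
    """
    n_hubs = len(latency[0])
    total_weight = sum(weights)
    best = None
    for hubs in combinations(range(n_hubs), k):
        # Each branch is assigned to its lowest-latency hub in this set.
        cost = sum(
            w * min(row[h] for h in hubs) for row, w in zip(latency, weights)
        ) / total_weight
        if best is None or cost < best[0]:
            best = (cost, hubs)
    return best


def pareto_frontier(latency, weights):
    """Return (hub count, latency, hubs) points where an extra hub strictly helps."""
    frontier = []
    for k in range(1, len(latency[0]) + 1):
        cost, hubs = best_topology(latency, weights, k)
        # The best latency never increases with k, so a point is non-dominated
        # exactly when it strictly improves on the last point kept.
        if not frontier or cost < frontier[-1][1]:
            frontier.append((k, cost, hubs))
    return frontier


if __name__ == "__main__":
    # Toy data: 3 branches x 3 candidate hubs, latencies in milliseconds.
    latency = [[10, 50, 80], [60, 20, 70], [90, 65, 15]]
    weights = [1, 1, 2]
    for k, cost, hubs in pareto_frontier(latency, weights):
        print(k, cost, hubs)
```

Each frontier point is the best latency achievable at a given hub count. A selection heuristic such as the one the abstract mentions would then choose one of these points, but the rule for doing so is not specified in this summary and is not modeled here.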

Paper Structure (30 sections, 5 equations, 18 figures, 8 tables)

Figures (18)

  • Figure 1: Virtual WANs are cloud overlays. An example topology consists of hubs $A$, $B$ and $C$ that are network gateways providing regional entry points into the virtual WAN.
  • Figure 2: (a) Problematic virtualized WAN topology. Dotted lines in the cloud network demarcate geographical regions (e.g., S. America, N. America, W. Europe). (b) shows that 20% of the metros take sub-optimal routes.
  • Figure 3: Inter-domain and intra-domain latency from enterprise branch offices to virtual WANs. $L_2$ and $L_2'$ measure the latency from edge PoP to the nearest datacenter.
  • Figure 4: Cornifer design.
  • Figure 5: (a) shows the percentage of samples in which a non-default PoP performed better than the geo-default. (b) shows the number of times a faster non-default PoP was the same as the anycast-default, for metros where the geo-default differed from the anycast-default. (c) shows that the performance of non-default PoPs remains stable over a long period.
  • ...and 13 more figures