
Evaluation of TCP Congestion Control for Public High-Performance Wide-Area Networks

Fatih Berkay Sarpkaya, Andrea Francini, Bilgehan Erman, Shivendra Panwar

Abstract

Practitioners of a growing number of scientific and artificial-intelligence (AI) applications use High-Performance Wide-Area Networks (HP-WANs) for moving massive data sets between remote facilities. Accurate prediction of the flow completion time (FCT) is essential in these data-transfer workflows because compute and storage resources are tightly scheduled and expensive. We assess the viability of three TCP congestion control algorithms (CUBIC, BBRv1, and BBRv3) for massive data transfers over public HP-WANs, where limited control of critical data-path parameters precludes the use of Remote Direct Memory Access (RDMA) over Converged Ethernet (RoCEv2), which is known to outperform TCP in private HP-WANs. Extensive experiments on the FABRIC testbed indicate that the limited configuration control can also hinder TCP, especially through microburst-induced packet losses. Under these challenging conditions, we show that the highest FCT predictability is achieved by combining BBRv1 with traffic shaping applied before the HP-WAN entry points.

Paper Structure (10 sections, 5 figures)

This paper contains 10 sections and 5 figures.

Figures (5)

  • Figure 1: Logical representation of the massive-transfer data path.
  • Figure 2: Slice topology; L2PTP VCs terminate on router VMs R1–R4.
  • Figure 3: Goodput performance of router VM configuration options.
  • Figure 4: 10 TCP CUBIC flows with DPDK shaping at 80 Gb/s.
  • Figure 5: BBRv1 and BBRv3 performance with different configurations on the routers before the L2PTP tunnel.