
D3: An Adaptive Reconfigurable Datacenter Network

Johannes Zerwas, Chen Griner, Stefan Schmid, Chen Avin

TL;DR

D3 tackles the problem of static DCN topologies failing to sustain evolving traffic by introducing an adaptive reconfigurable datacenter network (RDCN) architecture, built on Sirius, that jointly tunes topology and packet scheduling. It combines three scheduler classes (Static, Rotor, and Demand-aware) through dynamic port partitioning and a decentralized control plane to achieve fast reconfiguration and responsive routing. The authors extend the Birkhoff-von Neumann decomposition to mixed topologies, present GreedyMixNet for efficient traffic partitioning, and validate the design with packet-level simulations that show throughput gains of up to $15\%$ and competitive flow completion times. The work demonstrates a practical, cost-aware pathway to higher DCN throughput and responsiveness, with analytical backing and empirical evidence across realistic workloads.

Abstract

The explosively growing communication traffic in datacenters imposes increasingly stringent performance requirements on the underlying networks. In recent years, researchers have developed innovative optical switching technologies that enable reconfigurable datacenter networks (RDCNs) which support very fast topology reconfigurations. This paper presents D3, a novel and feasible RDCN architecture that improves throughput and flow completion time. D3 quickly and jointly adapts its links and packet scheduling toward the evolving demand, combining both demand-oblivious and demand-aware behaviors when needed. D3 relies on a decentralized network control plane supporting greedy, integrated-multihop, IP-based routing, allowing it to react quickly and locally to topological changes without overheads. A rack-local synchronization and transport layer further support fast network adjustments. Moreover, we argue that D3 can be implemented using the recently proposed Sirius architecture (SIGCOMM 2020). We report on an extensive empirical evaluation using packet-level simulations. We find that D3 improves throughput by up to 15% and preserves competitive flow completion times compared to the state of the art. We further provide an analytical explanation of the superiority of D3, introducing an extension of the well-known Birkhoff-von Neumann decomposition, which may be of independent interest.
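
The abstract refers to the Birkhoff-von Neumann (BvN) decomposition, which writes a demand matrix with equal row and column sums as a weighted sum of permutation matrices (matchings). The following is a minimal sketch of the classic decomposition only, not of the paper's extension to mixed topologies. The function names `perfect_matching` and `bvn_decompose` are hypothetical, and the input is assumed to be a square, non-negative matrix whose rows and columns all sum to the same value.

```python
from fractions import Fraction


def perfect_matching(support):
    """Kuhn's augmenting-path matching on a bipartite support graph.

    support[i] lists the columns j where row i has a positive entry.
    Returns match[i] = column assigned to row i, or None if no perfect
    matching exists.
    """
    n = len(support)
    col_owner = [None] * n  # col_owner[j] = row currently matched to column j

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            # Take a free column, or re-route the row that currently owns it.
            if col_owner[j] is None or try_row(col_owner[j], seen):
                col_owner[j] = i
                return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            return None
    match = [None] * n
    for j, i in enumerate(col_owner):
        match[i] = j
    return match


def bvn_decompose(matrix):
    """Return a list of (coefficient, matching) pairs whose weighted sum
    of permutation matrices equals `matrix`.

    Assumption: all row sums and column sums of `matrix` are equal.
    Exact rational arithmetic avoids floating-point residue.
    """
    n = len(matrix)
    m = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in m for x in row):
        support = [[j for j in range(n) if m[i][j] > 0] for i in range(n)]
        match = perfect_matching(support)
        if match is None:
            raise ValueError("row and column sums are not all equal")
        # The smallest matched entry bounds how much of this matching fits.
        coeff = min(m[i][match[i]] for i in range(n))
        for i in range(n):
            m[i][match[i]] -= coeff
        terms.append((coeff, match))
    return terms
```

Each returned matching can be read as one circuit configuration of a reconfigurable switch, and its coefficient as the share of time that configuration is held. Every iteration zeroes at least one matrix entry, so the loop ends after at most $n^2$ matchings.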

Paper Structure

This paper contains 25 sections, 2 theorems, 12 equations, 18 figures, 6 tables, 4 algorithms.

Key Result

Theorem 5.1

For any saturated demand matrix $M$,

Figures (18)

  • Figure 1: Overview of D3's architecture. The solid lines indicate bidirectional links. ToRs connect in a two-layer leaf-spine topology to passive gratings. A tunable laser at the transceivers can adjust the wavelength to select an egress port. There are three link (port) scheduler classes: Static, Demand-aware, and Rotor.
  • Figure 2: Motivation for D3: the optimal dynamic topology and the optimal link schedule both depend on the demand matrix, which may change over time.
  • Figure 3: An example of D3's backbone sub-topology with 8 ToRs: a de Bruijn topology established by two Static ports and one Demand-aware port. Top right: two matchings for the Static ports. Bottom right: the matching of the Demand-aware ports (for clarity, only two of its links are drawn in the topology).
  • Figure 4: Data and control planes for hosts and ToRs.
  • Figure 5: The average goodput over $1$ s of simulation. The heatmap compares topology configurations ($Y$-axis) and traffic shares ($X$-axis) for different loads $L$. Values are normalized to the offered traffic. The best result is highlighted with a thick border. The white regions are excluded for clarity because they are far from optimal.
  • ...and 13 more figures
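
Figure 3 describes a de Bruijn backbone over 8 ToRs. As a rough illustration, the sketch below builds the textbook directed binary de Bruijn graph and routes by shifting in destination bits. It is not the paper's routing scheme: the node numbering, the directed edges, and the function names `de_bruijn_neighbors` and `shift_route` are assumptions made here, and the Demand-aware port and the bidirectional links mentioned in the captions are ignored.

```python
def de_bruijn_neighbors(node, bits):
    """Out-neighbors of `node` in the binary de Bruijn graph on 2**bits
    nodes: shift the address left by one bit and append 0 or 1."""
    mask = (1 << bits) - 1
    shifted = (node << 1) & mask
    return [shifted, shifted | 1]


def shift_route(src, dst, bits):
    """Path from src to dst obtained by shifting in the bits of dst.

    The longest suffix of src that is already a prefix of dst is reused,
    so only the remaining bits of dst cost a hop each.
    """
    mask = (1 << bits) - 1
    overlap = 0
    for k in range(bits, 0, -1):
        # Compare the low k bits of src with the high k bits of dst.
        if (src & ((1 << k) - 1)) == (dst >> (bits - k)):
            overlap = k
            break
    path = [src]
    node = src
    # Append the bits of dst not covered by the overlap, most significant first.
    for pos in range(bits - overlap - 1, -1, -1):
        bit = (dst >> pos) & 1
        node = ((node << 1) & mask) | bit
        path.append(node)
    return path
```

With `bits=3` this gives 8 nodes, each with two outgoing links, and every route takes at most 3 hops. For instance, `shift_route(3, 6, 3)` reuses the shared bit pattern `11` and needs a single hop.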

Theorems & Definitions (4)

  • Definition 1: de Bruijn topology
  • Theorem 5.1
  • Claim 1
  • Theorem C.1