D3: An Adaptive Reconfigurable Datacenter Network
Johannes Zerwas, Chen Griner, Stefan Schmid, Chen Avin
TL;DR
D3 addresses the inability of static DCN topologies to keep up with evolving traffic by introducing an adaptive RDCN architecture, implementable on Sirius, that jointly tunes topology and packet scheduling. It combines three schedulers (Static, Rotor, and Demand-aware) through dynamic port partitioning and a decentralized control plane to achieve fast reconfiguration and responsive routing. The authors extend the Birkhoff–von Neumann decomposition to mixed topologies, present GreedyMixNet for efficient traffic partitioning, and validate the design with packet-level simulations showing throughput gains of up to 15% and competitive flow completion times. The work offers a practical, cost-aware path to higher DCN throughput and responsiveness, supported by both analysis and empirical evidence across realistic workloads.
Abstract
The explosively growing communication traffic in datacenters imposes increasingly stringent performance requirements on the underlying networks. In recent years, researchers have developed innovative optical switching technologies that enable reconfigurable datacenter networks (RDCNs), which support very fast topology reconfigurations. This paper presents D3, a novel and feasible RDCN architecture that improves throughput and flow completion time. D3 quickly and jointly adapts its links and packet scheduling toward the evolving demand, combining both demand-oblivious and demand-aware behaviors when needed. D3 relies on a decentralized network control plane supporting greedy, integrated-multihop, IP-based routing, which allows it to react quickly and locally to topological changes without overheads. A rack-local synchronization and transport layer further support fast network adjustments. Moreover, we argue that D3 can be implemented using the recently proposed Sirius architecture (SIGCOMM 2020). We report on an extensive empirical evaluation using packet-level simulations. We find that D3 improves throughput by up to 15% and preserves competitive flow completion times compared to the state of the art. We further provide an analytical explanation of the superiority of D3, introducing an extension of the well-known Birkhoff–von Neumann decomposition, which may be of independent interest.
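For readers unfamiliar with the Birkhoff–von Neumann decomposition mentioned above, the following is a minimal sketch of the classical version only, not the extension to mixed topologies introduced in the paper. It writes a doubly stochastic demand matrix as a weighted sum of permutation matrices; in a reconfigurable network, each permutation can be read as one matching between racks and each weight as the fraction of time that matching is held. The function names (`perfect_matching`, `bvn_decompose`), the use of exact `Fraction` arithmetic, and the example matrix are assumptions made for illustration and do not come from the paper.

```python
from fractions import Fraction


def perfect_matching(support):
    """Kuhn's augmenting-path matching on a boolean n x n support matrix.

    Returns match_row, where match_row[c] is the row matched to column c,
    or None if no perfect matching exists.
    """
    n = len(support)
    match_row = [-1] * n

    def try_row(r, seen):
        # Try to match row r, re-routing earlier rows if necessary.
        for c in range(n):
            if support[r][c] and not seen[c]:
                seen[c] = True
                if match_row[c] == -1 or try_row(match_row[c], seen):
                    match_row[c] = r
                    return True
        return False

    for r in range(n):
        if not try_row(r, [False] * n):
            return None
    return match_row


def bvn_decompose(matrix):
    """Classical BvN decomposition of a doubly stochastic matrix.

    Returns a list of (weight, perm) pairs, where perm[r] is the column
    paired with row r and the weights sum to 1.
    """
    n = len(matrix)
    residual = [[Fraction(x) for x in row] for row in matrix]
    terms = []
    while any(x > 0 for row in residual for x in row):
        match_row = perfect_matching([[x > 0 for x in row] for row in residual])
        if match_row is None:
            raise ValueError("input is not doubly stochastic")
        perm = [0] * n
        for c, r in enumerate(match_row):
            perm[r] = c
        # The smallest matched entry is the largest weight we can peel off.
        weight = min(residual[r][perm[r]] for r in range(n))
        for r in range(n):
            residual[r][perm[r]] -= weight
        terms.append((weight, perm))
    return terms


# Hypothetical 3-rack demand matrix (rows and columns each sum to 1).
half = Fraction(1, 2)
demand = [
    [half, half, 0],
    [half, 0, half],
    [0, half, half],
]
terms = bvn_decompose(demand)
for weight, perm in terms:
    print(weight, perm)
```

Each iteration zeroes at least one entry of the residual matrix, so the loop stops after at most n² steps. Exact fractions are used here so that the termination check is not affected by floating-point rounding.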
