Revolutionizing Datacenter Networks via Reconfigurable Topologies

Chen Avin, Stefan Schmid

TL;DR

The paper addresses the mismatch between growing datacenter traffic and fixed network topologies by surveying reconfigurable datacenter networks (RDCNs) enabled by optical circuit switches. It introduces a two-dimensional taxonomy (static vs. dynamic, demand-oblivious vs. demand-aware) and a formal evolving-graph model with a $\Delta$-timeslot reconfiguration abstraction to analyze bandwidth and latency taxes. It surveys representative RDCN designs (e.g., RotorNet, Sirius, Jupiter, ProjecToR, Cerberus) and discusses operational, deployment, and research challenges, complemented by expert video interviews. The work emphasizes topology engineering as a means to tailor network connectivity to traffic structure, potentially enhancing throughput and latency while outlining open problems across control planes, cross-layer integration, and scalable deployment. Overall, RDCNs offer a promising path to meet datacenter demands through dynamic topologies, with significant implications for performance, cost, and incremental deployment strategies.

Abstract

With the popularity of cloud computing and data-intensive applications such as machine learning, datacenter networks have become a critical infrastructure for our digital society. Given the explosive growth of datacenter traffic and the slowdown of Moore's law, significant efforts have been made to improve datacenter network performance over the last decade. A particularly innovative solution is reconfigurable datacenter networks (RDCNs): datacenter networks whose topologies dynamically change over time, in either a demand-oblivious or a demand-aware manner. Such dynamic topologies are enabled by recent optical switching technologies and stand in stark contrast to state-of-the-art datacenter network topologies, which are fixed and oblivious to the actual traffic demand. In particular, reconfigurable demand-aware and 'self-adjusting' datacenter networks are motivated empirically by the significant spatial and temporal structures observed in datacenter communication traffic. This paper presents an overview of reconfigurable datacenter networks. In particular, we discuss the motivation for such reconfigurable architectures, review the technological enablers, and present a taxonomy that classifies the design space into two dimensions: static vs. dynamic and demand-oblivious vs. demand-aware. We further present a formal model and discuss related research challenges. Our article comes with complementary video interviews in which three leading experts, Manya Ghobadi, Amin Vahdat, and George Papen, share with us their perspectives on reconfigurable datacenter networks.

Paper Structure

This paper contains 11 sections, 5 figures.

Figures (5)

  • Figure 1: (a) Visualization of the spatial and temporal structures of a simple packet trace (middle) from a machine learning (ML) application, a popular convolutional neural network training job, on four GPUs. (top) The demand matrix. (bottom) An i.i.d. trace with the same distribution. (b) A complexity map of seven real traces (colored circles) and four reference points placed at the corners of the map.
  • Figure 2: Examples of state-of-the-art static datacenter network topologies with eight racks, each containing two hosts. (a) A fat-tree, bi-regular topology with two types of switches, top-of-rack (ToR) switches (in blue) connecting to the hosts, and additional switches (in green) to increase throughput. (b) An expander-based, uni-regular topology, with only ToR switches. In both topologies, the diameter is four.
  • Figure 3: An example of an RDCN. (a) A topology with eight ToR switches and two optical spine switches. Each switch is configured with a matching between its ports (violet arcs) to create optical circuits between ToR pairs. (b) Examples of two types of reconfigurable optical switches that enable dynamic topologies through reconfiguration of the matching between ports.
  • Figure 4: (a) Classification of the design space along two dimensions, i) static vs. dynamic topologies and ii) demand-oblivious vs. demand-aware topologies, corresponding to four topology types. (b) Each topology type incurs different levels of bandwidth (BW) and latency (LT) taxes. (c) Different types of traffic are better served by different types of systems.
  • Figure 5: A 3-regular evolving graph with eight nodes implemented in a ToR–matching–ToR (TMT) network example with eight ToR switches and three optical spine switches: one has a static matching configuration, and the other two have dynamic matching configurations (one demand-oblivious and one demand-aware). Each spine switch has eight input and output ports. (a) TMT architecture. (b) The matching configurations inside each of the three switches at time $t$. (c) The resulting 3-regular topology at time $t$, $G_t$.
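
The construction described for Figure 5 can be illustrated with a minimal Python sketch, under stated assumptions: each spine switch is modeled as a perfect matching from input ToRs to output ToRs, and the topology $G_t$ is the union of the three matchings. The function names, the shift-based static and demand-oblivious schedules, and the greedy demand-aware heuristic are all hypothetical choices for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

N = 8  # number of ToR switches, as in the Figure 5 example

# matching[i] = output ToR reached from input ToR i through one spine switch
Matching = List[int]


def static_matching() -> Matching:
    # Assumed fixed configuration: ToR i always connects to ToR (i + 1) mod N.
    return [(i + 1) % N for i in range(N)]


def oblivious_matching(t: int) -> Matching:
    # Assumed round-robin schedule that ignores demand: the shift cycles
    # through 2..N-1 as the timeslot t advances.
    shift = 2 + (t % (N - 2))
    return [(i + shift) % N for i in range(N)]


def demand_aware_matching(demand: List[List[float]]) -> Matching:
    # Assumed greedy heuristic: repeatedly take the heaviest (src, dst) pair
    # whose source and destination ports are both still free.
    pairs = sorted(
        ((demand[s][d], s, d) for s in range(N) for d in range(N) if s != d),
        reverse=True,
    )
    out = [-1] * N
    used_dst = set()
    for _, s, d in pairs:
        if out[s] == -1 and d not in used_dst:
            out[s] = d
            used_dst.add(d)
    # Greedy can leave at most one source unmatched, and only when the single
    # free destination is that same ToR. Swap with a neighbor to repair it.
    if -1 in out:
        s = out.index(-1)
        other = (s + 1) % N
        out[s], out[other] = out[other], s
    return out


def topology_at(t: int, demand: List[List[float]]) -> List[Tuple[int, int]]:
    # G_t is the union of the three matchings, given as directed (src, dst)
    # edges. Every ToR has exactly three outgoing and three incoming circuits;
    # parallel edges are possible if two switches pick the same pair.
    matchings = [
        static_matching(),
        oblivious_matching(t),
        demand_aware_matching(demand),
    ]
    return [(src, dst) for m in matchings for src, dst in enumerate(m)]
```

Calling `topology_at(t, demand)` for successive values of `t` yields a sequence of graphs in which only the demand-oblivious and demand-aware matchings change, which is one way to read the evolving-graph view sketched in the caption.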

Theorems & Definitions (1)

  • Definition 1: Evolving Graph