Integrated Topology and Traffic Engineering for Reconfigurable Datacenter Networks

Chen Griner, Chen Avin

TL;DR

This work addresses maximizing RDCN throughput by unifying topology engineering with traffic scheduling under a Demand Completion Time (DCT) framework. It introduces three system classes—BvN-sys (demand-aware), rr-sys (demand-oblivious), and comp-sys (hybrid)—and proves that comp-sys can achieve superior throughput by decomposing traffic into components handled by the best-suited topology. The Pivot algorithm provides a principled method to partition demand between BvN and round-robin components, with analytical bounds and a case study on the M(v,u) matrix family showing substantial gains. Empirical results using realistic traffic models corroborate the theoretical advantages, demonstrating up to ~25% throughput improvement over the state-of-the-art designs. The paper offers a formal, tractable model for co-design of topology and traffic in RDCNs and highlights open questions for worst-case analysis and extension to more complex spine configurations.

Abstract

The state-of-the-art topologies of datacenter networks are fixed, based on electrical switching technology, and by now, we understand their throughput and cost well. In recent years, researchers have been developing novel optical switching technologies that enable the emergence of reconfigurable datacenter networks (RDCNs) that support dynamic physical topologies. The art of network design of dynamic topologies, i.e., 'Topology Engineering,' is still in its infancy. Different designs offer distinct advantages, such as faster switch reconfiguration times or demand-aware topologies, and to date, it is still unclear which design maximizes the throughput. This paper aims to improve our analytical understanding and formally studies the throughput of reconfigurable networks by presenting a general and unifying model for dynamic networks and their topology and traffic engineering. We use our model to study demand-oblivious and demand-aware systems and prove new upper bounds for the throughput of a system as a function of its topology and traffic schedules. Next, we offer a novel system design that combines both demand-oblivious and demand-aware schedules, and we prove its throughput supremacy under a large family of demand matrices. We evaluate our design numerically for sparse and dense traffic and show that our approach can outperform other designs by up to 25% using common network parameters.

Paper Structure (32 sections, 20 theorems, 83 equations, 8 figures, 2 algorithms)

This paper contains 32 sections, 20 theorems, 83 equations, 8 figures, 2 algorithms.

Key Result

Theorem 1

[Lower bound for rr-sys system DCT] Let $\mathcal{A}_{\mathrm{pkg}}$ be any traffic scheduling algorithm for rr-sys that generates a complete schedule. Let $P \in P_{\pi}$ be any derangement permutation matrix, then we have

Figures (8)

  • Figure 1: A schematic view of our simplified network model. A two-layer leaf-spine architecture. The spine layer consists of a single switch that is able to adapt and change its matching. The dashed lines represent unidirectional links.
  • Figure 2: An example of demand and schedules. (a) A doubly stochastic demand matrix $M$. The value in each cell is 0, $\frac{1}{3}$, or $\frac{2}{3}$, depending on the number of colors in the cell. (b) A simple BvN-decomposition-based schedule that decomposes $M$ into three permutation matrices $(P_1, P_2, P_3)$ (which become the topology schedule $(C_1, C_2, C_3)$) and uses only single-hop routing, i.e., $M^{(dl)}$. (c) A round-robin-based topology schedule $(C_1, C_2, C_3, C_4)$ that is predetermined (and oblivious to the demand) and a traffic schedule that also uses 2-hop paths, $M^{(1h)}$ and $M^{(2h)}$.
  • Figure 3: The demand completion time and throughput for the family of $M(v)=M(v,0)$ matrices.
  • Figure 4: (a) The throughput of the three systems on matrices generated by the traffic model, where comp-sys was tested with Pivot. Note that the comp-sys:Pivot curve is shifted slightly upward. (b) Four different measures of the demand matrices from (a). (c) Pivot's division of the demand into $M^{\mathrm{BvN}}$ and $M^{\mathrm{rr}}$.
  • Figure 5: The throughput on a dense and a sparse workload as the large-flow load $c_l$ changes.
  • ...and 3 more figures
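
The BvN-based schedule described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the input is an exactly doubly stochastic matrix given as `Fraction` values, and it uses a plain greedy Birkhoff-von Neumann decomposition with an augmenting-path matching. The function names `perfect_matching` and `bvn_decompose` and the 3x3 example matrix are hypothetical choices for illustration; they are not taken from the paper, and the paper's own decomposition may differ (for example, in how it limits the number of permutations).

```python
from fractions import Fraction


def perfect_matching(support):
    """Find a perfect matching on a boolean support matrix (Kuhn's algorithm).

    Returns match_row, where match_row[i] is the column assigned to row i,
    or None if no perfect matching exists.
    """
    n = len(support)
    match_col = [-1] * n  # match_col[j] = row currently assigned to column j

    def try_row(i, seen):
        # Try to assign row i, re-routing earlier rows along an augmenting path.
        for j in range(n):
            if support[i][j] and not seen[j]:
                seen[j] = True
                if match_col[j] == -1 or try_row(match_col[j], seen):
                    match_col[j] = i
                    return True
        return False

    for i in range(n):
        if not try_row(i, [False] * n):
            return None
    match_row = [0] * n
    for j, i in enumerate(match_col):
        match_row[i] = j
    return match_row


def bvn_decompose(matrix):
    """Greedy Birkhoff-von Neumann decomposition.

    Returns a list of (weight, perm) pairs, where perm[i] is the column
    matched to row i, such that the weighted permutation matrices sum
    to the input matrix.
    """
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    terms = []
    while any(x > 0 for row in m for x in row):
        perm = perfect_matching([[x > 0 for x in row] for row in m])
        if perm is None:
            raise ValueError("input is not a scaled doubly stochastic matrix")
        # The smallest matched entry is the largest weight we can peel off.
        weight = min(m[i][perm[i]] for i in range(n))
        for i in range(n):
            m[i][perm[i]] -= weight
        terms.append((weight, perm))
    return terms


# Hypothetical demand matrix with entries 0, 1/3, 2/3 (not the one in Figure 2).
third = Fraction(1, 3)
M = [
    [0 * third, 1 * third, 2 * third],
    [2 * third, 0 * third, 1 * third],
    [1 * third, 2 * third, 0 * third],
]
for weight, perm in bvn_decompose(M):
    print(weight, perm)
```

Each `(weight, perm)` pair corresponds to one topology configuration held for a fraction `weight` of the schedule, with all traffic sent over single-hop paths. Exact rational arithmetic is used so that the loop terminates without floating-point residue.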

Theorems & Definitions (53)

  • Definition 1: Total traffic matrix
  • Definition 2: Complete schedules
  • Definition 3: Matrix completion time
  • Definition 4: Algorithms completion time
  • Definition 5: System DCT
  • Definition 6: The BvN-System, BvN-sys
  • Definition 7: The Round-Robin System, rr-sys
  • Theorem 1
  • Corollary 2: Lower bound for $M(v)$ on rr-sys
  • Lemma 2
  • ...and 43 more
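
To make the demand-oblivious side concrete, here is a small sketch of a round-robin topology schedule. It assumes the common cyclic-shift construction, in which configuration $k$ connects node $i$ to node $(i + k) \bmod n$; the paper's Definition 7 may define rr-sys differently, and the helper names below are hypothetical. The sketch checks two properties the listing above refers to: every configuration is a derangement (no node is matched to itself), and the $n-1$ configurations together offer a direct link for every ordered pair of distinct nodes.

```python
def round_robin_schedule(n):
    """Return n-1 matchings; matching k sends node i to node (i + k) % n."""
    return [[(i + k) % n for i in range(n)] for k in range(1, n)]


def is_derangement(perm):
    """True if no node is matched to itself."""
    return all(dst != src for src, dst in enumerate(perm))


def covered_pairs(schedule):
    """Set of (source, destination) pairs that get a direct link in some slot."""
    return {(src, dst) for perm in schedule for src, dst in enumerate(perm)}


n = 5
schedule = round_robin_schedule(n)
all_pairs = {(s, d) for s in range(n) for d in range(n) if s != d}

print(len(schedule))                              # number of configurations
print(all(is_derangement(p) for p in schedule))   # every slot is a derangement
print(covered_pairs(schedule) == all_pairs)       # every pair gets a direct link
```

Because the schedule is fixed in advance, a demand concentrated on a single permutation only sees its direct link in one of the $n-1$ slots, which is why such systems also rely on 2-hop paths.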