Vermilion: A Traffic-Aware Reconfigurable Optical Interconnect with Formal Throughput Guarantees

Vamsi Addanki, Chen Avin, Goran Dario Knabe, Giannis Patronas, Dimitris Syrivelis, Nikos Terzenidis, Paraskevas Bakopoulos, Ilias Marinos, Stefan Schmid

TL;DR

Vermilion is a traffic-aware optical interconnect that keeps periodic, fixed-duration reconfigurations but uses only direct communication, which removes the need for multi-hop routing and congestion control. It derives a short traffic-aware schedule using matrix rounding and combines it with a traffic-oblivious periodic schedule that serves the residual demand. The authors prove a throughput lower bound of $\frac{k-1}{k}\cdot(1-\Delta_r)$ and report up to $2.13\times$ higher throughput and substantial flow-time reductions on realistic workloads, using existing rotor-like optical hardware. They argue that Vermilion is practical for large-scale deployments and that its traffic-aware periodic schedule can be updated dynamically as traffic evolves.
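
As a quick numeric illustration of the bound quoted above, the following Python sketch evaluates $\frac{k-1}{k}\cdot(1-\Delta_r)$ for a few inputs. This summary does not define $k$ or $\Delta_r$, so the sketch treats them as plain numbers: `k` as a positive integer and `delta_r` as a fraction in $[0, 1]$. The function name and the sample values are hypothetical and are not taken from the paper.

```python
def throughput_lower_bound(k: int, delta_r: float) -> float:
    """Evaluate (k - 1) / k * (1 - delta_r).

    Assumption: k is a positive integer and delta_r lies in [0, 1];
    the summary itself does not define either symbol.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if not 0.0 <= delta_r <= 1.0:
        raise ValueError("delta_r must lie in [0, 1]")
    return (k - 1) / k * (1.0 - delta_r)


if __name__ == "__main__":
    # Hypothetical sample values, chosen only to show the trend.
    for k in (2, 3, 5, 10):
        for delta_r in (0.0, 0.05):
            bound = throughput_lower_bound(k, delta_r)
            print(f"k={k:2d}  delta_r={delta_r:.2f}  bound={bound:.3f}")
```

Under these assumptions the expression approaches $1-\Delta_r$ as $k$ grows and decreases linearly in $\Delta_r$.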

Abstract

The increasing gap between datacenter traffic volume and the capacity of electrical switches has driven the development of reconfigurable network designs utilizing optical circuit switching. Recent advancements, particularly those featuring periodic fixed-duration reconfigurations, have achieved practical end-to-end delays of just a few microseconds. However, current designs rely on multi-hop routing to enhance utilization, which can lead to a significant reduction in worst-case throughput and added overhead from congestion control and routing complexity. These factors pose significant operational challenges for the large-scale deployment of these technologies. We present Vermilion, a reconfigurable optical interconnect that breaks the throughput barrier of existing periodic reconfigurable networks, without the need for multi-hop routing -- thus eliminating congestion control and simplifying routing to direct communication. Vermilion adopts a traffic-aware approach while retaining the simplicity of periodic fixed-duration reconfigurations, similar to RotorNet. We formally establish throughput bounds for Vermilion, demonstrating that it achieves at least $33\%$ more throughput in the worst case than existing designs. The key innovation of Vermilion is its short traffic-aware periodic schedule, derived using a matrix rounding technique. This schedule is then combined with a traffic-oblivious periodic schedule to efficiently manage any residual traffic. Our evaluation results support our theoretical findings, revealing significant performance gains for datacenter workloads.

Paper Structure

This paper contains 19 sections, 5 theorems, 5 equations, 14 figures, 1 algorithm.

Key Result

Theorem 1

The throughput of an ideal traffic-aware reconfigurable network is $1$, i.e., full throughput, for any traffic matrix if the reconfiguration delay is negligible.
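
One standard way to see why a result of this kind is plausible is the Birkhoff-von Neumann decomposition: a matrix whose rows and columns all sum to the same value can be written as a weighted sum of permutation matrices, and each permutation can be read as one circuit configuration held for a fraction of the period. The Python sketch below illustrates that idea only; this summary does not say whether the paper's proof proceeds this way. The example `demand` matrix, the function names, and the assumption that each node has one link and demand is normalized to row and column sums of 1 are all hypothetical.

```python
from fractions import Fraction


def perfect_matching(support):
    """Find a perfect matching with augmenting paths (Kuhn's algorithm).

    support[i] lists the columns that row i may be matched to.
    Returns perm with perm[i] = column matched to row i, or None.
    """
    n = len(support)
    match_col = [-1] * n  # match_col[j] = row currently matched to column j

    def try_row(i, seen):
        for j in support[i]:
            if j in seen:
                continue
            seen.add(j)
            if match_col[j] == -1 or try_row(match_col[j], seen):
                match_col[j] = i
                return True
        return False

    for i in range(n):
        if not try_row(i, set()):
            return None
    perm = [0] * n
    for j, i in enumerate(match_col):
        perm[i] = j
    return perm


def bvn_decompose(matrix):
    """Split a matrix with equal row and column sums into (weight, perm) pairs."""
    n = len(matrix)
    m = [[Fraction(x) for x in row] for row in matrix]
    schedule = []
    while any(x > 0 for row in m for x in row):
        support = [[j for j in range(n) if m[i][j] > 0] for i in range(n)]
        perm = perfect_matching(support)
        if perm is None:
            raise ValueError("row and column sums are not all equal")
        # Hold this configuration until its smallest demand entry is served.
        weight = min(m[i][perm[i]] for i in range(n))
        for i in range(n):
            m[i][perm[i]] -= weight
        schedule.append((weight, perm))
    return schedule


# Hypothetical 4-node demand matrix (not the one in Figure 4):
# no self-traffic, every row and column sums to 1.
h, q = Fraction(1, 2), Fraction(1, 4)
demand = [
    [0, h, q, q],
    [h, 0, q, q],
    [q, q, 0, h],
    [q, q, h, 0],
]
demand = [[Fraction(x) for x in row] for row in demand]

schedule = bvn_decompose(demand)
for weight, perm in schedule:
    print(f"hold matching {perm} for fraction {weight} of the period")
print("total fraction of the period used:", sum(w for w, _ in schedule))
```

In this sketch the weights sum to exactly the common row sum, so all demand is served over direct circuits within one period. Reconfiguration delay is ignored entirely, which matches the "negligible delay" condition in the theorem statement.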

Figures (14)

  • Figure 1: Existing designs based on periodic optical circuit-switching are oblivious to traffic patterns, requiring complex multi-hop routing and congestion control, which reduces throughput. Vermilion overcomes this limitation by introducing a few additional fixed-duration reconfigurations per period in a traffic-aware manner, while significantly simplifying both routing and congestion control.
  • Figure 2: The physical topology of a reconfigurable network consists of a set of nodes connected by optical circuit-switches arranged in a hierarchical Clos topology (e.g., leaf-spine). The circuit-switches are time-synchronized and rapidly reconfigure their circuits providing direct links between pairs of nodes in a periodic manner. We assume a control plane that defines the switching schedule for each switch.
  • Figure 4: Example traffic matrix for a topology with $4$ nodes, each with $1$ physical link.
  • Figure 5: Target emulated topology for traffic-aware vs oblivious periodic networks.
  • Figure 6: Derived periodic schedule for a traffic-aware vs oblivious periodic network.
  • ...and 9 more figures

Theorems & Definitions (7)

  • Definition 1: Traffic matrix
  • Definition 2: Throughput
  • Theorem 1: Ideal throughput of traffic-aware network
  • Theorem 2: Throughput under integer traffic matrices
  • Theorem 3: Throughput lower bound
  • Theorem 3: Ideal throughput of traffic-aware network
  • Theorem 3: Throughput under integer traffic matrices