Vermilion: A Traffic-Aware Reconfigurable Optical Interconnect with Formal Throughput Guarantees
Vamsi Addanki, Chen Avin, Goran Dario Knabe, Giannis Patronas, Dimitris Syrivelis, Nikos Terzenidis, Paraskevas Bakopoulos, Ilias Marinos, Stefan Schmid
TL;DR
Vermilion is a traffic-aware optical interconnect with fixed-duration periodic reconfigurations that relies on direct communication only, removing the need for multi-hop routing and congestion control. It derives a short traffic-aware schedule using matrix rounding and combines it with a traffic-oblivious periodic schedule that serves the residual demand, which yields formal throughput guarantees. The authors prove a throughput lower bound of $\frac{k-1}{k}\cdot(1-\Delta_r)$ and report up to $2.13\times$ higher throughput and substantial flow-time reductions on realistic workloads, using existing rotor-like optical hardware. They argue that Vermilion is practical for large-scale deployments and provide a concrete framework for traffic-aware periodic reconfigurations that can be updated as traffic evolves.
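As a rough illustration of how the bound behaves, it can be evaluated for sample values. The symbols are not defined in this summary; the evaluation below assumes that $k$ is an integer schedule parameter and that $\Delta_r$ is the fraction of time lost to reconfiguration. The values $k=3$ and $\Delta_r \in \{0, 0.1\}$ are chosen for illustration only and are not taken from the paper.

$$\frac{k-1}{k}\cdot(1-\Delta_r)\Big|_{k=3,\ \Delta_r=0} = \frac{2}{3}, \qquad \frac{k-1}{k}\cdot(1-\Delta_r)\Big|_{k=3,\ \Delta_r=0.1} = \frac{2}{3}\cdot 0.9 = 0.6$$

If one further assumes a worst-case throughput of $\frac{1}{2}$ for existing designs (a figure not stated in this summary), then $\frac{2}{3}$ is a factor of $\frac{4}{3}$ higher, that is, roughly $33\%$ more.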
Abstract
The increasing gap between datacenter traffic volume and the capacity of electrical switches has driven the development of reconfigurable network designs utilizing optical circuit switching. Recent advances, particularly those featuring periodic fixed-duration reconfigurations, have achieved practical end-to-end delays of just a few microseconds. However, current designs rely on multi-hop routing to enhance utilization, which can significantly reduce worst-case throughput and adds overhead from congestion control and routing complexity. These factors pose significant operational challenges for the large-scale deployment of these technologies. We present Vermilion, a reconfigurable optical interconnect that breaks the throughput barrier of existing periodic reconfigurable networks without the need for multi-hop routing, thus eliminating congestion control and simplifying routing to direct communication. Vermilion adopts a traffic-aware approach while retaining the simplicity of periodic fixed-duration reconfigurations, similar to RotorNet. We formally establish throughput bounds for Vermilion, demonstrating that it achieves at least $33\%$ more throughput in the worst case than existing designs. The key innovation of Vermilion is its short traffic-aware periodic schedule, derived using a matrix rounding technique. This schedule is then combined with a traffic-oblivious periodic schedule to efficiently manage any residual traffic. Our evaluation results support our theoretical findings, revealing significant performance gains for datacenter workloads.
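The two-part schedule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' algorithm: it assumes a normalized demand matrix whose row and column sums are at most 1, it uses plain entry-wise rounding down of the scaled demand as a simplified stand-in for the paper's matrix rounding technique, and it uses a round-robin cycle as the traffic-oblivious part. All function names (`round_demand`, `residual_demand`, `round_robin`, `decompose`) and the parameter `k` (number of traffic-aware slots) are hypothetical.

```python
from fractions import Fraction
from math import floor


def round_demand(demand, k):
    """Scale each entry by k and round down (simplified stand-in for matrix rounding)."""
    return [[floor(k * d) for d in row] for row in demand]


def residual_demand(demand, counts, k):
    """Demand not covered by the rounded slots; each entry is in [0, 1/k)."""
    n = len(demand)
    return [[demand[i][j] - Fraction(counts[i][j], k) for j in range(n)]
            for i in range(n)]


def round_robin(n):
    """Traffic-oblivious cycle: in slot s, node i connects to node (i + s) mod n."""
    return [[(i + s) % n for i in range(n)] for s in range(1, n)]


def _flip(out, inc, v, a, b):
    """Swap slots a and b along the alternating path that starts at receiver v."""
    path, node, at_receiver, c = [], v, True, a
    while True:
        nxt = inc[node][c] if at_receiver else out[node][c]
        if nxt is None:
            break
        path.append((nxt, node, c) if at_receiver else (node, nxt, c))
        node, at_receiver, c = nxt, not at_receiver, (b if c == a else a)
    for s, d, c in path:              # remove the old slot assignments
        out[s][c] = inc[d][c] = None
    for s, d, c in path:              # re-add each link in the other slot
        c2 = b if c == a else a
        out[s][c2], inc[d][c2] = d, s


def decompose(counts, k):
    """Split an integer matrix with row/column sums <= k into k matchings.

    Uses bipartite edge colouring with alternating-path flips.
    Returns k lists; result[c][i] is the destination of node i in slot c,
    or None if node i is idle in that slot.
    """
    n = len(counts)
    out = [[None] * k for _ in range(n)]  # out[i][c]: destination of i in slot c
    inc = [[None] * k for _ in range(n)]  # inc[j][c]: source sending to j in slot c
    for u in range(n):
        for v in range(n):
            for _ in range(counts[u][v]):
                a = out[u].index(None)    # a slot that is free at sender u
                b = inc[v].index(None)    # a slot that is free at receiver v
                if inc[v][a] is not None:
                    _flip(out, inc, v, a, b)  # frees slot a at receiver v
                out[u][a], inc[v][a] = v, u
    return [[out[i][c] for i in range(n)] for c in range(k)]


# Illustrative 3-node demand matrix (made-up values, zero diagonal).
half, third = Fraction(1, 2), Fraction(1, 3)
demand = [[0, half, third],
          [third, 0, half],
          [half, third, 0]]
k = 4
counts = round_demand(demand, k)          # [[0, 2, 1], [1, 0, 2], [2, 1, 0]]
residual = residual_demand(demand, counts, k)
# Full periodic schedule: k traffic-aware slots followed by the oblivious cycle.
schedule = decompose(counts, k) + round_robin(len(demand))
```

Because rounding down leaves less than $1/k$ of each entry unserved, the residual matrix is small and is left to the round-robin slots. Exact `Fraction` values are used in the example so that the rounding is not affected by floating-point error.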
