Unlocking Diversity of Fast-Switched Optical Data Center Networks with Unified Routing

Jialong Li, Federico De Marchi, Yiming Lei, Raj Joshi, Balakrishnan Chandrasekaran, Yiting Xia

TL;DR

Fast-switched optical DCNs enable microsecond-scale time slices but create routing and data-loss challenges during circuit reconfiguration. URO offers a unified routing framework that combines offline latency-minimizing path computation with a run-time, time-aware on-switch engine, implemented on programmable switches to realize precise transmission timing. The approach yields substantial latency reductions and path-length savings across multiple hardware configurations and traffic traces, while remaining robust to failures and scalable in resource usage. This work enables cross-architecture interoperability and paves the way for sub-OWD and mixed-OWD schedules (OWD: one-way delay) in practical optical DCN deployments.

Abstract

Optical data center networks (DCNs) are emerging as a promising solution for cloud infrastructure in the post-Moore's Law era, particularly with the advent of 'fast-switched' optical architectures capable of circuit reconfiguration at microsecond or even nanosecond scales. However, frequent reconfiguration of optical circuits introduces a unique challenge: in-flight packets risk loss during these transitions, hindering the deployment of many mature optical hardware designs due to the lack of suitable routing solutions. In this paper, we present Unified Routing for Optical networks (URO), a general routing framework designed to support fast-switched optical DCNs across various hardware architectures. URO combines theoretical modeling of this novel routing problem with practical implementation on programmable switches, enabling precise, time-based packet transmission. Our prototype on Intel Tofino2 switches achieves a minimum circuit duration of 2 µs, ensuring end-to-end, loss-free application performance. Large-scale simulations using production DCN traffic validate URO's generality across different hardware configurations, demonstrating its effectiveness and efficient system resource utilization.

Paper Structure

This paper contains 21 sections, 1 equation, 45 figures, 2 tables, 1 algorithm.

Figures (45)

  • Figure 1: Illustration of an optical DCN.
  • Figure 2: URO tail FCTs assuming empty queues vs. 80 packets per queue with 30% ToR-to-ToR link utilization.
  • Figure 3: Illustration of the URO backtracking algorithm. The packet arrival time slice is $t=0$. Optical circuits are denoted as edges with their available time slices annotated on top. The destination calls Routing to find the earliest last hops ($A$ and $B$ with $t=5$), which then call Subpath to find the shortest feasible path through them from the source ($S\rightarrow B\rightarrow D$). Paths violating various constraints (see explanations in red) are pruned from the backtracking search.
  • Figure 4: Illustration of the URO switch system ($\S$\ref{sec:implementation}). The current time slice is 2 and the corresponding active queue is 0. (a) URO paths ($\S$\ref{sec:alg_design}). (b) $ToR_1$'s lookup table ($\S$\ref{sec:lookup_table}) corresponding to the paths, and queuing delay estimation for packet rerouting ($\S$\ref{sec:rerouting}). (c) Calendar queues for packet buffering ($\S$\ref{sec:calendar_queue}). (d) $ToR_1$'s optimized lookup table for rerouting without packet recirculation ($\S$\ref{sec:rerouting}).
  • Figure 5: ToR-to-ToR delay with different packet sizes.
  • ...and 40 more figures
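
The Figure 3 caption describes routing over circuits that are only available in certain time slices: find the earliest possible arrival at the destination, then prefer the shortest path that achieves it. The following Python sketch illustrates that objective on a toy topology. It is not the URO backtracking algorithm; it swaps in a simple hop-bounded dynamic program, and the function name `earliest_fewest_hop_path`, the `circuits` data layout, and the rule that a hop taken in slice `s` makes the packet available at the next node in slice `s + 1` are all assumptions made for illustration.

```python
from bisect import bisect_left

INF = float("inf")

def earliest_fewest_hop_path(circuits, src, dst, start_slice, max_hops):
    """Hypothetical helper, not taken from the paper.

    circuits: {(u, v): sorted list of time slices in which circuit u->v is up}.
    Returns (path, slices_used, arrival_slice), or None if dst is unreachable
    within max_hops. Assumes a hop sent in slice s is usable at v from s + 1.
    """
    nodes = {n for edge in circuits for n in edge} | {src, dst}
    # best[h][v]: earliest slice at which a packet can be at v using <= h hops.
    best = [{n: INF for n in nodes}]
    best[0][src] = start_slice
    # via[h][v] = (previous node, slice used, hop level at which v was set).
    via = [{}]

    for h in range(1, max_hops + 1):
        cur = dict(best[h - 1])
        cur_via = dict(via[h - 1])
        for (u, v), slices in circuits.items():
            ready = best[h - 1][u]
            if ready == INF:
                continue
            # Wait at u for the first slice >= ready in which u->v is up.
            i = bisect_left(slices, ready)
            if i == len(slices):
                continue
            arrive = slices[i] + 1
            if arrive < cur[v]:
                cur[v] = arrive
                cur_via[v] = (u, slices[i], h)
        best.append(cur)
        via.append(cur_via)

    arrival = best[max_hops][dst]
    if arrival == INF:
        return None
    # Fewest hops among all paths that achieve the earliest arrival.
    h = next(k for k in range(max_hops + 1) if best[k][dst] == arrival)

    path, used, node = [dst], [], dst
    while node != src:
        prev, s, level = via[h][node]
        path.append(prev)
        used.append(s)
        node, h = prev, level - 1
    return path[::-1], used[::-1], arrival


# Toy topology (invented, not the one drawn in Figure 3).
circuits = {
    ("S", "A"): [0],
    ("A", "B"): [1],
    ("S", "B"): [3],
    ("B", "D"): [5],
}
print(earliest_fewest_hop_path(circuits, "S", "D", start_slice=0, max_hops=3))
# -> (['S', 'B', 'D'], [3, 5], 6)
```

In this toy case both `S→A→B→D` and `S→B→D` catch the same last circuit in slice 5, so they arrive together and the two-hop path is preferred. Tracking the earliest arrival per hop budget, rather than a single label per node, is what prevents the earlier but longer route to `B` from hiding the shorter one.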

Theorems & Definitions (3)

  • proof
  • proof
  • proof