Harvest: Adaptive Photonic Switching Schedules for Collective Communication in Scale-up Domains
Mahir Rahman, Samuel Joseph, Nihar Kodkani, Behnaz Arzani, Vamsi Addanki
TL;DR
Harvest addresses when and how to reconfigure photonic interconnects during collective GPU communication, trading reconfiguration delay $\alpha_r$ against congestion and propagation delay. It abstracts a collective's schedule into step-wise communication patterns and combines a dynamic-programming framework with a topology-optimization subproblem to decide at which steps to reconfigure and which topology to use. The key contributions are: a formal model linking $\alpha$–$\beta$ cost, maximum concurrent flow, and reconfiguration cost; a general Harvest framework applicable to arbitrary collectives; a polylogarithmic-time optimal schedule for Recursive Doubling AllReduce; and simulation and hardware-emulation results showing lower collective completion time than static and per-step reconfiguration baselines. Schedules are synthesized offline with manageable overhead and adapt to the parameters of the underlying photonic technology, offering a path toward adaptive photonic scale-up domains.
Abstract
As chip-to-chip silicon photonics gains traction for its bandwidth and energy efficiency, its circuit-switched nature raises a fundamental question for collective communication: when and how should the interconnect be reconfigured to realize these benefits? Establishing direct optical paths can reduce congestion and propagation delay, but each reconfiguration incurs non-negligible overhead, making naive per-step reconfiguration impractical. We present Harvest, a systematic approach for synthesizing topology reconfiguration schedules that minimize collective completion time in photonic interconnects. Given a collective communication algorithm and its fixed communication schedule, Harvest determines how the interconnect should evolve over the course of the collective, explicitly balancing reconfiguration delay against congestion and propagation delay. We reduce the synthesis problem to a dynamic program with an underlying topology optimization subproblem and show that the approach applies to arbitrary collective communication algorithms. Furthermore, we exploit the algorithmic structure of a well-known AllReduce algorithm (Recursive Doubling) to synthesize optimal reconfiguration schedules without using any optimizers. Because the formulation is parameterized by reconfiguration delay, Harvest naturally adapts to various photonic technologies. Using packet-level and flow-level evaluations, as well as hardware emulation on commercial GPUs, we show that the schedules synthesized by Harvest significantly reduce collective completion time across multiple collective algorithms compared to static interconnects and reconfigure-every-step baselines.
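
The trade-off described above can be illustrated with a minimal sketch of a segment-based dynamic program. This is not the authors' implementation; it only shows the general shape of such a formulation under stated assumptions. The steps of a collective are partitioned into contiguous segments, each segment runs on a single topology, and every segment after the first pays the reconfiguration delay `alpha_r`. The names `synthesize_schedule`, `segment_cost`, and `toy_segment_cost` are hypothetical, and the toy cost model (a step costs `direct` if the topology matches its pattern and `detour` otherwise) merely stands in for the topology-optimization subproblem.

```python
from typing import Callable, List, Sequence, Tuple


def synthesize_schedule(
    num_steps: int,
    alpha_r: float,
    segment_cost: Callable[[int, int], float],
) -> Tuple[float, List[Tuple[int, int]]]:
    """Partition steps [0, num_steps) into contiguous segments.

    segment_cost(i, j) is assumed to return the communication time of
    steps i..j-1 on the best single topology for that window (a stand-in
    for the topology-optimization subproblem).
    """
    inf = float("inf")
    best = [0.0] + [inf] * num_steps   # best[j]: min time to finish steps 0..j-1
    prev = [0] * (num_steps + 1)       # prev[j]: start index of the last segment
    for j in range(1, num_steps + 1):
        for i in range(j):
            # Assumption: the initial topology is set up for free;
            # every later segment pays one reconfiguration delay.
            reconf = alpha_r if i > 0 else 0.0
            cand = best[i] + reconf + segment_cost(i, j)
            if cand < best[j]:
                best[j], prev[j] = cand, i
    # Walk back through prev[] to recover the chosen segments.
    segments: List[Tuple[int, int]] = []
    j = num_steps
    while j > 0:
        segments.append((prev[j], j))
        j = prev[j]
    segments.reverse()
    return best[num_steps], segments


def toy_segment_cost(
    patterns: Sequence[str], direct: float = 1.0, detour: float = 4.0
) -> Callable[[int, int], float]:
    """Hypothetical cost model: one topology serves one pattern directly."""
    def cost(i: int, j: int) -> float:
        window = list(patterns[i:j])
        # The best single topology matches the most frequent pattern.
        matched = max(window.count(p) for p in set(window))
        return matched * direct + (len(window) - matched) * detour
    return cost
```

With step patterns `["a", "a", "b", "b"]` and `alpha_r = 1.0`, the sketch reconfigures once between the second and third step. With `alpha_r = 100.0`, it keeps a single topology for all four steps, mirroring how the choice depends on the reconfiguration delay of the photonic technology.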
