Flowcut Switching: High-Performance Adaptive Routing with In-Order Delivery Guarantees
Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Abdulla Bataineh, David Hewson, Duncan Roweth, Torsten Hoefler
TL;DR
Flowcut switching addresses the challenge of maintaining in-order packet delivery under adaptive routing in data-center networks, particularly for RoCE (RDMA over Converged Ethernet) and other latency-sensitive transports. It does so by keeping per-flow flowcut state, starting a new flowcut only when the flow has no in-flight packets, and using a drainage mechanism driven by RTT-based signals to reroute flows affected by congestion. The paper presents three deployment variants (full switch, ingress-only, NIC-only) and reports, from simulations and Slingshot hardware experiments, flow completion times up to 50% better than ECMP and up to 40% better than flowlet switching, while guaranteeing in-order delivery and tolerating failures, with up to 5x improvement in tail scenarios. The approach is practical on commodity hardware with modest memory overhead and can be deployed incrementally on switches or NICs, which broadens the applicability of robust, in-order adaptive routing for modern data-center workloads.
Abstract
Network latency severely impacts the performance of applications running on supercomputers. Adaptive routing algorithms route packets over different available paths to reduce latency and improve network utilization. However, if a switch routes packets belonging to the same network flow on different paths, they might arrive at the destination out of order because the paths differ in latency. For some transport protocols, such as TCP, QUIC, and RoCE, out-of-order (OOO) packets can cause large performance drops or significantly increase CPU utilization. In this work, we propose flowcut switching, a new adaptive routing algorithm that provides high-performance in-order packet delivery. Unlike existing solutions such as flowlet switching, which assume bursty traffic and might still reorder packets, flowcut switching guarantees in-order delivery under any network conditions and is also effective for non-bursty traffic, as is often the case for RDMA.
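The core rule summarized in the TL;DR (a flow may be assigned a new path only when it has no packets in flight) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the names `FlowcutTable`, `route`, and `on_ack`, the use of acknowledgements to track in-flight packets, and the pluggable `pick_path` callback are all assumptions made for illustration, and the RTT-based drainage mechanism is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable


@dataclass
class FlowcutEntry:
    path: int       # output path currently pinned for this flow
    in_flight: int  # packets forwarded on `path` and not yet acknowledged


class FlowcutTable:
    """Hypothetical per-flow table illustrating the flowcut rule."""

    def __init__(self, pick_path: Callable[[], int]) -> None:
        # `pick_path` stands in for any adaptive path-selection policy
        # (assumption: the real policy would consult congestion state).
        self._pick_path = pick_path
        self._flows: Dict[Hashable, FlowcutEntry] = {}

    def route(self, flow_id: Hashable) -> int:
        """Return the path for the next packet of `flow_id`."""
        entry = self._flows.get(flow_id)
        if entry is None:
            # No state means no packets in flight, so a new path can be
            # chosen without any risk of reordering.
            entry = FlowcutEntry(path=self._pick_path(), in_flight=0)
            self._flows[flow_id] = entry
        # Otherwise packets are still in flight: stay on the pinned path.
        entry.in_flight += 1
        return entry.path

    def on_ack(self, flow_id: Hashable) -> None:
        """Record that one in-flight packet of `flow_id` was delivered."""
        entry = self._flows.get(flow_id)
        if entry is None:
            return
        entry.in_flight -= 1
        if entry.in_flight == 0:
            # The flow has drained: drop its state so that the next
            # packet starts a new flowcut, possibly on another path.
            del self._flows[flow_id]


# Usage: the path picker simply hands out paths 0, 1, 2 in order.
paths = iter([0, 1, 2])
table = FlowcutTable(lambda: next(paths))
first = table.route("flow-a")    # new flowcut, pinned to path 0
second = table.route("flow-a")   # one packet in flight: same path
table.on_ack("flow-a")
table.on_ack("flow-a")           # flow drained, state removed
third = table.route("flow-a")    # new flowcut, pinned to path 1
```

In contrast to flowlet switching, which infers from an idle gap that rerouting is safe, this sketch changes paths only once the in-flight count reaches zero, which is why reordering cannot occur within the model.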
