Learning Sub-Second Routing Optimization in Computer Networks requires Packet-Level Dynamics

Andreas Boltres; Niklas Freymuth; Patrick Jahnke; Holger Karl; Gerhard Neumann

TL;DR

The paper argues that learning sub-second routing optimization requires packet-level network dynamics, showing that fluid-flow models fail to capture TCP-driven behavior. It introduces PackeRL, a packet-level RL environment built on ns-3 with a Gym-like interface, and presents two policies: M-Slim, a dynamic shortest-path policy that re-optimizes in under a second and excels at high traffic volumes but is computationally hard to scale to large topologies, and FieldLines, a next-hop policy that re-optimizes routing within milliseconds on any topology without re-training. In the experiments, policies trained in fluid environments generalize poorly to the packet-level setting, while both new policies outperform existing learning-based approaches and commonly used static baselines in high-traffic scenarios. The work highlights PackeRL's versatility for training and evaluating routing policies and points to multipath, multi-objective, and distributed extensions as promising future directions.

Abstract

Finding efficient routes for data packets is an essential task in computer networking. The optimal routes depend greatly on the current network topology, state and traffic demand, and they can change within milliseconds. Reinforcement Learning can help to learn network representations that provide routing decisions for possibly novel situations. So far, this has commonly been done using fluid network models. We investigate their suitability for millisecond-scale adaptations with a range of traffic mixes and find that packet-level network models are necessary to capture true dynamics, in particular in the presence of TCP traffic. To this end, we present $\textit{PackeRL}$, the first packet-level Reinforcement Learning environment for routing in generic network topologies. Our experiments confirm that learning-based strategies that have been trained in fluid environments do not generalize well to this more realistic, but more challenging setup. Hence, we also introduce two new algorithms for learning sub-second Routing Optimization. We present $\textit{M-Slim}$, a dynamic shortest-path algorithm that excels at high traffic volumes but is computationally hard to scale to large network topologies, and $\textit{FieldLines}$, a novel next-hop policy design that re-optimizes routing for any network topology within milliseconds without requiring any re-training. Both algorithms outperform current learning-based approaches as well as commonly used static baseline protocols in scenarios with high traffic volumes. All findings are backed by extensive experiments in realistic network conditions in our fast and versatile training and evaluation framework.
