Towards Sub-millisecond Latency and Guaranteed Bit Rates in 5G User Plane
Leonardo Alberro, Noura Limam, Raouf Boutaba
TL;DR
The paper addresses end-to-end QoS in 5G transport networks by proposing a QoS-aware data plane written in P4 and deployed on an Intel Tofino switch. It maps 3GPP 5QI profiles to four resource types (GBR, GBR*, Non-GBR, Non-GBR*) and combines per-flow metering, classification, strict-priority scheduling, and delay-aware queuing to provide per-flow bandwidth guarantees and ultra-low latency under congestion. On real hardware, the authors measure sub-millisecond delays for delay-critical traffic and show that guaranteed flows remain isolated, with near-zero loss, under varying load and high congestion. They also provide a formal framework for delay and throughput predictability and discuss practical hardware constraints and future improvements. The result is a programmable, scalable foundation for enforcing SLAs in 5G transport networks under dynamic traffic conditions.
Abstract
Next-generation services demand stringent Quality of Service (QoS) guarantees, such as per-flow bandwidth assurance, ultra-low latency, and traffic prioritization, posing significant challenges to 5G and beyond networks. As 5G network functions increasingly migrate to edge and central clouds, the transport layer becomes a critical enabler of end-to-end QoS compliance. However, traditional fixed-function infrastructure lacks the flexibility to support the diverse and dynamic QoS profiles standardized by 3GPP. This paper presents a QoS-aware data plane model for programmable transport networks, designed to provide predictable behavior and fine-grained service differentiation. The model supports all 3GPP QoS resource types and integrates per-flow metering, classification, strict priority scheduling, and delay-aware queuing. Implemented on off-the-shelf programmable hardware using P4 and evaluated on an Intel Tofino switch, our approach ensures per-flow bandwidth guarantees, sub-millisecond delay for delay-critical traffic, and resilience under congestion. Experimental results demonstrate that the model achieves microsecond-level latencies and near-zero packet loss for mission-critical flows, validating its suitability for future QoS-sensitive applications in 5G and beyond.
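To make the combination of per-flow metering and strict priority scheduling concrete, the following is a minimal Python sketch, not the authors' P4 implementation. It assumes a single-rate token-bucket meter per guaranteed flow and assumes that packets exceeding the guaranteed rate are demoted to the lowest priority rather than dropped; the abstract does not specify either choice. All names (TokenBucket, StrictPriorityScheduler, classify, HIGH, LOW) are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field

HIGH, LOW = 0, 1  # assumed priority levels; a lower number is served first


@dataclass
class TokenBucket:
    """Single-rate token-bucket meter for one flow (illustrative only)."""
    rate_bps: float    # guaranteed bit rate, in bits per second
    burst_bits: float  # bucket depth, in bits
    last_ts: float = 0.0
    tokens: float = field(init=False)

    def __post_init__(self):
        # Start with a full bucket so an initial burst is admitted.
        self.tokens = self.burst_bits

    def conforms(self, size_bits, now):
        """Return True if the packet fits within the guaranteed rate."""
        # Refill in proportion to elapsed time, capped at the bucket depth.
        elapsed = now - self.last_ts
        self.tokens = min(self.burst_bits, self.tokens + elapsed * self.rate_bps)
        self.last_ts = now
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        return False


def classify(meter, size_bits, now):
    """Map a packet to a priority level.

    Assumption: in-profile packets of a metered (guaranteed) flow get HIGH;
    out-of-profile packets and unmetered flows get LOW.
    """
    if meter is not None and meter.conforms(size_bits, now):
        return HIGH
    return LOW


class StrictPriorityScheduler:
    """Always serves the highest-priority non-empty queue first."""

    def __init__(self, levels=2):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, priority, packet):
        self.queues[priority].append(packet)

    def dequeue(self):
        for queue in self.queues:  # index 0 is the highest priority
            if queue:
                return queue.popleft()
        return None  # all queues are empty


# Usage: a best-effort packet arrives first, but the in-profile packet of
# the guaranteed flow is still transmitted ahead of it.
meter = TokenBucket(rate_bps=1000, burst_bits=1000)
scheduler = StrictPriorityScheduler()
scheduler.enqueue(classify(None, 800, now=0.0), "best-effort")
scheduler.enqueue(classify(meter, 800, now=0.0), "guaranteed")
first_out = scheduler.dequeue()
```

Under strict priority, low-priority queues are served only when every higher-priority queue is empty. Metering guaranteed flows before they enter the high-priority queue therefore matters: without it, a misbehaving high-priority flow could starve all other traffic.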
