RIFO: Pushing the Efficiency of Programmable Packet Schedulers

Habib Mostafaei; Maciej Pacut; Stefan Schmid

TL;DR

The paper tackles the resource intensity of Push-In First-Out (PIFO) programmable schedulers by proposing Range-In First-Out (RIFO), a lightweight admission-based scheduler that uses only three registers and a single FIFO. RIFO relies on min–max normalization to compute a relative score $N(r_p) = \frac{r_p - Min}{Max - Min}$ and admits packets when this score exceeds queue utilization or when they fall within a guaranteed admission buffer, enabling effective policy realization with minimal state. Through large-scale NetBench simulations and a 650-line P4 Tofino prototype, RIFO delivers competitive flow completion times and substantial improvements for large flows (up to $4.91\times$) while achieving significant hardware efficiency (e.g., $2.54\times$ less SRAM than AIFO, $6.55\times$ less than SP-PIFO). The work demonstrates practical line-rate deployment, robust performance across workloads, and open-source artifacts to support reproducibility and further research in memory-efficient programmable schedulers.
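
The relative score above can be illustrated with a minimal Python sketch. It is not the paper's P4 code: the function name `normalized_rank`, the handling of the degenerate case $Max = Min$, and the clamping of out-of-range ranks to $[0, 1]$ are assumptions made here for illustration only.

```python
def normalized_rank(rank: int, min_rank: int, max_rank: int) -> float:
    """Min-max normalization N(r_p) = (r_p - Min) / (Max - Min).

    Assumption (not from the paper): when Max == Min the range is empty,
    so the score is defined as 0.0 to avoid dividing by zero.
    """
    span = max_rank - min_rank
    if span == 0:
        return 0.0
    score = (rank - min_rank) / span
    # Assumption: a rank outside the tracked [Min, Max] range is clamped
    # so that the score always lies in [0, 1].
    return min(1.0, max(0.0, score))


# Example using the ground-truth extremes quoted in the Figure 4 caption
# (Min = 20, Max = 91): a rank at Min scores lowest, a rank at Max highest.
lowest = normalized_rank(20, 20, 91)
highest = normalized_rank(91, 20, 91)
middle = normalized_rank(55, 20, 91)
```

A hardware data plane would typically avoid the floating-point division used here; the sketch only shows what the score measures, namely where a packet's rank sits within the currently tracked range.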

Abstract

Packet scheduling is a fundamental networking task that has recently received renewed attention in the context of programmable data planes. Programmable packet scheduling systems, such as those based on the Push-In First-Out (PIFO) abstraction, enable flexible scheduling policies but are too resource-expensive for large-scale line-rate operation. This prompted research into practical programmable schedulers (e.g., SP-PIFO, AIFO) that approximate PIFO behavior on regular hardware. Yet their scalability remains limited due to the extensive number of memory operations. To address this, we design an effective yet resource-efficient packet scheduler, Range-In First-Out (RIFO), which uses only three mutable memory cells and one FIFO queue per PIFO queue. RIFO is based on multi-criteria decision-making principles and uses small guaranteed admission buffers. Our large-scale simulations in NetBench demonstrate that, despite using fewer resources, RIFO generally achieves competitive flow completion times across all studied workloads. It is especially effective in workloads with a significant share of large flows, reducing flow completion time by up to $4.91\times$ in the data mining workload compared to state-of-the-art solutions. Our prototype implementation using P4 on Tofino switches requires only 650 lines of code, is scalable, and runs at line rate.

Paper Structure (23 sections, 3 equations, 13 figures, 4 tables, 1 algorithm)

Introduction
Contributions
Organization
Background
The Design of RIFO
Rationale behind RIFO
The Algorithm
Analyzing the Accuracy of RIFO
Quantile estimation by Min-Max normalization
Resetting the Min and Max samples
Data Plane Design and Implementation
Evaluation
Hardware Testbed
Packet-level Simulation
FCT Minimization with RIFO
...and 8 more sections

Figures (13)

Figure 1: The general architecture of RIFO.
Figure 2: The track range resetting mechanism of RIFO with $T$=50. We maintain three registers, holding the value of $Min$, the value of $Max$, and the counter of packets seen since the last reset. Initially, we set the counter to $0$ and the values of $Min$ and $Max$ to the rank of the first packet. We increase the packet counter by one with each incoming packet, regardless of its admission. When the counter reaches $T$, we set the counter to $0$ and set the values of $Min$ and $Max$ to the rank of the incoming packet (the resetting action).
Figure 3: Example of RIFO admission with $T$=6 and $B$=3.
Figure 4: The influence of the resetting period $T$ on the Min and Max samples. The ground truth Min is 20, and the ground truth Max is 91. The sampled Min and Max values never deviate too much from the ground truth Min and Max, but they also do not reach the ground truth extremes of the underlying distribution.
Figure 5: Bandwidth split without and with RIFO for scenarios with four flows when $R(\text{Flow 1}) < R(\text{Flow 2}) < R(\text{Flow 3}) < R(\text{Flow 4})$.
...and 8 more figures
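
The resetting mechanism described in the Figure 2 caption can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: the class name `RangeTracker` and its field names are hypothetical, and two details are assumptions not spelled out in the caption, namely that the first packet also increments the counter and that, between resets, $Min$ and $Max$ are widened to cover each incoming rank.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RangeTracker:
    """Three pieces of state, mirroring the three registers: Min, Max, counter."""
    period: int                     # T, the resetting period
    min_rank: Optional[int] = None  # Min register (unset before the first packet)
    max_rank: Optional[int] = None  # Max register (unset before the first packet)
    count: int = 0                  # packets seen since the last reset

    def observe(self, rank: int) -> None:
        """Update the tracked range for one incoming packet, admitted or not."""
        if self.min_rank is None:
            # The first packet seeds both Min and Max.
            self.min_rank = self.max_rank = rank
        self.count += 1
        if self.count == self.period:
            # Resetting action: restart the counter and collapse the range
            # onto the rank of the incoming packet.
            self.count = 0
            self.min_rank = self.max_rank = rank
        else:
            # Assumption: between resets the range only widens.
            self.min_rank = min(self.min_rank, rank)
            self.max_rank = max(self.max_rank, rank)


# Usage with a small period so that a reset is easy to follow by hand.
tracker = RangeTracker(period=3)
for packet_rank in [5, 2, 9, 4, 7]:
    tracker.observe(packet_rank)
```

With `period=3`, the third packet (rank 9) triggers the reset, so the range restarts at 9 and is then widened by the ranks 4 and 7 that follow.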
