
Enabling Time-Aware Priority Traffic Management over Distributed FPGA Nodes

Alberto Scionti, Paolo Savio, Francesco Lubrano, Federico Stirano, Antonino Nespola, Olivier Terzo, Corrado De Sio, Luca Sterpone

TL;DR

The paper tackles deterministic, time-aware bandwidth allocation in distributed FPGA-based NICs by extending the open-source Corundum NIC with hardware-supported, TDMA-like scheduling. It implements in hardware a time-aware TX queue scheduler (TQCR) and control flow via SCRs, ties the scheduling to Linux traffic classes through a 1-to-1 QDISC mapping, and relies on IEEE 1588 PTP clock synchronization to maintain a common time base across nodes. The authors demonstrate the approach on a 2x2 tile SoC-FPGA testbed, showing that the highest-priority traffic class can secure a large fraction of the available transmission time, with observed latencies and jitter suitable for real-time-like workloads. The work highlights the practicality of open, Linux-integrated, TSN-style traffic management in Smart-NICs and points to future enhancements such as larger-scale deployments, credit-based scheduling, and bare-metal routing optimizations to further improve efficiency and scalability.

Abstract

Network Interface Cards (NICs) have evolved greatly, from simple devices moving traffic in and out of the network to complex heterogeneous systems that offload complex tasks on in-transit packets from host CPUs. The latter comprise different types of devices, ranging from NICs accelerating fixed, specific functions (e.g., on-the-fly data compression/decompression, checksum computation, data encryption, etc.) to complex Systems-on-Chip (SoC) equipped with both general-purpose processors and specialized engines (Smart-NICs). Similarly, Field Programmable Gate Arrays (FPGAs) have moved from pure reprogrammable devices to modern heterogeneous systems comprising general-purpose processors, real-time cores and even AI-oriented engines. Furthermore, the availability of high-speed network interfaces (e.g., SFPs) makes modern FPGAs a good choice for implementing Smart-NICs. In this work, we extended the functionalities offered by an open-source NIC implementation (Corundum) by enabling time-aware traffic management in hardware, and used this feature to control the bandwidth associated with different traffic classes. By exposing dedicated control registers on the AXI bus, the driver of the NIC can easily configure the transmission bandwidth of different prioritized queues. Each control register is associated with a specific transmission queue (Corundum can expose up to thousands of transmit and receive queues), and sets the fraction of time in a transmission window during which the queue is granted access to the output port to transmit its packets. Queues are then prioritized and associated with different traffic classes through the Linux QDISC mechanism. Experimental evaluation demonstrates that the approach makes it possible to properly manage the bandwidth reserved for the different transmission flows.
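
To make the per-queue time fraction concrete, here is a minimal Python sketch of how a transmission window could be divided among queues. It is an illustration under stated assumptions, not the paper's hardware logic: the names `QueueSlot`, `build_schedule`, and `active_queue` are hypothetical, slots are assumed to be laid out back to back in queue-index order, and the fractions stand in for whatever encoding the actual control registers use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class QueueSlot:
    queue: int     # TX queue index (hypothetical numbering)
    start_ns: int  # slot start, as an offset from the start of the window
    end_ns: int    # slot end (exclusive)


def build_schedule(window_ns: int, fractions: List[float]) -> List[QueueSlot]:
    """Split one transmission window into consecutive per-queue slots.

    fractions[i] plays the role of the value written to the control
    register of queue i (assumption: a plain fraction of the window).
    """
    if window_ns <= 0:
        raise ValueError("window_ns must be positive")
    if any(f < 0 for f in fractions) or sum(fractions) > 1.0 + 1e-9:
        raise ValueError("fractions must be non-negative and sum to at most 1")
    slots = []
    cursor = 0
    for queue, fraction in enumerate(fractions):
        length = int(window_ns * fraction)  # truncate to whole nanoseconds
        slots.append(QueueSlot(queue, cursor, cursor + length))
        cursor += length
    return slots


def active_queue(slots: List[QueueSlot], now_ns: int, window_ns: int) -> Optional[int]:
    """Return the queue allowed to transmit at time now_ns, or None if idle."""
    offset = now_ns % window_ns  # windows repeat back to back
    for slot in slots:
        if slot.start_ns <= offset < slot.end_ns:
            return slot.queue
    return None


if __name__ == "__main__":
    window = 1_000_000  # 1 ms window, an arbitrary example value
    schedule = build_schedule(window, [0.5, 0.25, 0.25])
    for t in (0, 600_000, 1_800_000):
        print(t, active_queue(schedule, t, window))
```

Because `now_ns` is reduced modulo the window length, every node that shares the same time base (here assumed to come from PTP) would select the same queue at the same instant, which is the property a TDMA-like scheme relies on.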

Paper Structure

This paper contains 7 sections, 6 figures, 1 table.

Figures (6)

  • Figure 1: Architectural overview of the main components of a NIC; receiving (right side) and transmitting (left side) data paths are separated.
  • Figure 2: General overview of the target distributed system using SoC FPGAs as compute nodes.
  • Figure 3: Internal organization of the NIC: time-aware scheduling is implemented in hardware, while priority mapping is enabled at the kernel level, where prio is the priority level (0 is the lowest priority) and TC are the traffic classes.
  • Figure 4: Signal acquisition on 4 SoC FPGA nodes showing the PTP synchronization: the grandmaster node (light blue) periodically exchanges PTP frames with the other slave nodes (yellow, green and purple), which adjust their internal clocks to align with the grandmaster. According to the timescale, the clock variation is in the range of $\pm 40$ with respect to the grandmaster.
  • Figure 5: Example of the QDISC command to create the mapping between 3 priorities, 3 traffic classes (TC) and 3 time-aware scheduled queues, on a specified interface (eth1).
  • ...and 1 more figure
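
The command shown in Figure 5 is not reproduced in this summary. As a rough sketch of what a 1-to-1 mapping between 3 priorities, 3 traffic classes and 3 queues on `eth1` might look like, the Python helper below assembles a `tc` invocation. It assumes the `mqprio` queueing discipline is the one used, which the text here does not confirm; the helper name `mqprio_command` and the choice of `hw 0` are likewise illustrative assumptions.

```python
from typing import List


def mqprio_command(dev: str, prio_to_tc: List[int], queues_per_tc: List[int]) -> List[str]:
    """Build the argument list for a `tc ... mqprio` command.

    prio_to_tc[p] is the traffic class for priority p; priorities that are
    not listed fall back to traffic class 0 (a choice made in this sketch).
    queues_per_tc[t] is the number of TX queues given to traffic class t.
    """
    num_tc = len(queues_per_tc)
    if len(prio_to_tc) > 16:
        raise ValueError("mqprio maps at most 16 priorities")
    if any(tc < 0 or tc >= num_tc for tc in prio_to_tc):
        raise ValueError("priority mapped to a non-existent traffic class")

    # Pad the map to all 16 priorities so the command is explicit.
    full_map = list(prio_to_tc) + [0] * (16 - len(prio_to_tc))

    # Each traffic class gets a contiguous range, written as count@offset.
    queue_specs = []
    offset = 0
    for count in queues_per_tc:
        queue_specs.append(f"{count}@{offset}")
        offset += count

    return (
        ["tc", "qdisc", "add", "dev", dev, "root", "mqprio",
         "num_tc", str(num_tc), "map"]
        + [str(tc) for tc in full_map]
        + ["queues"] + queue_specs
        + ["hw", "0"]  # assumption: no mqprio hardware offload requested
    )


if __name__ == "__main__":
    # Priority p -> traffic class p -> queue p, one queue per class.
    print(" ".join(mqprio_command("eth1", [0, 1, 2], [1, 1, 1])))
```

Returning an argument list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting; actually applying it requires root privileges and a device exposing at least three TX queues.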