FlexCross: High-Speed and Flexible Packet Processing via a Crosspoint-Queued Crossbar

Klajd Zyla; Marco Liess; Thomas Wild; Andreas Herkersdorf

TL;DR

FlexCross contains a crosspoint-queued crossbar that enables the execution of complex applications by forwarding incoming packets to the required processing engines in the specified sequence. The evaluation shows that FlexCross outperforms state-of-the-art flexible packet-processing designs for different traffic loads and scenarios.

Abstract

The fast pace at which new online services emerge leads to a rapid surge in the volume of network traffic. A recent approach that the research community has proposed to tackle this issue is in-network computing, which means that network devices perform more computations than before. As a result, processing demands become more varied, creating the need for flexible packet-processing architectures. State-of-the-art approaches provide a high degree of flexibility at the expense of performance for complex applications, or they ensure high performance but only for specific use cases. In order to address these limitations, we propose FlexCross. This flexible packet-processing design can process network traffic with diverse processing requirements at over 100 Gbit/s on FPGAs. Our design contains a crosspoint-queued crossbar that enables the execution of complex applications by forwarding incoming packets to the required processing engines in the specified sequence. The crossbar consists of distributed logic blocks that route incoming packets to the specified targets and resolve contentions for shared resources, as well as memory blocks for packet buffering. We implemented a prototype of FlexCross in Verilog and evaluated it via cycle-accurate register-transfer level simulations. We also conducted test runs with real-world network traffic on an FPGA. The evaluation results demonstrate that FlexCross outperforms state-of-the-art flexible packet-processing designs for different traffic loads and scenarios. The synthesis results show that our prototype consumes roughly 21% of the resources on a Virtex XCU55 UltraScale+ FPGA.
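To make the forwarding idea concrete, the following is a minimal behavioral sketch in Python of a crosspoint-queued crossbar that steers packets through a per-packet sequence of processing engines. It is not the authors' Verilog implementation. The class and function names (`Packet`, `CrosspointQueuedCrossbar`, `run`), the unbounded FIFOs, the single-cycle engines, and the use of port 0 as the only ingress are all assumptions made for illustration. Round-robin arbitration is used here because the figure captions mention an RR scheduler; the actual arbiter logic of FlexCross may differ.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Packet:
    pkt_id: int
    tasks: list                      # remaining engine ports to visit, in order
    trace: list = field(default_factory=list)  # engines visited so far


class CrosspointQueuedCrossbar:
    """Hypothetical model: one FIFO per (source, destination) crosspoint."""

    def __init__(self, num_ports):
        self.n = num_ports
        # queues[src][dst] buffers packets travelling from src to dst.
        self.queues = [[deque() for _ in range(num_ports)]
                       for _ in range(num_ports)]
        # One round-robin pointer per destination port (assumed policy).
        self.rr_ptr = [0] * num_ports

    def enqueue(self, src, dst, pkt):
        self.queues[src][dst].append(pkt)

    def arbitrate(self, dst):
        """Grant one non-empty crosspoint queue in column dst, round-robin."""
        for offset in range(self.n):
            src = (self.rr_ptr[dst] + offset) % self.n
            if self.queues[src][dst]:
                self.rr_ptr[dst] = (src + 1) % self.n
                return src, self.queues[src][dst].popleft()
        return None

    def pending(self):
        return any(q for row in self.queues for q in row)


def run(packets, num_engines):
    """Port 0 is the ingress; ports 1..num_engines are processing engines.

    Assumes every packet has at least one task and that each engine
    processes one packet per cycle.
    """
    xbar = CrosspointQueuedCrossbar(num_engines + 1)
    for pkt in packets:
        xbar.enqueue(0, pkt.tasks[0], pkt)
    done = []
    while xbar.pending():
        # Arbitrate every engine first, so a packet moves one hop per cycle.
        grants = [(dst, xbar.arbitrate(dst))
                  for dst in range(1, num_engines + 1)]
        for dst, grant in grants:
            if grant is None:
                continue
            _, pkt = grant
            pkt.trace.append(pkt.tasks.pop(0))   # engine dst processes pkt
            if pkt.tasks:
                xbar.enqueue(dst, pkt.tasks[0], pkt)  # re-enter the crossbar
            else:
                done.append(pkt)                 # sequence finished: egress
    return done
```

Calling `run([Packet(0, [1, 2]), Packet(1, [2, 1]), Packet(2, [1])], num_engines=2)` sends three packets through two engines in different orders. Because each output arbitrates independently over its own column of crosspoint queues, a packet returning from engine 2 can be granted access to engine 1 ahead of a packet still waiting at the ingress, which is the contention-resolution behavior the abstract attributes to the crossbar's distributed logic blocks.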

Paper Structure (16 sections, 6 figures, 1 table)

Introduction
Related Work
Crossbar Switches
Flexible Packet Processing
Proposed Design
Overall Architecture
Processing Engines
Crossbar
Scheduler
Comparison to the State of the Art
Evaluation
Simulation Setup
Simulation Results
Test Results
Resource Usage
...and 1 more section

Figures (6)

Figure 1: Block diagram of the architecture of FlexCross
Figure 2: Block diagram of the architecture of the Crossbar
Figure 3: Mean per-packet latency when receiving traffic associated with four flow types at different rates measured in PANIC, the CIOQ crossbar-based design, FlexPipe, and FlexCross with RR scheduling. The dashed lines show the minimum/maximum measured latency.
Figure 4: Throughput in % of the bandwidth when receiving traffic at different rates mapped to randomly generated task sequences achieved by PANIC, the CIOQ crossbar-based design, and FlexCross with three different schedulers
Figure 5: Mean per-packet latency when receiving traffic at different rates mapped to randomly generated task sequences measured in FlexCross with three different schedulers. The dashed lines show the minimum/maximum measured latency, while the orange indicates the mean processing delay.
...and 1 more figure
