Advancements in Traffic Processing Using Programmable Hardware Flow Offload

Luca Deri, Alfredo Cardigliano, Francesco Fusco

TL;DR

The paper addresses the challenge of passively monitoring and inspecting high-speed network traffic under memory and processing constraints. It shows how modern SmartNICs with hardware flow offload can move a substantial part of flow-state handling and forwarding into hardware, reducing host CPU load and PCIe traffic for forwarding-focused probes. Using a version of the nProbe Cento NetFlow/IPFIX probe extended with flow offload, the authors report gains in throughput and reliability at moderate active-flow counts, while acknowledging that keeping large flow tables in host memory remains a bottleneck. The work offers guidance on deploying hardware offload with minimal code changes and discusses what this means for scalable network monitoring and security in high-speed networks.

Abstract

The exponential growth of data traffic and the increasing complexity of networked applications demand effective solutions capable of passively inspecting and analysing the network traffic for monitoring and security purposes. Implementing network probes in software using general-purpose operating systems has been made possible by advances in packet-capture technologies, such as kernel-bypass frameworks, and by multi-queue adapters designed to distribute the network workload in multi-core processors. Modern SmartNICs, in addition, have introduced stateful mechanisms to associate actions to network flows such as forwarding packets or updating traffic statistics for an individual flow. In this paper, we describe our experience in exploiting those functionalities in a modern network probe and we perform a detailed study of the performance characteristics under different scenarios. Compared to pure CPU-based solutions, SmartNICs with flow-offload technologies provide substantial benefits when implementing forwarding applications. However, the main limitation of having to keep large flow tables in the host memory remains largely unsolved for realistic monitoring and security applications.
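
The idea of associating an action and per-flow statistics with each flow can be illustrated with a small software model. This is only a sketch under stated assumptions: `FlowKey`, `FlowEntry`, `FlowTable`, and the `FORWARD`/`DROP` actions are hypothetical names chosen for illustration, and the code does not reproduce any SmartNIC API or the Cento implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, NamedTuple, Optional


class Action(Enum):
    """Hypothetical per-flow actions, for illustration only."""
    FORWARD = "forward"
    DROP = "drop"


class FlowKey(NamedTuple):
    """Classic 5-tuple used here to identify a flow (an assumption)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


@dataclass
class FlowEntry:
    action: Action
    packets: int = 0
    byte_count: int = 0


class FlowTable:
    """Toy stand-in for a table that maps flows to actions and counters."""

    def __init__(self) -> None:
        self.entries: Dict[FlowKey, FlowEntry] = {}
        self.slow_path_packets = 0

    def install(self, key: FlowKey, action: Action) -> None:
        """Associate an action with a flow, as host software would do."""
        self.entries[key] = FlowEntry(action)

    def process(self, key: FlowKey, length: int) -> Optional[Action]:
        """Return the installed action, or None if the host must decide."""
        entry = self.entries.get(key)
        if entry is None:
            # Unknown flow: the packet is handed to host software.
            self.slow_path_packets += 1
            return None
        # Known flow: update statistics and apply the stored action.
        entry.packets += 1
        entry.byte_count += length
        return entry.action
```

In this model, only the first packets of a flow reach the host; once an entry is installed, later packets update the counters and take the stored action without host involvement.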

Paper Structure

This paper contains 12 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: nProbe Cento Architecture Overview
  • Figure 2: Packet Lifecycle with Flow Offload
  • Figure 3: Inline vs Passive mode at 10 Mpps (80 Gbps) with DPI enabled. Cento does not lose packets even without flow offloading; with offloading enabled, CPU utilization decreases significantly.
  • Figure 4: Inline vs Passive mode at 89 Mpps (60 Gbps) with DPI enabled. Hardware offloading eliminates packet loss at 1 million distinct flows and reduces it by 25% at 10 million flows.
  • Figure 5: DPI Overhead in Inline mode at 10 Mpps (80 Gbps)
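
The packet and bit rates quoted in the captions can be related with a short calculation. The captions do not state frame sizes, so the sketch below rests on two assumptions: minimum-size 64-byte Ethernet frames for the 60 Gbps case, and the standard 20 bytes of per-frame wire overhead (7-byte preamble, 1-byte start-of-frame delimiter, 12-byte inter-frame gap). The function names are hypothetical.

```python
# Per-frame overhead on the wire: 7 B preamble + 1 B SFD + 12 B inter-frame gap.
ETHERNET_OVERHEAD_BYTES = 20


def packets_per_second(link_bps: float, frame_bytes: int) -> float:
    """Packet rate for a given bit rate and Ethernet frame size."""
    return link_bps / ((frame_bytes + ETHERNET_OVERHEAD_BYTES) * 8)


def wire_bytes_per_packet(link_bps: float, pps: float) -> float:
    """Average bytes each packet occupies on the wire, overhead included."""
    return link_bps / (pps * 8)


# Assumed 64-byte frames at 60 Gbps give roughly the 89 Mpps of Figure 4.
print(round(packets_per_second(60e9, 64) / 1e6, 1))  # 89.3

# 10 Mpps at 80 Gbps implies about 1000 bytes on the wire per packet.
print(wire_bytes_per_packet(80e9, 10e6))  # 1000.0
```

Under these assumptions, the 60 Gbps test is the small-packet worst case, while the 80 Gbps tests use much larger packets and therefore far fewer packets per second.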